Superpowers Brainstorming

Design summary

Here is the full picture before we write the spec. Let me know if anything needs changing.

What we're building

Real-time animated talking faces for the voice-web interface, powered by Simli.ai. Each voice profile gets its own realistic face, starting with Hermes. Faces animate in sync with either supported TTS engine (ElevenLabs or Piper).

UI behaviour

  • Default (PiP mode) — chat transcript fills the screen, Hermes face floats as a picture-in-picture overlay in the bottom corner
  • Multi-agent (Gallery mode) — all agent faces visible side-by-side, speaking face highlighted, others dimmed
  • Auto-switching — gallery activates automatically when a second agent starts speaking; returns to PiP when back to one agent
  • Pin override — a toggle lets you lock either mode regardless of agent count
  • Fallback — if Simli is unreachable, UI silently falls back to current audio-only experience; no broken state
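
To make the auto-switching and pin rules above concrete, here is a minimal sketch of how the layout decision could work. The names (ViewMode, Pin, resolveViewMode) are hypothetical and not part of the existing codebase, and the sketch assumes the pin is stored as either a locked mode or null.

```typescript
// Hypothetical types for illustration only.
type ViewMode = "pip" | "gallery";
type Pin = ViewMode | null; // null = nothing pinned, follow auto-switching

// Decide which layout to show from the number of active agents and the pin.
function resolveViewMode(activeAgentCount: number, pin: Pin): ViewMode {
  // Pin override: a locked mode wins regardless of agent count.
  if (pin !== null) {
    return pin;
  }
  // Auto-switching: gallery once a second agent is active, PiP otherwise.
  return activeAgentCount >= 2 ? "gallery" : "pip";
}

console.log(resolveViewMode(1, null));
console.log(resolveViewMode(2, null));
console.log(resolveViewMode(3, "pip"));
```

Keeping this as a pure function would let FaceManager call it on every agent-count or pin change without holding extra state.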

New components

  • PCMDecoder — browser-side, converts MP3 (ElevenLabs) or WAV (Piper) chunks to raw 16-bit PCM at 16kHz for Simli
  • SimliClient wrapper — thin class around the Simli SDK; manages one WebRTC session per agent face, handles reconnect
  • FaceManager — owns all <video> elements, activates/deactivates faces per speaking agent, drives PiP ↔ gallery transitions
  • /api/simli/session — new endpoint on voice-web server; calls Simli API server-side so the API key never reaches the browser; returns session token + face config per agent
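
As a rough illustration of the last stage of PCMDecoder, the sketch below converts already-decoded float samples to 16-bit PCM at 16 kHz. It assumes the MP3 or WAV chunk has been decoded to a Float32Array first (in the browser, for example, via the Web Audio API's decodeAudioData). The function names are hypothetical, and the linear-interpolation resampler applies no low-pass filter, so a real implementation would likely want something better.

```typescript
// Resample mono float samples from inRate to outRate by linear interpolation.
// Assumption: quality is acceptable for a sketch; there is no anti-alias filter.
function resampleLinear(input: Float32Array, inRate: number, outRate: number): Float32Array {
  const outLength = Math.floor((input.length * outRate) / inRate);
  const out = new Float32Array(outLength);
  for (let i = 0; i < outLength; i++) {
    const pos = (i * inRate) / outRate;           // fractional index into input
    const i0 = Math.floor(pos);
    const i1 = Math.min(i0 + 1, input.length - 1); // clamp at the last sample
    const frac = pos - i0;
    out[i] = input[i0] * (1 - frac) + input[i1] * frac;
  }
  return out;
}

// Convert float samples in [-1, 1] to signed 16-bit PCM, clamping out-of-range values.
function floatToPcm16(input: Float32Array): Int16Array {
  const out = new Int16Array(input.length);
  for (let i = 0; i < input.length; i++) {
    const s = Math.max(-1, Math.min(1, input[i]));
    out[i] = Math.round(s < 0 ? s * 0x8000 : s * 0x7fff);
  }
  return out;
}

// Usage: 48 kHz float samples in, 16 kHz 16-bit PCM out.
const decoded = new Float32Array([0, 0.1, 0.2, 0.3, 0.4, 0.5]);
const pcm16 = floatToPcm16(resampleLinear(decoded, 48000, 16000));
console.log(pcm16.length);
```

Splitting resampling from quantization keeps each step testable on its own, and both run without a browser.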

What doesn't change

  • voice-gateway — no changes at all
  • ElevenLabs integration, Piper fallback, engine selector — all unchanged
  • WebSocket session protocol — no new WS message types needed
  • Existing audio player — kept as fallback; bypassed when Simli is active
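
One possible shape for the "bypassed when Simli is active, kept as fallback" behaviour is a small router in front of both audio paths. This is only a sketch: AudioSink and AudioRouter are hypothetical names, and it assumes both paths can be wrapped behind the same send method.

```typescript
// Hypothetical common interface over the Simli path and the existing audio player.
interface AudioSink {
  send(chunk: Uint8Array): void;
}

class AudioRouter {
  private simliActive = false;
  private readonly simli: AudioSink;
  private readonly player: AudioSink;

  constructor(simli: AudioSink, player: AudioSink) {
    this.simli = simli;
    this.player = player;
  }

  // Called when the Simli session connects or drops.
  setSimliActive(active: boolean): void {
    this.simliActive = active;
  }

  // Send a chunk to Simli when active; otherwise, or on error, use the existing player.
  route(chunk: Uint8Array): "simli" | "player" {
    if (this.simliActive) {
      try {
        this.simli.send(chunk);
        return "simli";
      } catch {
        // Silent fallback: stop using Simli and fall through to the audio player.
        this.simliActive = false;
      }
    }
    this.player.send(chunk);
    return "player";
  }
}
```

Because the router starts with Simli inactive, the audio-only experience stays the default until a face session is confirmed to be up.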

Phase plan

  1. Phase 1 — Hermes face in PiP mode with real-time lip sync (ElevenLabs + Piper)
  2. Phase 2 — Per-profile faces, gallery mode, auto-switch + pin
  3. Phase 3 — Custom face upload per profile (refine personas)

Does this match what you had in mind? Any changes before I write the spec?
