voice: ship Wave 2 voice server (root /api/voice/*, 9-tool sideband)
Stands up the solstone voice backend: a persistent sol brain (a Claude CLI
session adapted from hub-phone's pattern and persisted at
journal/health/voice-brain-session), five root-level endpoints at
/api/voice/*, a sideband dispatcher that bridges OpenAI Realtime tool
calls to the 9-tool manifest, and a per-call-id nav-hint queue the native
client polls to drive WebView routing.
Endpoints:
- POST /api/voice/session — mint OpenAI ephemeral key + current brain
instruction + 9-tool manifest; returns HTTP 503 if the key is not
configured or the brain is still cold after 10s.
- POST /api/voice/connect — spawn async sideband task for a call_id;
registered in app.voice_tasks and cancelled on shutdown.
- POST /api/voice/refresh-brain — force brain refresh (waits up to 30s).
- GET /api/voice/nav-hints?call_id=… — drain the per-call-id nav queue;
TTL 60s, capped at 8 hints; overflow drops the oldest first (FIFO).
- GET /api/voice/status — brain readiness, key-configured flag, active
session count; polled by the native client every 30s.
9-tool manifest — every handler reuses the existing solstone surface, no
new data layer:
- journal.get_day → think.cluster.scan_day / cluster_segments
- journal.search → think.indexer.journal.search_journal
- entities.get → think.surfaces.profile.full
- entities.recent_with → profile.full + think.activities.load_activity_records
- commitments.list → think.surfaces.ledger.list (strips `sources`)
- commitments.complete → think.surfaces.ledger.close
- calendar.today → load_activity_records filtered to source=anticipated
- briefing.get → apps.home.routes._load_briefing_md
- observer.start_listening → stub (Wave 4 wires the real observer)
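
For orientation, the tool-call bridge can be pictured as a name-to-handler
table. This is a minimal sketch, not the shipped dispatcher: `HANDLERS`,
`dispatch`, and the stub's return value are hypothetical names chosen for
illustration, and it assumes tool arguments arrive as a JSON string.

```python
import json


def _observer_start_listening(args):
    # Stub handler, mirroring the Wave 2 placeholder described above.
    # The return shape is an assumption made for this sketch.
    return {"status": "stub"}


# Hypothetical registry: tool name -> callable taking a dict of arguments.
# The real handlers delegate to the existing solstone surface.
HANDLERS = {
    "observer.start_listening": _observer_start_listening,
}


def dispatch(name, arguments_json):
    """Route one tool call to its handler and return a JSON-serializable dict."""
    handler = HANDLERS.get(name)
    if handler is None:
        return {"error": f"unknown tool: {name}"}
    try:
        args = json.loads(arguments_json) if arguments_json else {}
    except json.JSONDecodeError as exc:
        return {"error": f"bad arguments: {exc}"}
    return handler(args)
```

Returning an error dict instead of raising keeps a bad tool call from
tearing down the sideband task for that call_id.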
Runtime: a singleton daemon-thread asyncio loop (think/voice/runtime.py)
with explicit lifecycle hooks. Brain start is lazy — the first
/api/voice/session call triggers it via wait_until_ready, so process
startup is cheap and tests need no special warm-up.
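
The runtime shape described above can be sketched roughly as follows. This
is an illustration under assumptions, not the contents of
think/voice/runtime.py: `VoiceRuntime`, `get_runtime`, `submit`, and `stop`
are hypothetical names, and the loop here starts lazily on first use.

```python
import asyncio
import threading


class VoiceRuntime:
    """One asyncio event loop running forever on a daemon thread."""

    def __init__(self):
        self._lock = threading.Lock()
        self._loop = None
        self._thread = None

    def start(self):
        # Idempotent: a second call is a no-op while the thread is alive.
        with self._lock:
            if self._thread is not None:
                return
            self._loop = asyncio.new_event_loop()
            self._thread = threading.Thread(
                target=self._loop.run_forever,
                name="voice-runtime",
                daemon=True,
            )
            self._thread.start()

    def submit(self, coro):
        # Lazy start: the loop only exists once something needs it.
        self.start()
        return asyncio.run_coroutine_threadsafe(coro, self._loop)

    def stop(self):
        with self._lock:
            if self._thread is None:
                return
            self._loop.call_soon_threadsafe(self._loop.stop)
            self._thread.join(timeout=5)
            self._loop.close()
            self._loop = None
            self._thread = None


_runtime = None
_runtime_lock = threading.Lock()


def get_runtime():
    """Return the process-wide runtime, creating it on first call."""
    global _runtime
    with _runtime_lock:
        if _runtime is None:
            _runtime = VoiceRuntime()
        return _runtime
```

Synchronous route handlers would call `get_runtime().submit(coro)` and wait
on the returned future with a timeout, which is one way to get the 10s and
30s bounds mentioned for the endpoints.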
Config: journal/config/journal.json gains a voice block:
voice.openai_api_key, voice.model (default "gpt-realtime"),
voice.brain_model (default "haiku"). The OPENAI_API_KEY environment
variable is the fallback for the key.
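
A minimal sketch of that resolution order, assuming the journal config has
already been loaded into a dict. `resolve_voice_config` and `DEFAULTS` are
hypothetical names for this example; the actual loader may differ.

```python
import os

# Defaults as stated in the config description above.
DEFAULTS = {"model": "gpt-realtime", "brain_model": "haiku"}


def resolve_voice_config(journal_config, env=None):
    """Merge the voice block with defaults and the env-var key fallback."""
    env = os.environ if env is None else env
    voice = journal_config.get("voice") or {}
    return {
        # The config value wins; the environment is only the fallback.
        "openai_api_key": voice.get("openai_api_key") or env.get("OPENAI_API_KEY"),
        "model": voice.get("model") or DEFAULTS["model"],
        "brain_model": voice.get("brain_model") or DEFAULTS["brain_model"],
    }
```

A `None` key after resolution corresponds to the "key not configured" case
that /api/voice/session reports as HTTP 503.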
Nav hints are strictly side-channel — the sideband strips _nav_target
from the OpenAI payload and queues it per-call-id for the native client
to drain.
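
The side-channel behaviour can be sketched as below. This is an
illustration, not the shipped queue: `NavHintQueue` and `split_nav_target`
are hypothetical names, and it assumes the TTL is checked per hint at drain
time, which the summary above does not specify.

```python
import time
from collections import deque


class NavHintQueue:
    """Per-call-id hint queues with a cap and a time-to-live."""

    def __init__(self, ttl=60.0, cap=8, clock=time.monotonic):
        self._ttl = ttl
        self._cap = cap
        self._clock = clock
        self._queues = {}

    def push(self, call_id, target):
        # deque(maxlen=cap) discards the oldest entry when full (FIFO drop).
        queue = self._queues.setdefault(call_id, deque(maxlen=self._cap))
        queue.append((self._clock(), target))

    def drain(self, call_id):
        # Remove the whole queue and return only hints younger than the TTL.
        queue = self._queues.pop(call_id, None)
        if not queue:
            return []
        cutoff = self._clock() - self._ttl
        return [target for stamp, target in queue if stamp >= cutoff]


def split_nav_target(result, call_id, queue):
    """Return a copy of a tool result without _nav_target, queueing the hint."""
    payload = dict(result)
    target = payload.pop("_nav_target", None)
    if target is not None:
        queue.push(call_id, target)
    return payload
```

Working on a copy keeps the handler's original result untouched, so the
same dict can still be logged or inspected after the payload is sent.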
Design record at docs/design/voice-server.md captures the decisions and
the scope deviations that were approved at gate (brain-not-ready=503,
briefing path correction, commitments resolution mapping, OpenAI key
sourcing, ask_sol clause dropped).
New tests (8 files, all green under `make ci`):
- tests/test_voice_config.py, tests/test_voice_nav_queue.py,
tests/test_voice_brain.py, tests/test_voice_runtime.py,
tests/test_voice_tools.py, tests/test_voice_sideband.py,
tests/test_voice_routes.py, tests/test_voice_integration.py
make ci: 3584 passed, 1 skipped.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>