Commit Graph

377 Commits

Author SHA1 Message Date
Christian Gick
475ab38f6f chore: Trigger rebuild 2026-02-22 11:09:17 +02:00
Christian Gick
e3c1ded328 feat(voice): inject datetime into prompt, respond in DE/EN
- Add VOICE_TIMEZONE env var (default: Europe/Berlin) for local time
- Bot knows exact date/time at call start via _build_voice_prompt()
- Respond in user language (DE or EN) instead of always German

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-22 11:02:56 +02:00
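The datetime injection described in e3c1ded328 might look roughly like the sketch below. It is an illustration only, not the bot's actual `_build_voice_prompt()`: the helper name `build_voice_prompt`, the prompt wording, and the injectable `now` parameter are assumptions. Only the `VOICE_TIMEZONE` variable and its `Europe/Berlin` default come from the commit message.

```python
import os
from datetime import datetime, timezone
from typing import Optional
from zoneinfo import ZoneInfo


def build_voice_prompt(base_prompt: str, now: Optional[datetime] = None) -> str:
    """Append the local date/time and a language hint to a prompt (hypothetical helper)."""
    # VOICE_TIMEZONE and its default are the values named in the commit message.
    tz = ZoneInfo(os.environ.get("VOICE_TIMEZONE", "Europe/Berlin"))
    # `now` is injectable (an assumption) so the result can be checked deterministically.
    current = (now or datetime.now(timezone.utc)).astimezone(tz)
    stamp = current.strftime("%Y-%m-%d %H:%M %Z")
    return (
        f"{base_prompt}\n"
        f"Current local date and time: {stamp}.\n"
        "Reply in the language the user speaks (German or English)."
    )
```

Computing the timestamp once at call start matches the commit's description that the bot knows the date and time "at call start"; a long call would need a refresh, which this sketch does not attempt.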
Christian Gick
92ab906a21 chore(voice): switch default voice to George (multilingual DE/EN)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-22 10:59:44 +02:00
Christian Gick
7696ca68ee chore: Remove debug logging and pipecat-poc after E2EE fix confirmed working
- Remove setLevel(DEBUG) for livekit.agents/plugins (added for diagnostics)
- Remove periodic E2EE cryptor/participant state poll loop (no longer needed)
- Remove pipecat-poc/pipeline.py (POC never deployed, LiveKit approach confirmed)

E2EE bidirectional voice confirmed working in MAT-36.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-22 10:55:47 +02:00
Christian Gick
ac8a8a177c chore: Trigger rebuild 2026-02-22 10:35:28 +02:00
Christian Gick
63545f032e fix(voice): set E2EE keys immediately after connect, before rotation wait
Root cause: caller track subscribed during 2s rotation wait creates a
frame cryptor with no key → DEC_FAILED state → all incoming frames dropped.
Setting the key after the wait doesn't recover the cryptor.

Fix: set bot + caller keys immediately after lk_room.connect(), using
the Matrix-provided caller identity. The post-rotation and post-find-remote
key updates remain as belt+suspenders.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-22 10:34:20 +02:00
Christian Gick
4ab5486b5c fix(voice): log remote participant identity and track count in E2EE poll
Adds REMOTE_PARTICIPANT log every 10s to confirm caller is present
and tracks are subscribed during E2EE decryption diagnosis.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-22 10:30:38 +02:00
Christian Gick
c4581c2917 fix(voice): reduce key rotation wait to 2s, increase E2EE poll to every 10s
Phase 2 diagnostics: caller audio arrives immediately; setting the key
earlier (2s vs 10s) avoids dropping initial frames. E2EE_CRYPTOR log
now fires every 10s (was 30s) to confirm decryption state for incoming
caller audio.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-22 10:27:19 +02:00
Christian Gick
5973ed1db3 fix(voice): revert to KDF_HKDF=1 with raw keys — proto value 0 is PBKDF2 not raw
e2ee_patch.py shows KDF_PBKDF2=0, KDF_HKDF=1.
Our KDF_NONE=0 was actually PBKDF2, double-deriving keys and causing silence.
Removed Python HKDF pre-derivation — let Rust FFI apply HKDF internally.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-22 09:26:44 +02:00
Christian Gick
6b457a2aef fix(voice): use correct HKDF info=128zeros, length=16 matching LiveKit JS SDK
LiveKit JS SDK deriveKeys(): info=new ArrayBuffer(128) (128 zero bytes, NOT identity), output=16 bytes AES-128.
Previous code used identity as info and 32-byte output - both wrong, caused silence in both directions.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-22 09:09:34 +02:00
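The parameters named in 6b457a2aef (128 zero bytes as `info`, 16-byte output) can be illustrated with a small standard-library HKDF. This is a sketch under stated assumptions, not the code from the repository: SHA-256 as the hash is assumed, the salt string is taken from the earlier commit 4f8bfbe479, and the function names are hypothetical. Note that the later commit 5973ed1db3 removed Python-side pre-derivation again, so this only shows what the parameters mean.

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int) -> bytes:
    """HKDF as specified in RFC 5869, using HMAC-SHA-256 (extract, then expand)."""
    # Extract: an empty salt is replaced by HashLen (32) zero bytes.
    prk = hmac.new(salt or b"\x00" * 32, ikm, hashlib.sha256).digest()
    # Expand: chain HMAC blocks until enough output material exists.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_frame_key(base_key: bytes) -> bytes:
    """Derive a 16-byte (AES-128) frame key (hypothetical helper)."""
    # info = 128 zero bytes and length = 16 are the values stated in the commit;
    # the hash function (SHA-256) is an assumption.
    return hkdf_sha256(base_key, b"LKFrameEncryptionKey", bytes(128), 16)
```

Using the participant identity as `info` or asking for 32 bytes yields a different key, which is consistent with the silence in both directions that the commit describes.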
Christian Gick
4f8bfbe479 fix(voice): pre-derive HKDF in Python, use KDF_NONE to bypass Rust FFI HKDF
Rust FFI's KDF_HKDF path for incoming decryption may use wrong parameters.
Pre-derive HKDF(base_key, salt="LKFrameEncryptionKey", info=identity) in Python
and pass derived key with KDF_NONE so Rust FFI uses it directly as frame key.

Matches EC's MatrixKeyProvider: ratchetWindowSize=10, keyringSize=256.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-22 08:47:41 +02:00
Christian Gick
c330900a3a fix(voice): wait for key rotation via nio sync, not HTTP fetch
io.element.call.encryption_keys events are Megolm-encrypted in this room
(appear as m.room.encrypted). The HTTP fetch cannot decrypt them — only
the nio sync client can via Olm/Megolm decryption.

Change the post-connect rotation poll to check self._caller_all_keys
directly (updated by on_encryption_key() via nio sync) instead of calling
_fetch_encryption_key_http() which always returns nothing in encrypted rooms.

Also extends wait to 10s and adds progress logging every 2s.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-22 08:23:17 +02:00
Christian Gick
cf519595d6 fix(voice): poll for EC key rotation post-connect, set all key indices
Element Call rotates its encryption key when a new participant joins the
LiveKit room. Previously the bot fetched only the pre-join key and set it
at index 0, while EC was already encrypting with the rotated key (index 1).

Changes:
- After connecting to LiveKit, poll the Matrix timeline up to 5s (10×0.5s)
  to detect the post-join key rotation
- Set ALL known caller key indices (not just 0) so the Rust FFI cryptor
  has the correct key regardless of which index EC is currently using
- Also set via caller_identity (belt+suspenders) if different from LK identity

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-22 08:20:44 +02:00
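The polling and "set all indices" steps from cf519595d6 could be sketched as below. All names here (`wait_for_rotation`, `apply_all_keys`, the `get_keys` and `set_key` callables) are hypothetical stand-ins; the only details taken from the log are the 10 × 0.5 s polling budget and the `set_key(identity, key, index)` call shape mentioned in a neighbouring commit.

```python
import asyncio
from typing import Callable, Dict


async def wait_for_rotation(
    get_keys: Callable[[], Dict[int, bytes]],
    known: Dict[int, bytes],
    attempts: int = 10,
    interval: float = 0.5,
) -> Dict[int, bytes]:
    """Poll until the key map differs from `known`; give up after attempts * interval seconds."""
    for _ in range(attempts):
        keys = get_keys()
        if keys != known:
            return dict(keys)  # rotation detected
        await asyncio.sleep(interval)
    return dict(known)  # no rotation seen; keep the pre-join keys


def apply_all_keys(
    set_key: Callable[[str, bytes, int], None],
    identity: str,
    keys: Dict[int, bytes],
) -> None:
    """Install every known key index rather than only index 0."""
    for index, key in sorted(keys.items()):
        set_key(identity, key, index)
```

Installing every index means the receiving side does not have to guess which index the caller is currently encrypting with.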
Christian Gick
8b143a2ac4 debug(e2ee): poll frame_cryptors() every 30s for state diagnosis 2026-02-22 08:14:19 +02:00
Christian Gick
630a0de970 fix(e2ee): revert to per-participant mode with proper rotation handling
The shared-key mode uses HKDF with empty info, but Element Call JS uses
participant identity as HKDF info. Per-participant mode (set_key with
identity) matches EC's derivation.

Previous per-participant attempt (b65d043) failed because key rotation
(index 0→1 when bot joins) wasn't handled. Now on_encryption_key calls
set_key(caller_id, key, index) on rotation, so the bot stays in sync.

Changes:
- _build_e2ee_options(): remove caller_key param, shared_key=b"" (per-participant mode)
- _run(): set_key(remote_identity, caller_key, 0) for incoming decryption
- on_encryption_key: only set_key() on rotation (no set_shared_key)
2026-02-22 08:10:27 +02:00
Christian Gick
295c0ed5cb debug(e2ee): decode encryption state to human-readable names 2026-02-22 08:00:28 +02:00
Christian Gick
a6236a3817 debug(e2ee): update both shared+per-participant keys on rotation 2026-02-22 07:53:40 +02:00
Christian Gick
b22c4d48e9 debug(e2ee): add e2ee_state_changed event listener for diagnostics
Log DECRYPTION_FAILED / MISSING_KEY / OK states per participant
to pinpoint exactly what the Rust FFI reports about key setup.
2026-02-22 07:50:12 +02:00
Christian Gick
a8b30418c8 debug(e2ee): verify shared key + belt-suspenders per-participant key
Add export_shared_key() verification after connect to confirm key
is stored. Also set per-participant key for caller (belt+suspenders)
so both shared-key and per-participant decryption paths are active.
2026-02-22 07:47:09 +02:00
Christian Gick
65340bf0ee fix(e2ee): use set_shared_key for live key rotation updates
When Element Call sees the bot join, it rotates its encryption key
(index 0 → 1). The on_encryption_key callback was calling set_key()
(per-participant) which has no effect in shared-key mode. Switch to
set_shared_key() so the shared-key decryption path stays current when
the caller rotates keys.
2026-02-22 07:38:06 +02:00
Christian Gick
9cf4afc928 fix(e2ee): pass caller_key as shared_key at connect time
Per-participant set_key() for remote identities doesn't work for
incoming decryption in this Rust FFI build (set_shared_key() after
connect is also ignored in per-participant mode).

Solution: initialize with caller_key as shared_key (true shared-key
mode) so the Rust FFI uses it for incoming decryption. Then override
outgoing encryption via set_key(bot_identity, bot_key) after connect.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-22 07:31:14 +02:00
Christian Gick
4875a7dc9b fix(e2ee): add set_shared_key fallback for incoming audio decryption
Rust FFI may not use per-participant key for remote participant
decryption in all code paths. Set the caller key as both per-participant
AND shared key so either path works for incoming frame decryption.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-22 07:25:26 +02:00
Christian Gick
893e07a543 fix(e2ee): set caller keys at correct indices from timeline
Element Call may rotate encryption keys to index > 0. Previously we
always called set_key(identity, key, 0) regardless of the actual index,
causing decryption to fail when the active key was at a non-zero index.

- _fetch_encryption_key_http: collect all {index->key} pairs from event
- _run: set each caller key at its correct index
- on_encryption_key: handle multiple indices, remove first-key-only gate

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-22 07:19:28 +02:00
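Collecting every index-to-key pair from a key event, as 893e07a543 describes, might look like the sketch below. The event layout is an assumption (a `keys` list whose entries carry an integer `index` and a base64 `key`, possibly unpadded), and `collect_keys` is a hypothetical name, not the repository's `_fetch_encryption_key_http`.

```python
import base64
from typing import Any, Dict


def collect_keys(content: Dict[str, Any]) -> Dict[int, bytes]:
    """Return {index: raw key bytes} for every entry in an encryption_keys event content."""
    keys: Dict[int, bytes] = {}
    for entry in content.get("keys", []):
        encoded = entry["key"]
        # Normalise to the URL-safe alphabet and restore any stripped padding.
        normalised = encoded.replace("+", "-").replace("/", "_")
        normalised += "=" * (-len(normalised) % 4)
        keys[int(entry["index"])] = base64.urlsafe_b64decode(normalised)
    return keys
```

Each resulting pair can then be passed to `set_key(identity, key, index)` at its own index instead of always using index 0.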
Christian Gick
685218247a fix: Use empty bytes instead of None for shared_key
NoneType causes TypeError in patched room.py proto assignment.
Empty bytes is falsy so shared_key is not set in proto,
initializing key provider in per-participant mode.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-22 07:02:33 +02:00
Christian Gick
9ebf90c8bb fix: Use per-participant E2EE mode (no shared_key)
shared_key locks provider in shared-key mode, making set_key()
ineffective for per-participant decryption. Remove shared_key so
SDK initializes in per-participant mode. Also: failure_tolerance=-1
to prevent premature track closure on decrypt failures,
ratchet_window_size=16 to match Element Call.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-22 07:00:22 +02:00
Christian Gick
c290332a1e fix: Disable close_on_disconnect to keep session alive
E2EE key setup may briefly appear as participant disconnect.
Keep session alive to allow audio to flow once keys are settled.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-22 06:49:44 +02:00
Christian Gick
b65d04389b fix: Switch E2EE to per-participant keys instead of shared key
Element Call uses per-participant keys, not shared key mode.
Bot now generates its own key, publishes it, and sets both
keys via key_provider.set_key() after connecting.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-22 06:41:20 +02:00
Christian Gick
ced2783a09 fix: Enable E2EE with caller's key as shared key
Element Call now rejects unencrypted audio. Use caller's key
as shared_key so both sides encrypt/decrypt with the same key.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 20:51:43 +02:00
Christian Gick
4a93827de3 revert: Restore voice.py and bot.py to last known working state (9aef846)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 20:47:51 +02:00
Christian Gick
463286a61e revert: Disable E2EE at LiveKit level — shared key incompatible with Element Call
Element Call uses per-participant keys, LiveKit Python SDK shared key mode
cannot properly decrypt. Reverting to working state (no LiveKit E2EE).
Bot still publishes keys so Element Call shows encryption indicator.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 20:44:32 +02:00
Christian Gick
2d8a7b4420 fix: Use same shared key for both directions (caller key reuse)
Both bot and caller must use the same key in shared key mode.
Bot now reuses caller's key and publishes it back, instead of
generating a separate bot key.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 20:41:23 +02:00
Christian Gick
2fa13c4958 fix: Use caller key as shared_key at connect time for immediate decryption
Per-participant set_key alone with empty shared_key caused silent incoming audio.
Now connects with caller key as shared_key, then overlays per-participant keys.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 20:35:55 +02:00
Christian Gick
5f3e733ba5 feat: Voice prompt with model transparency, datetime, auto language
- Bot tells which model it uses when asked
- Injects current UTC datetime into prompt
- Responds in user's language instead of always German

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 20:28:13 +02:00
Christian Gick
939324ca76 chore: Switch voice to George (warm, captivating)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 20:20:27 +02:00
Christian Gick
e3ede3fc2c fix: Voice bot identifies as Agiliton Assistant, not Claude
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 19:40:24 +02:00
Christian Gick
533847c952 fix: Switch E2EE from shared key to per-participant key mode
Element Call uses per-participant keys via MatrixKeyProvider.onSetEncryptionKey(),
not shared key mode. This was causing silence with E2EE enabled.

- Set bot's own key and caller's key separately via e2ee_manager.key_provider.set_key()
- Live-update caller key when received after connect
- Fallback to set_shared_key if per-participant API unavailable

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 18:50:19 +02:00
Christian Gick
9aef846619 fix: disable E2EE to get working voice pipeline, E2EE fix deferred
Audio pipeline confirmed working without E2EE. E2EE key derivation
mismatch with Element Call needs separate investigation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 18:30:11 +02:00
Christian Gick
67f4b8159e fix: match Element Call HKDF params (ratchetWindowSize=10, keyringSize=256)
Re-enable E2EE with corrected parameters matching Element Call defaults.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 18:22:54 +02:00
Christian Gick
96183a8ccd test: temporarily disable E2EE to isolate audio pipeline issue
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 17:38:57 +02:00
Christian Gick
1bc044eaae fix: republish caller E2EE key as shared key, fallback to no-E2EE
Bot now publishes the same key as the caller so both sides can decrypt.
Falls back to no-encryption if no caller key received.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 17:35:31 +02:00
Christian Gick
08f4e115b9 fix: filter own key events, fix RoomOptions None, wait for participant
- Skip the bot's own encryption_keys events in on_unknown handler
- Always pass valid RoomOptions to AgentSession.start()
- Wait up to 10s for remote participant to connect before starting pipeline

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 17:33:12 +02:00
Christian Gick
6e1e9839cc fix: use timeline events for E2EE key exchange (not state events)
Element Call distributes encryption keys as timeline events, not room
state events. Changed bot to publish keys via room_send and fetch from
/messages endpoint instead of /state.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 17:28:56 +02:00
Christian Gick
4e1e372ca2 fix: use caller's E2EE key (not own), fetch via HTTP API
All participants must use the SAME shared key. Bot was generating
its own key which couldn't decrypt user's audio. Now:
1. Fetch caller's key from room state via HTTP API
2. Fall back to waiting for key via sync handler
3. Publish the SAME key back (not a new one)
4. Only connect with E2EE if key available

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 17:17:33 +02:00
Christian Gick
753d6543d4 fix: generate and publish E2EE key, always connect with encryption
Element Call encrypts media by default. Bot must:
1. Generate its own 32-byte E2EE key
2. Publish it to room state (io.element.call.encryption_keys)
3. Connect to LiveKit with HKDF E2EE enabled
4. Use caller's key when received, own key as fallback

This fixes: the "Nicht verschlüsselt" ("Not encrypted") warning and silent audio
(encrypted frames couldn't be decoded by VAD/STT)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 17:06:34 +02:00
Christian Gick
74758a3f13 debug: enable livekit.agents debug logging for STT/VAD diagnosis
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 17:02:24 +02:00
Christian Gick
75970fc06b fix: link AgentSession to remote participant + debug speech events
- Pass participant_identity via RoomOptions so AgentSession knows
  which audio track to consume (was silently ignoring user audio)
- Add USER_SPEECH and AGENT_SPEECH event handlers for debugging
- Simplify greeting to exact text to prevent hallucination
- Use httpx for room state scan (nio API was unreliable)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 15:03:24 +02:00
Christian Gick
85df4b295f fix: E2EE key timing + verbose logging + shorter greeting
- Reorder: send call member event BEFORE creating VoiceSession
- Store VoiceSession BEFORE start so sync handler can forward keys
- Increase E2EE key wait from 3s to 10s
- Add INFO-level logging for key lookup + room state scan via HTTP API
- Tighten voice system prompt to prevent long rambling greetings

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 14:55:52 +02:00
Christian Gick
80582860b9 fix: E2EE key lookup for Element Call voice sessions
- Fix state_key format: try @user:domain:DEVICE_ID (Element Call format),
  then @user:domain, then scan all room state as fallback
- Publish bot E2EE key to room so Element shows encrypted status
- Extract caller device_id from call member event content
- Also fix pipecat-poc pipeline with context aggregators (CF-1579)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 14:51:26 +02:00
Christian Gick
c60dfc0cef chore: Trigger rebuild 2026-02-21 14:43:14 +02:00
Christian Gick
54a2180b52 chore: Trigger rebuild 2026-02-21 14:36:06 +02:00