Commit Graph

294 Commits

Author SHA1 Message Date
Christian Gick
939324ca76 chore: Switch voice to George (warm, captivating)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 20:20:27 +02:00
Christian Gick
e3ede3fc2c fix: Voice bot identifies as Agiliton Assistant, not Claude
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 19:40:24 +02:00
Christian Gick
533847c952 fix: Switch E2EE from shared key to per-participant key mode
Element Call uses per-participant keys via MatrixKeyProvider.onSetEncryptionKey(),
not shared key mode. This was causing silence with E2EE enabled.

- Set bot's own key and caller's key separately via e2ee_manager.key_provider.set_key()
- Live-update caller key when received after connect
- Fallback to set_shared_key if per-participant API unavailable

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 18:50:19 +02:00
Christian Gick
9aef846619 fix: disable E2EE to get working voice pipeline, E2EE fix deferred
Audio pipeline confirmed working without E2EE. E2EE key derivation
mismatch with Element Call needs separate investigation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 18:30:11 +02:00
Christian Gick
67f4b8159e fix: match Element Call HKDF params (ratchetWindowSize=10, keyringSize=256)
Re-enable E2EE with corrected parameters matching Element Call defaults.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 18:22:54 +02:00
Christian Gick
96183a8ccd test: temporarily disable E2EE to isolate audio pipeline issue
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 17:38:57 +02:00
Christian Gick
1bc044eaae fix: republish caller E2EE key as shared key, fallback to no-E2EE
Bot now publishes the same key as the caller so both sides can decrypt.
Falls back to no-encryption if no caller key received.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 17:35:31 +02:00
Christian Gick
08f4e115b9 fix: filter own key events, fix RoomOptions None, wait for participant
- Skip the bot's own encryption_keys events in on_unknown handler
- Always pass valid RoomOptions to AgentSession.start()
- Wait up to 10s for remote participant to connect before starting pipeline

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 17:33:12 +02:00
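Note on 08f4e115b9: the "wait up to 10s for remote participant" step can be sketched with a bounded wait on an event. This is only an illustration; `wait_for_participant` and the `joined` event are hypothetical names, and the real bot presumably sets such a signal from its LiveKit participant-connected handling.

```python
import asyncio


async def wait_for_participant(joined: asyncio.Event, timeout: float = 10.0) -> bool:
    """Return True if `joined` is set within `timeout` seconds, else False.

    `joined` is a hypothetical event that some participant-connected
    callback would set; it is not taken from the original code.
    """
    try:
        await asyncio.wait_for(joined.wait(), timeout=timeout)
        return True
    except asyncio.TimeoutError:
        # Caller decides what to do: abort, or start the pipeline anyway.
        return False
```

Returning a boolean instead of raising keeps the decision (start without a caller, or give up) at the call site.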
Christian Gick
6e1e9839cc fix: use timeline events for E2EE key exchange (not state events)
Element Call distributes encryption keys as timeline events, not room
state events. Changed bot to publish keys via room_send and fetch from
/messages endpoint instead of /state.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 17:28:56 +02:00
Christian Gick
4e1e372ca2 fix: use caller's E2EE key (not own), fetch via HTTP API
All participants must use the SAME shared key. Bot was generating
its own key which couldn't decrypt user's audio. Now:
1. Fetch caller's key from room state via HTTP API
2. Fall back to waiting for key via sync handler
3. Publish the SAME key back (not a new one)
4. Only connect with E2EE if key available

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 17:17:33 +02:00
Christian Gick
753d6543d4 fix: generate and publish E2EE key, always connect with encryption
Element Call encrypts media by default. Bot must:
1. Generate its own 32-byte E2EE key
2. Publish it to room state (io.element.call.encryption_keys)
3. Connect to LiveKit with HKDF E2EE enabled
4. Use caller's key when received, own key as fallback

This fixes: the "Nicht verschlüsselt" ("Not encrypted") warning and silent audio
(encrypted frames couldn't be decoded by VAD/STT)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 17:06:34 +02:00
Christian Gick
74758a3f13 debug: enable livekit.agents debug logging for STT/VAD diagnosis
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 17:02:24 +02:00
Christian Gick
75970fc06b fix: link AgentSession to remote participant + debug speech events
- Pass participant_identity via RoomOptions so AgentSession knows
  which audio track to consume (was silently ignoring user audio)
- Add USER_SPEECH and AGENT_SPEECH event handlers for debugging
- Simplify greeting to exact text to prevent hallucination
- Use httpx for room state scan (nio API was unreliable)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 15:03:24 +02:00
Christian Gick
85df4b295f fix: E2EE key timing + verbose logging + shorter greeting
- Reorder: send call member event BEFORE creating VoiceSession
- Store VoiceSession BEFORE start so sync handler can forward keys
- Increase E2EE key wait from 3s to 10s
- Add INFO-level logging for key lookup + room state scan via HTTP API
- Tighten voice system prompt to prevent long rambling greetings

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 14:55:52 +02:00
Christian Gick
80582860b9 fix: E2EE key lookup for Element Call voice sessions
- Fix state_key format: try @user:domain:DEVICE_ID (Element Call format),
  then @user:domain, then scan all room state as fallback
- Publish bot E2EE key to room so Element shows encrypted status
- Extract caller device_id from call member event content
- Also fix pipecat-poc pipeline with context aggregators (CF-1579)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 14:51:26 +02:00
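Note on 80582860b9: the state_key lookup order (device-qualified key, then bare user ID, then a scan) might look roughly like the sketch below. It assumes room state is available as a list of plain Matrix event dicts; `find_key_event` is a hypothetical helper, and the prefix match used for the scan step is an assumption about what "scan all room state" means.

```python
from typing import Any, Dict, List, Optional

KEY_EVENT_TYPE = "io.element.call.encryption_keys"


def find_key_event(
    state_events: List[Dict[str, Any]],
    user_id: str,
    device_id: Optional[str] = None,
) -> Optional[Dict[str, Any]]:
    """Look up a caller's key event, trying the most specific state_key first."""
    key_events = [e for e in state_events if e.get("type") == KEY_EVENT_TYPE]
    by_state_key = {e.get("state_key", ""): e for e in key_events}

    # 1. @user:domain:DEVICE_ID, 2. @user:domain
    candidates = []
    if device_id:
        candidates.append(f"{user_id}:{device_id}")
    candidates.append(user_id)
    for candidate in candidates:
        if candidate in by_state_key:
            return by_state_key[candidate]

    # 3. Fallback scan (assumed): any key event whose state_key starts with the user ID.
    for event in key_events:
        if event.get("state_key", "").startswith(user_id):
            return event
    return None
```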
Christian Gick
c60dfc0cef chore: Trigger rebuild
2026-02-21 14:43:14 +02:00
Christian Gick
54a2180b52 chore: Trigger rebuild
2026-02-21 14:36:06 +02:00
Christian Gick
a6b1f46116 fix: use python:3.11-slim-trixie for GLIBC 2.38 compat with patched FFI
rust:latest produces FFI needing CXXABI_1.3.15 (GCC 14 libstdc++).
GCC 14 libstdc++ needs GLIBC 2.38. Bookworm only has 2.36.
Trixie has GLIBC 2.38+ — fixes the CXXABI_1.3.15 runtime error.
Also reverts to rust:latest since bookworm GCC 12 can't compile webrtc C++20.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 18:13:50 +02:00
Christian Gick
6d99eb172e chore: Trigger rebuild
2026-02-20 18:01:58 +02:00
Christian Gick
cffedb53a3 fix: use rust:bookworm build stage for CXXABI compat with python:3.11-slim-bookworm
rust:latest links against GLIBC_2.38 libstdc++ which is incompatible with bookworm.
rust:bookworm (1.93.1) produces FFI binary compatible with bookworm libstdc++.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 18:01:42 +02:00
Christian Gick
4b8b36d463 chore: Trigger rebuild
2026-02-20 17:22:51 +02:00
Christian Gick
541a26c354 fix: revert libstdc++ copy that broke GLIBC compat
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 17:22:36 +02:00
Christian Gick
c0b212eaae chore: Trigger rebuild
2026-02-20 17:22:07 +02:00
Christian Gick
85f8df5690 fix: VoiceSession cleanup on call leave + CXXABI compat + proactive E2EE key read
- Stop VoiceSession when call leave event received
- Copy libstdc++ from rust build stage to fix CXXABI_1.3.15 mismatch
- Read caller encryption key from room state before starting VoiceSession

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 17:21:51 +02:00
Christian Gick
e5e8b56482 fix(e2ee): Add E2EE HKDF to voice.py, bot uses patched Dockerfile
voice.py runs in bot container, not agent container.
- Wait 3s for encryption key before connecting
- Build E2EE options with HKDF when key received
- Bot container now uses patched Dockerfile (needs FFI)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 17:13:53 +02:00
Christian Gick
e2b7233077 fix: Clone with --recurse-submodules for yuv-sys/libyuv, add make/clang
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 16:53:57 +02:00
Christian Gick
2e040a9086 fix: Add nasm for yuv-sys build
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 16:47:52 +02:00
Christian Gick
78cae61b90 fix: Add libva-dev and libglib2.0-dev for webrtc-sys build
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 16:42:25 +02:00
Christian Gick
10de057829 fix: Use rust:latest for time crate MSRV 1.88
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 16:33:58 +02:00
Christian Gick
85e76a468f fix: Use Rust 1.85 for edition2024 crate support
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 16:32:06 +02:00
Christian Gick
2234086791 chore: Trigger rebuild
2026-02-20 16:29:56 +02:00
Christian Gick
9b95f05488 chore: Trigger rebuild
2026-02-20 16:29:25 +02:00
Christian Gick
fc3d915939 feat(e2ee): Add HKDF E2EE support for Element Call compatibility
Element Call uses HKDF-SHA256 + AES-128-GCM for frame encryption,
while the LiveKit Rust SDK defaults to PBKDF2 + AES-256-GCM.

- Multi-stage Dockerfile builds patched Rust FFI from EC-compat fork
- Generates Python protobuf bindings with new fields
- patch_sdk.py modifies installed livekit-rtc for new proto fields
- agent.py passes E2EE options with HKDF to ctx.connect()
- bot.py exchanges encryption keys via Matrix state events
- Separate Dockerfile.bot for bot service (no Rust build needed)

Ref: livekit/rust-sdks#904, livekit/python-sdks#570

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 16:29:06 +02:00
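Note on fc3d915939: for readers unfamiliar with the HKDF step, here is a minimal standard-library sketch of HKDF-SHA256 (RFC 5869) deriving a 16-byte key, the size AES-128 needs. It is not the patched Rust FFI code. The salt and info values in the usage line are placeholders, not Element Call's actual parameters.

```python
import hashlib
import hmac

HASH_LEN = 32  # SHA-256 output size in bytes


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int) -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract, then expand to `length` bytes."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested output too long for HKDF-SHA256")
    if not salt:
        salt = b"\x00" * HASH_LEN  # RFC 5869 default when no salt is given

    # Extract: concentrate the input keying material into a pseudorandom key.
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()

    # Expand: chain HMAC blocks until enough output is produced.
    okm = b""
    block = b""
    counter = 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


# Placeholder inputs: a 32-byte all-zero key and a made-up info string.
frame_key = hkdf_sha256(ikm=bytes(32), salt=b"", info=b"example-info", length=16)
```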
Christian Gick
578b6bb56f fix: compute correct LiveKit room name hash for Element Call
Element Call uses SHA256(room_id + "|m.call#ROOM") encoded as unpadded
base64 for LiveKit room names (via lk-jwt-service). The bot was using
the raw Matrix room ID, causing agent and user to join different rooms.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 07:16:35 +00:00
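Note on 578b6bb56f: the room-name derivation described above can be sketched in a few lines. `livekit_room_name` is a hypothetical helper name, and the standard (not URL-safe) base64 alphabet is an assumption, since the commit message only says "unpadded base64".

```python
import base64
import hashlib


def livekit_room_name(matrix_room_id: str) -> str:
    """SHA256(room_id + "|m.call#ROOM"), base64-encoded with padding stripped."""
    digest = hashlib.sha256(f"{matrix_room_id}|m.call#ROOM".encode("utf-8")).digest()
    # Assumption: standard base64 alphabet; only the trailing "=" is removed.
    return base64.b64encode(digest).decode("ascii").rstrip("=")
```

A 32-byte digest always encodes to 43 characters once the single padding character is removed, which is a quick sanity check when comparing against the room name the caller actually joined.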
Christian Gick
39552d5e90 chore: Trigger rebuild
2026-02-20 06:26:50 +02:00
Christian Gick
4b3dc11ae3 chore: Trigger rebuild
2026-02-20 06:26:24 +02:00
Christian Gick
4cd7a0262e feat: Replace JSON memory with pgvector semantic search (MAT-11)
Add memory-service (FastAPI + pgvector) for semantic memory storage.
Bot now queries relevant memories per conversation instead of dumping all 50.
Includes migration script for existing JSON files.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 06:25:50 +02:00
Christian Gick
0c674f1467 chore: Trigger rebuild
2026-02-19 10:11:31 +02:00
Christian Gick
bf81f7d0b9 fix: Remove !ai memory command — natural conversation only
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 10:11:09 +02:00
Christian Gick
c8661e5aea chore: Trigger rebuild
2026-02-19 10:08:58 +02:00
Christian Gick
b5c33f4701 fix: Fix memory system persistence and consolidate language prefs
- Replace separate bot-crypto/bot-memories volumes with single bot-data:/data
  volume so user_keys.json and language_prefs.json persist across restarts
- Remove redundant language_prefs.json infrastructure (constant, load/save,
  dict) — language preference now read from memories (last match wins)
- Add robust JSON extraction in _extract_memories (regex fallback for
  markdown fences, embedded arrays, non-array responses)
- Add info-level logging throughout memory extraction pipeline
- Add asyncio.wait_for timeout (15s) on memory extraction to prevent hangs
- Add !ai memory <fact> command for explicit, reliable memory storage
- Update _get_preferred_language to return last match (most recent wins)
- Update !ai forget to clear in-memory caches (pending translate/reply)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 09:49:05 +02:00
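Note on b5c33f4701: the "robust JSON extraction" idea (try the raw text, then a markdown fence, then an embedded array) could look something like the sketch below. This is not the actual `_extract_memories` code; `extract_json_array` is a hypothetical name, and wrapping a non-array value in a list is an assumption about how non-array responses are handled.

```python
import json
import re
from typing import Any, List

FENCE = "`" * 3  # markdown code-fence marker, built indirectly to keep this block readable


def extract_json_array(text: str) -> List[Any]:
    """Best-effort extraction of a JSON array from an LLM response."""
    candidates = [text.strip()]

    # Fallback 1: contents of a markdown fence, with an optional "json" tag.
    fenced = re.search(FENCE + r"(?:json)?\s*(.*?)" + FENCE, text, re.DOTALL)
    if fenced:
        candidates.append(fenced.group(1).strip())

    # Fallback 2: first "[" through last "]" anywhere in the text.
    embedded = re.search(r"\[.*\]", text, re.DOTALL)
    if embedded:
        candidates.append(embedded.group(0))

    for candidate in candidates:
        try:
            parsed = json.loads(candidate)
        except json.JSONDecodeError:
            continue
        # Assumption: a non-array JSON value is wrapped into a one-element list.
        return parsed if isinstance(parsed, list) else [parsed]
    return []
```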
Christian Gick
2fd5806654 chore: Trigger rebuild
2026-02-19 09:06:24 +02:00
Christian Gick
94bf621490 fix: memory persistence + language auto-detection for translation workflow
- Upgrade memory/translation debug logs from debug to warning level
- Auto-detect language preference from extracted memory facts
- Persist language prefs to separate JSON file for reliability
- Add translation detection logging
- Use single linebreaks in translation menu

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 09:06:07 +02:00
Christian Gick
e431c3fb94 chore: Trigger rebuild
2026-02-19 08:57:06 +02:00
Christian Gick
d6c30abca3 feat: DM translation workflow for forwarded foreign messages
Detect when a DM message is in a foreign language and offer an
interactive menu: translate, compose reply in that language, or
respond normally. Supports forwarded WhatsApp messages via Element.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 08:56:49 +02:00
Christian Gick
2cf69b30df chore: Trigger rebuild
2026-02-19 08:19:27 +02:00
Christian Gick
d7e32acfcb feat: Add persistent user memory system
- Extract and store memorable facts (name, language, preferences) per user
- Inject memories into system prompt for personalized responses
- LLM-based extraction after each response, deduplication against existing
- JSON files on Docker volume (/data/memories), capped at 50 per user
- System prompt updated: respond in user's language, use memories
- Commands: !ai memories (view), !ai forget (delete all)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 08:19:12 +02:00
Christian Gick
420b8a1e73 chore: Trigger rebuild
2026-02-19 07:11:22 +02:00
Christian Gick
eef850f7ac fix: Handle encrypted images + link text to recent images
- Add RoomEncryptedImage callback with decrypt_attachment for E2E rooms
- Cache recent images per room (60s TTL) so follow-up text messages
  like "was ist das" ("what is that") get the image context instead of hallucinating
- Treat filenames (containing dots) as no-caption, default to
  "What's in this image?"

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 07:11:07 +02:00
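Note on eef850f7ac: the per-room image cache with a 60s TTL can be pictured as a small dict keyed by room ID that stores a timestamp alongside the image. `RecentImageCache` and its methods are hypothetical names; the injectable clock is there only to make the sketch testable.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class RecentImageCache:
    """Keeps the most recent image per room and forgets it after a TTL."""

    def __init__(
        self,
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, bytes]] = {}

    def put(self, room_id: str, image: bytes) -> None:
        # A newer image in the same room replaces the older one.
        self._entries[room_id] = (self._clock(), image)

    def get(self, room_id: str) -> Optional[bytes]:
        entry = self._entries.get(room_id)
        if entry is None:
            return None
        stored_at, image = entry
        if self._clock() - stored_at > self._ttl:
            # Expired: drop it so a later text message gets no stale image.
            del self._entries[room_id]
            return None
        return image
```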
Christian Gick
8fa6b7a49c chore: Trigger rebuild
2026-02-19 06:46:21 +02:00