Commit Graph

356 Commits

Author SHA1 Message Date
Christian Gick
b19300d3ce feat: Add confluence_search tool to voice bot
Voice bot could read/update Confluence pages but could not search.
Users asking to search Confluence got a refusal. Now the voice bot
has search_confluence using CQL queries via the service account.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 12:48:50 +02:00
Christian Gick
a3365626ae chore: Trigger rebuild 2026-02-26 12:39:20 +02:00
Christian Gick
11b80f07c6 chore: Trigger rebuild 2026-02-26 11:08:53 +02:00
Christian Gick
9a879f566d fix: Use Confluence v2 API for page reads
The v1 /wiki/rest/api/content/{id} endpoint returns 410 Gone.
Switch to /wiki/api/v2/pages/{id} with body-format=storage parameter.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 11:08:29 +02:00
Christian Gick
3a5d37fac2 chore: Trigger rebuild 2026-02-26 10:25:07 +02:00
Christian Gick
f3b6f3f2f0 chore: Trigger rebuild 2026-02-26 10:21:02 +02:00
Christian Gick
48f6e7dd17 feat: Add Atlassian tools and agentic tool-calling loop
- Add AtlassianClient class: fetches per-user OAuth tokens from portal,
  calls Jira and Confluence REST APIs on behalf of users
- Add 7 Atlassian tools: confluence_search, confluence_read_page,
  jira_search, jira_get_issue, jira_create_issue, jira_add_comment,
  jira_transition
- Replace single LLM call with agentic loop (max 5 iterations)
  that feeds tool results back to the model
- Add PORTAL_URL and BOT_API_KEY env vars to docker-compose
- Update system prompt with Atlassian tool guidance

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 10:15:15 +02:00
Christian Gick
08a3c4a9cc refactor(CF-1812): Replace inline confluence-collab copy with git submodule
Single source of truth at christian/confluence-collab.git — eliminates stale copy drift.
Dockerfile COPY unchanged, works identically with submodule.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 12:30:31 +02:00
Christian Gick
9958fb9b6b fix: Update confluence-collab proxy with proper async lifecycle (CF-1812)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 11:51:29 +02:00
Christian Gick
b492abe0c9 fix: Copy confluence-collab package instead of symlink for Docker build
Symlinks dont resolve on remote VM during Docker build context.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 11:38:39 +02:00
Christian Gick
3ea4d5abc8 chore: Trigger rebuild 2026-02-24 11:38:00 +02:00
Christian Gick
9e146da3b0 feat(CF-1812): Use confluence-collab for section-based page editing
Replace inline regex section parser in voice.py with confluence_collab
library (BS4 parsing, 409 conflict retry). Bot now loads section outline
into LLM context when Confluence links are detected.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 11:37:37 +02:00
Christian Gick
3e60e822be fix: Text bot now reads Confluence pages and includes room docs in LLM context
Three issues fixed:
1. Confluence URLs were detected but content never fetched - now reads
   the actual page via API so the LLM can work with it
2. Room document context (PDFs, Confluence, images) was stored but never
   passed to the text LLM - now included as system message
3. Conversation history increased from 10 to 30 messages for better
   context in collaborative sessions

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 08:03:45 +02:00
Christian Gick
326a874aa7 feat: Add on-demand camera/screen vision via look_at_screen tool
Voice bot can now see the users camera or screen share when asked.
Captures a single frame, encodes as JPEG, sends to Sonnet vision
with full context (transcript + document). Triggered by phrases like
schau mal, siehst du das, can you see this.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 06:36:52 +02:00
Christian Gick
cfb26fb351 feat: Add doubt triggers to think_deeper tool
"bist du dir sicher" / "are you sure" / "stimmt das wirklich" now also
trigger Opus escalation for fact-checking the previous answer.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 06:23:51 +02:00
Christian Gick
6081f9a7ec feat(MAT-46): Add think_deeper tool for Opus escalation in voice calls
Sonnet can now escalate complex questions to Opus via a function tool,
same pattern as search_web and read_confluence_page. Full context
(transcript + document) is passed automatically. Triggered by user
phrases like "denk genauer nach" / "think harder" or when Sonnet is
unsure about complex analysis.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 06:13:44 +02:00
Christian Gick
de66ba5eea feat(MAT-46): Extract and post document annotations after voice calls
When a voice call ends and a document was loaded in the room, the bot
now analyzes the transcript for document-specific changes/corrections
and posts them as a structured "Dokument-Aenderungen" message. Returns
nothing if no document changes were discussed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 20:18:00 +02:00
Christian Gick
a4b5c5da86 chore: Trigger rebuild 2026-02-23 19:56:53 +02:00
Christian Gick
6a6f9ef1c4 fix(voice): auto-use active Confluence page ID, allow roleplay on docs
- Confluence tools default to active page from room context — no more
  asking user for page_id
- Prompt allows roleplay/mock interviews when document context present
- Explicit instruction not to ask for page_id

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 14:31:49 +02:00
Christian Gick
c5e1c79e1b fix(voice): reduce phantom speech responses from ambient noise
- Raise VAD activation_threshold 0.50→0.65, min_speech_duration 0.2→0.4s
- Add ghost phrase filter: suppress 1-2 word hallucinations (Danke, Ja, etc)
- Strengthen prompt: stay silent unless clearly addressed

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 13:48:14 +02:00
Christian Gick
4a0679d1dc fix(bot): resolve Confluence short links (/wiki/x/...) and add env vars
Short links like /wiki/x/AQDbAw are resolved via redirect to get numeric
page ID. Also adds CONFLUENCE_* env var declarations to bot.py module level.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 13:15:43 +02:00
Christian Gick
b275e7cb88 feat(voice): add Confluence read/write tools for voice sessions
Enable realtime Confluence page editing during Element Call voice sessions.
- Add read_confluence_page and update_confluence_page function tools
- Detect Confluence URLs shared in Matrix rooms, store page ID for voice context
- Section-level updates via heading match + version-incremented PUT

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 13:09:34 +02:00
Christian Gick
e81aa79396 fix: increase voice PDF context to 40k chars, fix language detection sanity
- Voice context per-document limit 10k→40k chars (was cutting off at page 6)
- Language detection: reject results >30 chars (LLM returning sentences)
- Voice.py: generalize "PDF" label to "Dokumente"

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 12:40:13 +02:00
Christian Gick
751bfbd164 fix: encrypted file handler + summary heading/markup fixes
- Add RoomEncryptedFile handler for PDFs/docs in encrypted rooms
- Tell summary LLM not to include headings (prevents duplicate)
- Strip <br/> after block elements in _md_to_html

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 12:00:10 +02:00
Christian Gick
040d4c9285 fix(markup): add heading support to _md_to_html (h1/h2/h3)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 11:51:48 +02:00
Christian Gick
42ba3c09d0 feat(voice): all file types + images in voice context (MAT-10)
Generalize PDF-only voice context to support all document types:
- Rename _room_pdf_context → _room_document_context (list-based, 5 cap)
- Handle .docx (python-docx), .txt, .md, .csv, .json, .xml, .html, .yaml, .log
- Store AI image descriptions for voice context
- Multi-document context building with type labels and per-type truncation
- _respond_with_ai now returns reply text for caller use

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 11:45:54 +02:00
Christian Gick
90e662be96 feat(voice): PDF context in voice calls + call transcript summary (MAT-10)
Pass PDF document context from room to voice session so the voice LLM
can answer questions about uploaded PDFs. Persist call transcripts and
post an LLM-generated summary to the room when the call ends.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 11:21:31 +02:00
Christian Gick
1ec63b93f2 feat(voice): per-user timezone via memory preferences
- Store user timezone as [PREF:timezone] in memory service
- Query timezone preference on session start, override default
- Add set_user_timezone tool so bot learns timezone from conversation
- On time-relevant questions, bot asks if user is still at stored location
- Seeded Europe/Nicosia for @christian.gick:agiliton.eu

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 11:02:25 +02:00
Christian Gick
e84260f839 feat(prompt): add user timezone and LLM model to voice prompt
Bot now knows the user's timezone (Europe/Berlin default) and which
LLM model it's running on, so it can answer questions about both.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 10:56:40 +02:00
Christian Gick
277d6b5fe4 fix(e2ee): restore 3s key rotation wait, fix mute callback arg order
Removing the blocking wait entirely caused DEC_FAILED - the rotated key
had not arrived via nio sync before the pipeline started. Restore a short
3s wait (down from 10s) which is enough for nio to deliver the rotated key.

Also fix on_mute/on_unmute arg order (participant, publication - not reversed).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 10:43:38 +02:00
Christian Gick
a11cafc1d6 feat(memory): store full conversation exchanges instead of LLM-extracted facts
- Replace _extract_voice_memories with _store_voice_exchange
- Store raw "User: ... / Assistant: ..." pairs directly
- No LLM call needed — faster, cheaper, no lost context
- Load as "Frühere Gespräche" with full thread context

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 10:40:59 +02:00
Christian Gick
150df19be1 fix(tts): revert to multilingual_v2 for better quality, keep speed 1.15x
flash_v2_5 had audible compression artifacts. multilingual_v2 has higher
fidelity while speed=1.15 via VoiceSettings still gives snappier delivery.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 10:38:46 +02:00
Christian Gick
294fbac913 feat(tts): switch to flash model + speed 1.15x for snappier voice
- Model: eleven_multilingual_v2 → eleven_flash_v2_5 (lower latency)
- Speed: 1.15x via VoiceSettings
- Stability/similarity tuned for natural German speech

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 10:33:27 +02:00
Christian Gick
6443aa0668 chore: Trigger rebuild 2026-02-23 08:41:02 +02:00
Christian Gick
c532f4678d fix(e2ee): consolidate key timing + noise filtering (MAT-40, MAT-41)
- set_key() only called after frame cryptor exists (on_track_subscribed / late arrival)
- Remove 10s blocking key rotation wait; keys applied asynchronously
- Add DEC_FAILED (state 3) to e2ee_state recovery triggers
- VAD watchdog re-applies all E2EE keys on >30s stuck as recovery
- Expand STT artifact patterns (English variants, double-asterisk)
- Add NOISE_LEAK diagnostic logging at STT level

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 08:33:40 +02:00
Christian Gick
4b4a150fbf fix(e2ee): extend key rotation wait to 10s, debug late key events
EC rotates encryption key when bot joins LiveKit room. The rotated
key arrives via Matrix sync 3-5s later. Previous 2s wait was too
short - DEC_FAILED before new key arrived.

Extended wait to 10s. Added logging to bot.py to trace why late
key events were not being processed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-22 21:54:27 +02:00
Christian Gick
230c083b7b fix(e2ee): revert incorrect HKDF patch, remove pre-ratcheting
The HKDF sed patch in Dockerfile was wrong — it swapped salt/info
based on incorrect analysis of minified JS. The original Rust FFI
parameters are correct: salt="LKFrameEncryptionKey", info=[0;128].

Also removed Python-side HMAC pre-ratcheting of keys. Element Call
uses explicit key rotation via Matrix events, not HMAC ratcheting.

Added diagnostic logging to trace exact key bytes during E2EE setup.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-22 21:44:11 +02:00
Christian Gick
d30e9f8c83 chore: Trigger rebuild 2026-02-22 21:14:47 +02:00
Christian Gick
62be6b91d6 fix(e2ee): patch Rust FFI HKDF to match Element Call JS SDK parameters
EC JS SDK uses: salt=Uint8Array(8), info=encode("LKFrameEncryptionKey")
Rust FFI used: salt=ratchet_salt, info=[0u8;128]

The salt and info parameters were swapped, causing DEC_FAILED on every
call. This patch fixes the Rust HKDF derivation in the Dockerfile
before cargo build.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-22 21:14:31 +02:00
Christian Gick
220ad6cced chore: Trigger rebuild 2026-02-22 20:14:20 +02:00
Christian Gick
ea52236880 feat(e2ee): make E2EE configurable via E2EE_ENABLED env var
Allows disabling E2EE for diagnostic purposes. When disabled, bot
connects to LiveKit without frame encryption.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-22 20:14:06 +02:00
Christian Gick
5bfe0d0188 chore: Trigger rebuild 2026-02-22 20:01:14 +02:00
Christian Gick
e3be4512d9 fix(e2ee): use correct Element Call E2EE parameters
Inline E2EE options had 3 wrong values vs Element Call JS SDK:
- failure_tolerance=-1 (infinite, hid all DEC_FAILED) → 10
- key_ring_size=16 (too small, keys overflow) → 256
- ratchet_window_size=16 (wrong) → 10

Now uses _build_e2ee_options() which was already correct but never called.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-22 20:00:55 +02:00
Christian Gick
c2338fca46 chore: Trigger rebuild 2026-02-22 19:45:13 +02:00
Christian Gick
7b7079352f fix(noise): expand STT artifact filter to catch subtitle metadata leaks
ElevenLabs scribe_v2_realtime also produces non-asterisk artifacts like
"Untertitel: ARD Text im Auftrag von Funk (2017)" from TV/radio audio.
Add pattern matching for subtitle metadata, copyright notices, and
parenthetical/bracketed annotations.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-22 19:43:22 +02:00
Christian Gick
5984132f60 chore: Trigger rebuild 2026-02-22 19:38:06 +02:00
Christian Gick
9e0f2a15b6 chore: Trigger rebuild 2026-02-22 19:35:11 +02:00
Christian Gick
c38ab96054 chore(voice): switch to Robert Ranger voice
Replace Jack Marlowe (slow/raw) with Robert Ranger (deep/natural) for
a more pleasant conversational voice assistant experience.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-22 19:34:54 +02:00
Christian Gick
38c3d93adf chore: Trigger rebuild 2026-02-22 19:07:45 +02:00
Christian Gick
fa9e95b250 fix(noise): filter STT noise annotations via on_user_turn_completed
Replace broken _VoiceAgent stt_node override with _NoiseFilterAgent that uses
on_user_turn_completed() + StopResponse. This operates downstream of VAD+STT
so no backpressure risk to the audio pipeline.

When ElevenLabs scribe_v2_realtime produces *Störgeräusche* etc., the agent
now silently suppresses them before the LLM responds. The prompt-based filter
is kept as defense-in-depth.

Fixes: MAT-41

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-22 19:07:31 +02:00