Commit Graph

69 Commits

Author SHA1 Message Date
Christian Gick
9d2e2ddcf7 fix(MAT-13): Add DNS fallback via web search for browse_url
When browse_url fails with DNS resolution error (common with STT-misrecognized
domain names like "klicksports" instead of "clicksports"), automatically try a
web search to find the correct domain and retry. Applied to both text and voice bot.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-01 16:41:37 +02:00
Christian Gick
fb54ac2bea feat(MAT-13): Add conversation chunk RAG for Matrix chat history
Add semantic search over past conversations alongside existing memory facts.
New conversation_chunks table stores user-assistant exchanges with LLM-generated
summaries embedded for retrieval. Bot queries chunks on each message and injects
relevant past conversations into the system prompt. New exchanges are indexed
automatically after each bot response.

Memory-service: /chunks/store, /chunks/query, /chunks/bulk-store endpoints
Bot: chunk query + formatting, live indexing via asyncio.gather with memory extraction

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-01 07:48:19 +02:00
Christian Gick
6fe9607fb1 feat: Add web page browsing tool (browse_url) to voice and text bot
Both bots can now fetch and read web pages via browse_url tool.
Uses httpx + BeautifulSoup to extract clean text from HTML.
Complements existing web_search (Brave) with full page reading.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 16:26:17 +02:00
Christian Gick
34f403a066 feat(MAT-65): Remove WildFiles org-level fallback, require per-user key
No more shared org-level document search for unauthenticated users.
DocumentRAG.search() now returns empty if no API key provided.
Explicit !ai search command tells users to connect first.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 16:21:01 +02:00
Christian Gick
18607e39b5 fix(MAT-64): Convert --- to proper <hr/> in markdown-to-HTML
The _md_to_html method was missing horizontal rule conversion, so ---
rendered as literal dashes. Now converts to <hr/> and strips adjacent
<br/> tags for clean spacing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 13:54:24 +02:00
Christian Gick
7915d11463 fix(MAT-64): Ban headings and horizontal rules for compact output
System prompt now strictly forbids #/##/### headings and --- rules.
Uses **bold** for section titles instead, with no blank lines between
title and content, to eliminate excessive whitespace in Element.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 13:48:57 +02:00
Christian Gick
490822f3c3 fix(MAT-64): Inline source links and compact formatting
- System prompt now requires inline source links next to each claim
  instead of a separate "Quellen:" section at the bottom
- Use bold for sub-headings instead of ## to reduce padding/whitespace
- Limit horizontal rules for tighter message layout

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 13:45:49 +02:00
Christian Gick
1db4f1f3bd fix(MAT-64): Improve web search formatting and require source links
- Format search results as markdown links: [Title](URL)
- System prompt now requires a "Quellen:/Sources:" section with
  clickable links whenever web_search is used

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 13:38:54 +02:00
Christian Gick
2826455036 feat(MAT-64): Add web search tool to text bot
The text bot had no websearch capability while the voice agent did.
Added Brave Search integration as a web_search tool so the bot can
answer questions about current events and look up information.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 13:30:36 +02:00
Christian Gick
40a99c73f7 fix: Remove translation detection workflow from DM handler
The auto-detect language + translation menu was misidentifying regular
German messages and blocking normal responses. Bot now simply responds
in whatever language the user writes in, per updated system prompt.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 08:47:33 +02:00
Christian Gick
7791a5ba8e feat: add Confluence recent pages + Sentry error tracking (MAT-58, MAT-59)
MAT-58: Add recent_confluence_pages tool to both voice and text chat.
Shows last 5 recently modified pages so users can pick directly
instead of having to search every time.

MAT-59: Integrate sentry-sdk in all three entry points (agent.py,
bot.py, voice.py). SENTRY_DSN env var, traces at 10% sample rate.
Requires creating project in Sentry UI and setting DSN.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 08:44:57 +02:00
Christian Gick
10762a53da feat(MAT-57): Add Confluence write & create tools to voice and text chat
- Add create_confluence_page tool to voice mode (basic auth)
- Add confluence_update_page and confluence_create_page tools to text chat (OAuth)
- Fix update tool: wrap each paragraph in <p> tags instead of single wrapper
- Update system prompt to mention create capability

Previously only search/read were available. User reported bot couldn't
write to or create Confluence pages — because the tools didn't exist.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 08:04:01 +02:00
Christian Gick
9a879f566d fix: Use Confluence v2 API for page reads
The v1 /wiki/rest/api/content/{id} endpoint returns 410 Gone.
Switch to /wiki/api/v2/pages/{id} with body-format=storage parameter.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 11:08:29 +02:00
Christian Gick
48f6e7dd17 feat: Add Atlassian tools and agentic tool-calling loop
- Add AtlassianClient class: fetches per-user OAuth tokens from portal,
  calls Jira and Confluence REST APIs on behalf of users
- Add 7 Atlassian tools: confluence_search, confluence_read_page,
  jira_search, jira_get_issue, jira_create_issue, jira_add_comment,
  jira_transition
- Replace single LLM call with agentic loop (max 5 iterations)
  that feeds tool results back to the model
- Add PORTAL_URL and BOT_API_KEY env vars to docker-compose
- Update system prompt with Atlassian tool guidance

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 10:15:15 +02:00
Christian Gick
9e146da3b0 feat(CF-1812): Use confluence-collab for section-based page editing
Replace inline regex section parser in voice.py with confluence_collab
library (BS4 parsing, 409 conflict retry). Bot now loads section outline
into LLM context when Confluence links are detected.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 11:37:37 +02:00
Christian Gick
3e60e822be fix: Text bot now reads Confluence pages and includes room docs in LLM context
Three issues fixed:
1. Confluence URLs were detected but content never fetched - now reads
   the actual page via API so the LLM can work with it
2. Room document context (PDFs, Confluence, images) was stored but never
   passed to the text LLM - now included as system message
3. Conversation history increased from 10 to 30 messages for better
   context in collaborative sessions

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 08:03:45 +02:00
Christian Gick
de66ba5eea feat(MAT-46): Extract and post document annotations after voice calls
When a voice call ends and a document was loaded in the room, the bot
now analyzes the transcript for document-specific changes/corrections
and posts them as a structured "Dokument-Aenderungen" message. Returns
nothing if no document changes were discussed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 20:18:00 +02:00
Christian Gick
4a0679d1dc fix(bot): resolve Confluence short links (/wiki/x/...) and add env vars
Short links like /wiki/x/AQDbAw are resolved via redirect to get numeric
page ID. Also adds CONFLUENCE_* env var declarations to bot.py module level.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 13:15:43 +02:00
Christian Gick
b275e7cb88 feat(voice): add Confluence read/write tools for voice sessions
Enable realtime Confluence page editing during Element Call voice sessions.
- Add read_confluence_page and update_confluence_page function tools
- Detect Confluence URLs shared in Matrix rooms, store page ID for voice context
- Section-level updates via heading match + version-incremented PUT

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 13:09:34 +02:00
Christian Gick
e81aa79396 fix: increase voice PDF context to 40k chars, fix language detection sanity
- Voice context per-document limit 10k→40k chars (was cutting off at page 6)
- Language detection: reject results >30 chars (LLM returning sentences)
- Voice.py: generalize "PDF" label to "Dokumente"

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 12:40:13 +02:00
Christian Gick
751bfbd164 fix: encrypted file handler + summary heading/markup fixes
- Add RoomEncryptedFile handler for PDFs/docs in encrypted rooms
- Tell summary LLM not to include headings (prevents duplicate)
- Strip <br/> after block elements in _md_to_html

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 12:00:10 +02:00
Christian Gick
040d4c9285 fix(markup): add heading support to _md_to_html (h1/h2/h3)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 11:51:48 +02:00
Christian Gick
42ba3c09d0 feat(voice): all file types + images in voice context (MAT-10)
Generalize PDF-only voice context to support all document types:
- Rename _room_pdf_context → _room_document_context (list-based, 5 cap)
- Handle .docx (python-docx), .txt, .md, .csv, .json, .xml, .html, .yaml, .log
- Store AI image descriptions for voice context
- Multi-document context building with type labels and per-type truncation
- _respond_with_ai now returns reply text for caller use

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 11:45:54 +02:00
Christian Gick
90e662be96 feat(voice): PDF context in voice calls + call transcript summary (MAT-10)
Pass PDF document context from room to voice session so the voice LLM
can answer questions about uploaded PDFs. Persist call transcripts and
post an LLM-generated summary to the room when the call ends.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 11:21:31 +02:00
Christian Gick
4b4a150fbf fix(e2ee): extend key rotation wait to 10s, debug late key events
EC rotates encryption key when bot joins LiveKit room. The rotated
key arrives via Matrix sync 3-5s later. Previous 2s wait was too
short - DEC_FAILED before new key arrived.

Extended wait to 10s. Added logging to bot.py to trace why late
key events were not being processed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-22 21:54:27 +02:00
Christian Gick
52f8cb569c feat(voice): add cross-call memory and Brave Search tool
- Query user memories at call start and inject into agent system prompt
- Extract new facts after each exchange using claude-haiku via LiteLLM
- Add Brave Search tool (@function_tool) for current data queries
- Pass memory client and caller_user_id through VoiceSession constructor
- Pre-compute 8 HMAC-ratcheted EC keys for reliable E2EE decryption

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-22 15:27:59 +02:00
Christian Gick
2b8744de6e fix(voice): full E2EE bidirectional audio pipeline working
- bot.py: track active callers per room; only stop session when last
  caller leaves (fixes premature cancellation when Playwright browser
  hangs up while real app is still in call)

- voice.py: pre-compute 8 HMAC-ratcheted keys from EC's base key so
  decryption works immediately without waiting ~30s for Matrix to
  deliver EC's key-rotation event (root cause of user→bot silence)

- voice.py: fix set_key() argument order (identity, key, index) at all
  call sites — was (identity, index, key) causing TypeError

- voice.py: add audio frame monitor (AUDIO_FLOW) and mute/unmute event
  handlers for diagnostics

- voice.py: update livekit-agents 1.4.2 event names: user_state_changed,
  user_input_transcribed, conversation_item_added

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-22 15:17:35 +02:00
Christian Gick
b65d04389b fix: Switch E2EE to per-participant keys instead of shared key
Element Call uses per-participant keys, not shared key mode.
Bot now generates its own key, publishes it, and sets both
keys via key_provider.set_key() after connecting.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-22 06:41:20 +02:00
Christian Gick
4a93827de3 revert: Restore voice.py and bot.py to last known working state (9aef846)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 20:47:51 +02:00
Christian Gick
533847c952 fix: Switch E2EE from shared key to per-participant key mode
Element Call uses per-participant keys via MatrixKeyProvider.onSetEncryptionKey(),
not shared key mode. This was causing silence with E2EE enabled.

- Set bot's own key and caller's key separately via e2ee_manager.key_provider.set_key()
- Live-update caller key when received after connect
- Fallback to set_shared_key if per-participant API unavailable

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 18:50:19 +02:00
Christian Gick
1bc044eaae fix: republish caller E2EE key as shared key, fallback to no-E2EE
Bot now publishes the same key as the caller so both sides can decrypt.
Falls back to no-encryption if no caller key received.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 17:35:31 +02:00
Christian Gick
08f4e115b9 fix: filter own key events, fix RoomOptions None, wait for participant
- Skip bot own encryption_keys events in on_unknown handler
- Always pass valid RoomOptions to AgentSession.start()
- Wait up to 10s for remote participant to connect before starting pipeline

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 17:33:12 +02:00
Christian Gick
6e1e9839cc fix: use timeline events for E2EE key exchange (not state events)
Element Call distributes encryption keys as timeline events, not room
state events. Changed bot to publish keys via room_send and fetch from
/messages endpoint instead of /state.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 17:28:56 +02:00
Christian Gick
85df4b295f fix: E2EE key timing + verbose logging + shorter greeting
- Reorder: send call member event BEFORE creating VoiceSession
- Store VoiceSession BEFORE start so sync handler can forward keys
- Increase E2EE key wait from 3s to 10s
- Add INFO-level logging for key lookup + room state scan via HTTP API
- Tighten voice system prompt to prevent long rambling greetings

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 14:55:52 +02:00
Christian Gick
80582860b9 fix: E2EE key lookup for Element Call voice sessions
- Fix state_key format: try @user:domain:DEVICE_ID (Element Call format),
  then @user:domain, then scan all room state as fallback
- Publish bot E2EE key to room so Element shows encrypted status
- Extract caller device_id from call member event content
- Also fix pipecat-poc pipeline with context aggregators (CF-1579)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 14:51:26 +02:00
Christian Gick
85f8df5690 fix: VoiceSession cleanup on call leave + CXXABI compat + proactive E2EE key read
- Stop VoiceSession when call leave event received
- Copy libstdc++ from rust build stage to fix CXXABI_1.3.15 mismatch
- Read caller encryption key from room state before starting VoiceSession

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 17:21:51 +02:00
Christian Gick
fc3d915939 feat(e2ee): Add HKDF E2EE support for Element Call compatibility
Element Call uses HKDF-SHA256 + AES-128-GCM for frame encryption,
while the LiveKit Rust SDK defaults to PBKDF2 + AES-256-GCM.

- Multi-stage Dockerfile builds patched Rust FFI from EC-compat fork
- Generates Python protobuf bindings with new fields
- patch_sdk.py modifies installed livekit-rtc for new proto fields
- agent.py passes E2EE options with HKDF to ctx.connect()
- bot.py exchanges encryption keys via Matrix state events
- Separate Dockerfile.bot for bot service (no Rust build needed)

Ref: livekit/rust-sdks#904, livekit/python-sdks#570

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 16:29:06 +02:00
Christian Gick
578b6bb56f fix: compute correct LiveKit room name hash for Element Call
Element Call uses SHA256(room_id + "|m.call#ROOM") encoded as unpadded
base64 for LiveKit room names (via lk-jwt-service). The bot was using
the raw Matrix room ID, causing agent and user to join different rooms.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 07:16:35 +00:00
Christian Gick
4cd7a0262e feat: Replace JSON memory with pgvector semantic search (MAT-11)
Add memory-service (FastAPI + pgvector) for semantic memory storage.
Bot now queries relevant memories per conversation instead of dumping all 50.
Includes migration script for existing JSON files.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 06:25:50 +02:00
Christian Gick
bf81f7d0b9 fix: Remove !ai memory command — natural conversation only
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 10:11:09 +02:00
Christian Gick
b5c33f4701 fix: Fix memory system persistence and consolidate language prefs
- Replace separate bot-crypto/bot-memories volumes with single bot-data:/data
  volume so user_keys.json and language_prefs.json persist across restarts
- Remove redundant language_prefs.json infrastructure (constant, load/save,
  dict) — language preference now read from memories (last match wins)
- Add robust JSON extraction in _extract_memories (regex fallback for
  markdown fences, embedded arrays, non-array responses)
- Add info-level logging throughout memory extraction pipeline
- Add asyncio.wait_for timeout (15s) on memory extraction to prevent hangs
- Add !ai memory <fact> command for explicit, reliable memory storage
- Update _get_preferred_language to return last match (most recent wins)
- Update !ai forget to clear in-memory caches (pending translate/reply)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 09:49:05 +02:00
Christian Gick
94bf621490 fix: memory persistence + language auto-detection for translation workflow
- Upgrade memory/translation debug logs from debug to warning level
- Auto-detect language preference from extracted memory facts
- Persist language prefs to separate JSON file for reliability
- Add translation detection logging
- Use single linebreaks in translation menu

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 09:06:07 +02:00
Christian Gick
d6c30abca3 feat: DM translation workflow for forwarded foreign messages
Detect when a DM message is in a foreign language and offer an
interactive menu: translate, compose reply in that language, or
respond normally. Supports forwarded WhatsApp messages via Element.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 08:56:49 +02:00
Christian Gick
d7e32acfcb feat: Add persistent user memory system
- Extract and store memorable facts (name, language, preferences) per user
- Inject memories into system prompt for personalized responses
- LLM-based extraction after each response, deduplication against existing
- JSON files on Docker volume (/data/memories), capped at 50 per user
- System prompt updated: respond in users language, use memories
- Commands: !ai memories (view), !ai forget (delete all)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 08:19:12 +02:00
Christian Gick
eef850f7ac fix: Handle encrypted images + link text to recent images
- Add RoomEncryptedImage callback with decrypt_attachment for E2E rooms
- Cache recent images per room (60s TTL) so follow-up text messages
  like "was ist das" get the image context instead of hallucinating
- Treat filenames (containing dots) as no-caption, default to
  "What's in this image?"

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 07:11:07 +02:00
Christian Gick
2199de47f9 fix: Handle encrypted image upload for Matrix rooms (MAT-9)
Upload with encrypt=True and filesize param. Handle UploadError
gracefully. Use m.file encrypted format when encryption keys returned.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 22:35:21 +02:00
Christian Gick
5c5f442a74 feat: Add PDF reading support to Matrix AI bot (MAT-10)
- Register RoomMessageFile callback, filter for application/pdf
- Extract text from PDFs using pymupdf (fitz)
- Send extracted text as context to LLM for summarization/Q&A
- Truncate at 50k chars to avoid token limits
- Add pymupdf to requirements.txt

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 22:09:24 +02:00
Christian Gick
8b08056e0a feat: Add image reading and generation to Matrix AI bot (MAT-9)
- Register RoomMessageImage callback to handle incoming images
- Download and base64-encode images, send as multimodal content to LLM
- Add LLM tool calling with generate_image tool for natural image generation
- Upload generated images back to Matrix via m.image events
- Update system prompt to inform LLM about vision and image gen capabilities

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 21:54:45 +02:00
Christian Gick
f51e8d95e0 fix: Rename connect/disconnect to wildfiles connect/disconnect (WF-90)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 15:01:45 +02:00
Christian Gick
37757379ff feat: Add SSO device auth flow for !ai connect (WF-90)
!ai connect (no args) now starts a browser-based device authorization
flow instead of requiring a raw API key. Direct key input preserved
as fallback. Bot polls WildFiles for approval with 5s interval.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 14:33:35 +02:00