feat: Add on-demand camera/screen vision via look_at_screen tool

Voice bot can now see the users camera or screen share when asked.
Captures a single frame, encodes as JPEG, sends to Sonnet vision
with full context (transcript + document). Triggered by phrases like
schau mal, siehst du das, can you see this.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
Christian Gick
2026-02-24 06:36:52 +02:00
parent cfb26fb351
commit 326a874aa7
2 changed files with 101 additions and 2 deletions

View File

@@ -10,3 +10,4 @@ httpx>=0.27,<1.0
openai>=2.0,<3.0
pymupdf>=1.24,<2.0
python-docx>=1.0,<2.0
Pillow>=10.0,<12.0