1. Text bot can now capture video frames from the active call when the user
types a vision-related query (e.g. "siehst du meinen bildschirm", German for
"can you see my screen")
2. Voice transcript is injected into the text bot's context during active calls
3. Text messages are injected into the voice transcript with a [typed in chat] prefix
4. Bot text replies are injected back into the voice transcript
Together, these changes share context in both directions between voice calls and text chat.
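
For illustration, the flow described above could look roughly like the sketch below. It is not the code from this commit: `SharedCallContext`, `is_vision_query`, the `bot:` speaker label, the 50-entry cap, and the second trigger phrase are all hypothetical names and values chosen for the example. Only the `[typed in chat]` prefix and the German trigger phrase come from the change list.

```python
from collections import deque

# Hypothetical trigger list; only the first phrase appears in the change list.
VISION_TRIGGERS = ("siehst du meinen bildschirm", "can you see my screen")


def is_vision_query(text: str) -> bool:
    """Return True if a typed message contains one of the trigger phrases."""
    lowered = text.lower()
    return any(trigger in lowered for trigger in VISION_TRIGGERS)


class SharedCallContext:
    """Hypothetical per-call buffer read by both the voice and the text side."""

    def __init__(self, max_entries: int = 50) -> None:
        # Bounded buffer (assumed): the oldest lines drop off automatically.
        self.transcript = deque(maxlen=max_entries)

    def add_voice(self, speaker: str, text: str) -> None:
        # Spoken utterance from the call transcript.
        self.transcript.append(f"{speaker}: {text}")

    def add_typed(self, speaker: str, text: str) -> None:
        # Typed message, marked with the prefix named in the change list.
        self.transcript.append(f"{speaker}: [typed in chat] {text}")

    def add_bot_reply(self, text: str) -> None:
        # Bot text reply fed back into the transcript ("bot:" label is assumed).
        self.transcript.append(f"bot: {text}")

    def as_text_context(self) -> str:
        # Flattened transcript, suitable for prepending to the text bot's prompt.
        return "\n".join(self.transcript)
```

In this sketch the text bot would call `is_vision_query` on each typed message during a call and, on a match, request a frame from the call before answering. How the frame is actually captured is outside what the change list describes, so it is left out here.
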
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>