Migrate embeddings from Gemini to local mxbai-embed-large

Switch from external Gemini API (3072 dims, $0.15/1M tokens) to local
Ollama mxbai-embed-large (1024 dims, free) for cost savings and HNSW
index support.

Changes:
- Updated embeddings.ts: model 'mxbai-embed-large', API URL fixed
- Updated migration 015: vector(1024) with HNSW index
- Regenerated 268 tool_docs embeddings with new model
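The migration change referenced above is not shown in this diff; as a rough sketch, the relevant part of migration 015 could look like the following (table and index names are illustrative assumptions, only vector(1024) and the HNSW index are confirmed by the commit message):

```typescript
// Hypothetical shape of migration 015 (names are illustrative, not from the diff).
export const MIGRATION_015_SQL = `
  ALTER TABLE tool_docs
    ALTER COLUMN embedding TYPE vector(1024);

  -- HNSW is usable here because 1024 is under pgvector's 2000-dim index limit.
  CREATE INDEX IF NOT EXISTS tool_docs_embedding_hnsw
    ON tool_docs USING hnsw (embedding vector_cosine_ops);
`;
```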

Benefits:
- Free embeddings (no API costs)
- HNSW index enabled (1024 < 2000 dim limit)
- Fast similarity search (O(log n) vs O(n))
- No external API dependency
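For reference, the metric the HNSW index approximates (via vector_cosine_ops) is plain cosine similarity over the 1024-dim vectors; a minimal sketch:

```typescript
// Cosine similarity between two equal-length vectors. pgvector's
// vector_cosine_ops ranks by cosine distance, i.e. 1 minus this value.
export function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}
```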

Trade-offs:
- Roughly 5% quality loss on MTEB (64.68 vs ~70 for Gemini)
- Uses local compute (1.2GB RAM, <1s per embedding)

Task: CF-251

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Author: Christian Gick
Date: 2026-01-19 09:40:02 +02:00
Commit: afce0bd3e5 (parent 0aa10d3003)
2 changed files with 5 additions and 4 deletions


@@ -1,6 +1,6 @@
 // Embeddings via LiteLLM API
-const LLM_API_URL = process.env.LLM_API_URL || 'https://llm.agiliton.cloud';
+const LLM_API_URL = process.env.LLM_API_URL || 'https://api.agiliton.cloud/llm';
 const LLM_API_KEY = process.env.LLM_API_KEY || '';
 interface EmbeddingResponse {
@@ -32,7 +32,7 @@ export async function getEmbedding(text: string): Promise<number[] | null> {
       'Content-Type': 'application/json',
     },
     body: JSON.stringify({
-      model: 'text-embedding-ada-002',
+      model: 'mxbai-embed-large',
       input: text,
     }),
   });
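Putting the two hunks together, the getEmbedding helper after this commit might look roughly like the sketch below. Only the model name and base URL are confirmed by the diff; the /embeddings path, the parseEmbedding helper, and the error handling are assumptions:

```typescript
// Sketch of embeddings.ts after this commit (shape assumed from the diff hunks).
const LLM_API_URL = process.env.LLM_API_URL || 'https://api.agiliton.cloud/llm';
const LLM_API_KEY = process.env.LLM_API_KEY || '';

interface EmbeddingResponse {
  data: { embedding: number[] }[];
}

// Pure helper so the OpenAI-style response shape is handled in one place.
export function parseEmbedding(res: EmbeddingResponse): number[] | null {
  return res.data?.[0]?.embedding ?? null;
}

export async function getEmbedding(text: string): Promise<number[] | null> {
  // The /embeddings path is an assumption (LiteLLM's OpenAI-compatible route).
  const res = await fetch(`${LLM_API_URL}/embeddings`, {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${LLM_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ model: 'mxbai-embed-large', input: text }),
  });
  if (!res.ok) return null;
  return parseEmbedding((await res.json()) as EmbeddingResponse);
}
```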