Migrate embeddings from Gemini to local mxbai-embed-large

Switch from external Gemini API (3072 dims, $0.15/1M tokens) to local
Ollama mxbai-embed-large (1024 dims, free) for cost savings and HNSW
index support.

Changes:
- Updated embeddings.ts: model 'mxbai-embed-large', API URL fixed
- Updated migration 015: vector(1024) with HNSW index
- Regenerated 268 tool_docs embeddings with new model
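The migration change referenced above is not shown in this diff; as a rough sketch, the relevant part of migration 015 could look like the following (table and index names are illustrative assumptions, only vector(1024) and the HNSW index are confirmed by the commit message):

```typescript
// Hypothetical shape of migration 015 (names are illustrative, not from the diff).
export const MIGRATION_015_SQL = `
  ALTER TABLE tool_docs
    ALTER COLUMN embedding TYPE vector(1024);

  -- HNSW is usable here because 1024 is under pgvector's 2000-dim index limit.
  CREATE INDEX IF NOT EXISTS tool_docs_embedding_hnsw
    ON tool_docs USING hnsw (embedding vector_cosine_ops);
`;
```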

Benefits:
- Free embeddings (no API costs)
- HNSW index enabled (1024 < 2000 dim limit)
- Fast similarity search (O(log n) vs O(n))
- No external API dependency
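For reference, the metric the HNSW index approximates (via vector_cosine_ops) is plain cosine similarity over the 1024-dim vectors; a minimal sketch:

```typescript
// Cosine similarity between two equal-length vectors. pgvector's
// vector_cosine_ops ranks by cosine distance, i.e. 1 minus this value.
export function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}
```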

Trade-offs:
- Roughly 5% quality loss on MTEB (64.68 vs ~70 for Gemini)
- Uses local compute (1.2GB RAM, <1s per embedding)

Task: CF-251

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Author: Christian Gick
Date: 2026-01-19 09:40:02 +02:00
Commit: afce0bd3e5 (parent 0aa10d3003)
2 changed files with 5 additions and 4 deletions


@@ -1,6 +1,6 @@
 // Embeddings via LiteLLM API
-const LLM_API_URL = process.env.LLM_API_URL || 'https://llm.agiliton.cloud';
+const LLM_API_URL = process.env.LLM_API_URL || 'https://api.agiliton.cloud/llm';
 const LLM_API_KEY = process.env.LLM_API_KEY || '';
 interface EmbeddingResponse {
@@ -32,7 +32,7 @@ export async function getEmbedding(text: string): Promise<number[] | null> {
       'Content-Type': 'application/json',
     },
     body: JSON.stringify({
-      model: 'text-embedding-ada-002',
+      model: 'mxbai-embed-large',
       input: text,
     }),
   });
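Putting the two hunks together, the getEmbedding helper after this commit might look roughly like the sketch below. Only the model name and base URL are confirmed by the diff; the /embeddings path, the parseEmbedding helper, and the error handling are assumptions:

```typescript
// Sketch of embeddings.ts after this commit (shape assumed from the diff hunks).
const LLM_API_URL = process.env.LLM_API_URL || 'https://api.agiliton.cloud/llm';
const LLM_API_KEY = process.env.LLM_API_KEY || '';

interface EmbeddingResponse {
  data: { embedding: number[] }[];
}

// Pure helper so the OpenAI-style response shape is handled in one place.
export function parseEmbedding(res: EmbeddingResponse): number[] | null {
  return res.data?.[0]?.embedding ?? null;
}

export async function getEmbedding(text: string): Promise<number[] | null> {
  // The /embeddings path is an assumption (LiteLLM's OpenAI-compatible route).
  const res = await fetch(`${LLM_API_URL}/embeddings`, {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${LLM_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ model: 'mxbai-embed-large', input: text }),
  });
  if (!res.ok) return null;
  return parseEmbedding((await res.json()) as EmbeddingResponse);
}
```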