Embedding Models
What are Embeddings?
Content in your knowledge bases is automatically converted into vector embeddings - numerical representations that capture semantic meaning. When you search, your query is also embedded, and similar content is found by comparing vector distances using cosine similarity.
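The comparison step can be sketched in a few lines. This is illustrative only (Seclai performs this internally); the 3-dimensional vectors stand in for real model output, which has hundreds or thousands of dimensions.

```python
# Sketch of similarity search over pre-computed embeddings.
import math

def cosine_similarity(a, b):
    """Cosine similarity: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings" standing in for real model output.
documents = {
    "doc_a": [0.9, 0.1, 0.0],
    "doc_b": [0.0, 0.2, 0.9],
}
query = [1.0, 0.0, 0.1]

# Rank documents by similarity to the query embedding.
ranked = sorted(documents,
                key=lambda d: cosine_similarity(query, documents[d]),
                reverse=True)
print(ranked[0])  # doc_a points in nearly the same direction as the query
```

Cosine similarity ranges from -1 to 1; content whose embedding points in the same direction as the query embedding scores closest to 1.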
Supported Models
Seclai supports a wide range of embedding models from multiple providers:
OpenAI Models
- text-embedding-3-large - Up to 3072 dimensions (supports 256, 384, 512, 768, 1024, 1536, 2048, 2560, 3072)
- text-embedding-3-small - Up to 1536 dimensions (supports 256, 384, 512, 768, 1024, 1536)
- text-embedding-ada-002 - 1536 dimensions
Google Vertex AI Models
- gemini-embedding-001 - Up to 3072 dimensions (supports 256, 384, 512, 768, 1024, 1536, 2048, 2560, 3072)
- text-embedding-005 - Up to 768 dimensions (supports 256, 384, 512, 768)
- text-multilingual-embedding-002 - Up to 768 dimensions (supports 256, 384, 512, 768)
- e5-large - 1024 dimensions
- e5-small - 256 dimensions
AWS Bedrock Models
- amazon.nova-2-multimodal-embeddings-v1:0 - Up to 3072 dimensions (supports 256, 384, 1024, 3072)
- amazon.titan-embed-text-v2:0 - Up to 1024 dimensions (supports 256, 512, 1024)
- cohere.embed-v4:0 - Up to 1536 dimensions (supports 256, 512, 1024, 1536)
Choosing a Model
When selecting an embedding model, consider:
- Dimensions: Higher dimensions can capture more nuance but increase storage and compute costs
- Language support: Some models like multilingual-embedding-002 are optimized for multiple languages
- Provider: Choose based on your existing cloud infrastructure and API preferences
- Performance: Different models have different speed vs. quality tradeoffs
Configurable Dimensions
Many models support multiple dimension sizes. Choosing a lower dimension reduces:
- Storage costs
- Query latency
- Indexing time
The tradeoff is a potential loss of some semantic precision.
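Providers that accept a dimensions parameter shorten the vector server-side. Conceptually this is truncation plus renormalization, as a sketch (this is the approach OpenAI documents for its text-embedding-3 models; whether client-side truncation is valid for other models is model-specific):

```python
# Sketch: shorten an embedding by keeping a prefix of its components,
# then renormalizing to unit length so cosine comparisons remain valid.
import math

def shorten(embedding, dims):
    """Keep the first `dims` components, then renormalize."""
    truncated = embedding[:dims]
    norm = math.sqrt(sum(x * x for x in truncated))
    return [x / norm for x in truncated]

full = [0.5, 0.5, 0.5, 0.5]   # stand-in for a full-size embedding
short = shorten(full, 2)
print(short)                  # two components, unit length again
```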
Reranking
After the initial similarity search using embeddings, you can optionally apply a reranking model to improve the relevance and quality of results. Reranking re-evaluates the retrieved content and reorders it based on deeper semantic understanding of the query.
How Reranking Works
The retrieval pipeline works in two stages:
- Initial Retrieval - Vector similarity search finds potentially relevant content quickly using cosine similarity
- Reranking (Optional) - A specialized model re-scores the top results for better relevance
Reranking models are more computationally expensive than embedding similarity, so they're applied only to the top candidates from the initial search.
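The two stages above can be sketched as follows. The `rerank_score` function is a hypothetical stand-in for a call to a real reranking model such as amazon.rerank-v1:0; note that stage 1 scores every document cheaply, while stage 2 touches only the shortlist.

```python
# Sketch of the two-stage retrieval pipeline: cheap vector search over the
# whole index, then an expensive reranker over only the top candidates.

def vector_search(query_vec, index, top_k):
    """Stage 1: rank all documents by embedding similarity (dot product here)."""
    scored = [(doc_id, sum(q * v for q, v in zip(query_vec, vec)))
              for doc_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [doc_id for doc_id, _ in scored[:top_k]]

def rerank(query_text, candidates, rerank_score):
    """Stage 2: re-score only the top candidates with the expensive model."""
    return sorted(candidates,
                  key=lambda doc: rerank_score(query_text, doc),
                  reverse=True)
```

A typical configuration retrieves a few dozen candidates in stage 1 and reranks only those, keeping the expensive model off the critical path for the bulk of the index.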
Supported Reranking Models
Seclai supports several reranking models:
AWS Bedrock Models:
- amazon.rerank-v1:0 - Amazon's reranking model (default)
- cohere.rerank-v3-5:0 - Cohere's high-performance Rerank 3.5 model
When to Use Reranking
Consider enabling reranking when:
- Precision is critical - You need the most relevant results at the top
- Semantic nuance matters - Your queries require deep understanding of context
- Large knowledge bases - With more content, the initial similarity search is more likely to rank subtly relevant results too low
Reranking is optional and can be configured per knowledge base. If not enabled, results are ranked solely by embedding similarity scores.
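As a rough illustration, a per-knowledge-base setup might pair an embedding choice with an optional reranking block. The field names below are hypothetical, not Seclai's actual configuration schema:

```python
# Hypothetical per-knowledge-base configuration (illustrative field names only).
knowledge_base_config = {
    "embedding": {
        "model": "text-embedding-3-large",  # any supported embedding model ID
        "dimensions": 1024,                 # a size the chosen model supports
    },
    "reranking": {
        "enabled": True,                    # omit or disable to rank by
        "model": "amazon.rerank-v1:0",      # embedding similarity alone
    },
}
```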