Embedding Models

What are Embeddings?

Content in your knowledge bases is automatically converted into vector embeddings - numerical representations that capture semantic meaning. When you search, your query is embedded the same way, and similar content is retrieved by comparing the query vector against stored vectors using cosine similarity.
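The core comparison can be sketched in a few lines. The vectors below are toy 3-dimensional values for illustration only - real embedding models produce hundreds or thousands of dimensions:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """dot(a, b) / (|a| * |b|): 1.0 means same direction, near 0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" (purely illustrative values, not model output):
query = [0.1, 0.9, 0.2]
relevant_chunk = [0.15, 0.85, 0.25]   # points in nearly the same direction
unrelated_chunk = [0.9, 0.05, 0.1]    # points elsewhere

print(cosine_similarity(query, relevant_chunk))   # close to 1.0
print(cosine_similarity(query, unrelated_chunk))  # much lower
```

The chunk whose vector points in nearly the same direction as the query scores close to 1.0, which is why it ranks first in retrieval.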

Supported Models

Seclai supports a wide range of embedding models from multiple providers:

OpenAI Models

  • text-embedding-3-large - Up to 3072 dimensions (supports 256, 384, 512, 768, 1024, 1536, 2048, 2560, 3072)
  • text-embedding-3-small - Up to 1536 dimensions (supports 256, 384, 512, 768, 1024, 1536)
  • text-embedding-ada-002 - 1536 dimensions

Google Vertex AI Models

  • gemini-embedding-001 - Up to 3072 dimensions (supports 256, 384, 512, 768, 1024, 1536, 2048, 2560, 3072)
  • text-embedding-005 - Up to 768 dimensions (supports 256, 384, 512, 768)
  • text-multilingual-embedding-002 - Up to 768 dimensions (supports 256, 384, 512, 768)
  • e5-large - 1024 dimensions
  • e5-small - 256 dimensions

AWS Bedrock Models

  • amazon.nova-2-multimodal-embeddings-v1:0 - Up to 3072 dimensions (supports 256, 384, 1024, 3072)
  • amazon.titan-embed-text-v2:0 - Up to 1024 dimensions (supports 256, 512, 1024)
  • cohere.embed-v4:0 - Up to 1536 dimensions (supports 256, 512, 1024, 1536)

Choosing a Model

When selecting an embedding model, consider:

  • Dimensions: Higher dimensions can capture more nuance but increase storage and compute costs
  • Language support: Some models, such as text-multilingual-embedding-002, are optimized for multiple languages
  • Provider: Choose based on your existing cloud infrastructure and API preferences
  • Performance: Different models have different speed vs. quality tradeoffs

Configurable Dimensions

Many models support multiple dimension sizes. Lower dimensions reduce:

  • Storage costs
  • Query latency
  • Indexing time

The tradeoff is a potential loss of some semantic precision.
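The storage side of this tradeoff is simple arithmetic. A rough sketch, assuming float32 vectors (4 bytes per value) and ignoring index overhead and metadata:

```python
def vector_storage_gb(num_chunks: int, dimensions: int, bytes_per_value: int = 4) -> float:
    """Raw storage for float32 embeddings; real indexes add per-vector overhead."""
    return num_chunks * dimensions * bytes_per_value / 1e9

# One million chunks at full vs. reduced dimensions:
print(vector_storage_gb(1_000_000, 3072))  # ~12.3 GB
print(vector_storage_gb(1_000_000, 256))   # ~1.0 GB
```

Dropping from 3072 to 256 dimensions cuts raw vector storage by 12x, which also shrinks the amount of data scanned per query.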

Reranking

After the initial similarity search using embeddings, you can optionally apply a reranking model to improve the relevance and quality of results. Reranking re-evaluates the retrieved content and reorders it based on deeper semantic understanding of the query.

How Reranking Works

The retrieval pipeline works in two stages:

  1. Initial Retrieval - Vector similarity search finds potentially relevant content quickly using cosine similarity
  2. Reranking (Optional) - A specialized model re-scores the top results for better relevance

Reranking models are more computationally expensive than embedding similarity, so they're applied only to the top candidates from the initial search.
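The two-stage pipeline above can be sketched as follows. Everything here is a stand-in: the vectors are toy values, and the "reranker" is simple word overlap, whereas a real reranking model scores query-document pairs with a neural network:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def initial_retrieval(query_vec, corpus, top_k):
    """Stage 1: cheap vector similarity over the whole corpus."""
    scored = sorted(((cosine(query_vec, vec), doc) for doc, vec in corpus), reverse=True)
    return [doc for _, doc in scored[:top_k]]

def rerank(query, candidates, score_fn):
    """Stage 2: a more expensive model re-scores only the top candidates."""
    return sorted(candidates, key=lambda doc: score_fn(query, doc), reverse=True)

def word_overlap(query, doc):
    # Toy stand-in for a reranking model's relevance score.
    return len(set(query.lower().split()) & set(doc.lower().split()))

corpus = [
    ("how to reset a password", [0.9, 0.1, 0.0]),
    ("password policy for admins", [0.85, 0.15, 0.05]),
    ("holiday schedule", [0.0, 0.1, 0.9]),
]
query = "how do I reset my password"
query_vec = [0.85, 0.15, 0.05]  # toy embedding of the query

candidates = initial_retrieval(query_vec, corpus, top_k=2)
results = rerank(query, candidates, word_overlap)
print(results)  # the reset article moves to the top
```

Note that the expensive scoring function only ever sees the top_k candidates, never the full corpus - that is what keeps reranking affordable.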

Supported Reranking Models

Seclai supports several reranking models:

AWS Bedrock Models:

  • amazon.rerank-v1:0 - Amazon's reranking model (default)
  • cohere.reranker-v3.5:0 - Cohere's high-performance reranker

When to Use Reranking

Consider enabling reranking when:

  • Precision is critical - You need the most relevant results at the top
  • Semantic nuance matters - Your queries require deep understanding of context
  • Large knowledge bases - With more content, the initial similarity search is more likely to bury subtly relevant results

Reranking is optional and can be configured per knowledge base. If not enabled, results are ranked solely by embedding similarity scores.