Glossary

The following terms are frequently used when working with generative AI, similarity-based search, and cloud-based services.

Agent

An agent is a series of steps performed on each run, typically preparing or transforming input, retrieving information from a vector database, and calling an LLM to generate a response.

Agents can be triggered by dynamic input, scheduled to run periodically, or triggered when content is added or updated in a knowledge base. See Agents for an overview and Agent Steps for a full reference of all step types.

Compaction

Compaction prevents memory bank data from growing without bound. A compaction prompt tells the system how to summarize entries when a configured threshold (max age, max size, or max turns) is exceeded. See Memory Banks → Compaction.

Content Source

A content source represents a source of data that is indexed in a vector database as embeddings optimized for RAG use cases. Memory banks also use a system-managed content source to store embedded entries.

Sources like RSS feeds can be set up to automatically poll for new content, and configured to expire content after a certain amount of time, keeping storage costs in check for content that does not need indefinite availability. See Content Sources for details.

Conversation Memory

Conversation memory gives agents the ability to remember information across runs using a conversation-type memory bank. Each entry records a speaker (who said what) and is partitioned by a key. Three unified step types are available: Write Memory, Search Memory, and Load Memory. See Agent Steps → Memory.

Credit

Providers of embedding models and LLMs typically charge for usage by the number of tokens processed. We offer a wide variety of models from many providers, and converting each model's token usage to credits provides a unified way to understand usage and cost impact. We could have skipped tokens and translated directly to dollars, but the smallest amounts would be tiny fractions of a dollar, for example $0.00003214. Our credits are defined so that most usage rates fall within 0.1 to 1,000 credits per unit. See Dashboard → Credit Usage to monitor credit consumption.

Embedding

An embedding model translates a given input, such as a piece of text, into a vector in a high-dimensional space, such that similar inputs end up with similar coordinates in that space: the more similar the inputs, the smaller the distance between them. Given a collection of text pieces, we can use this property to find (retrieve) the pieces most similar to a given input using a vector database, by sorting on the distance between the embeddings. Memory banks use embedding models to store entries for similarity search — see Memory Banks → Modes for how embedding dimensions are configured.
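The retrieval-by-distance idea can be sketched with toy vectors. Real embeddings have hundreds or thousands of dimensions and are produced by a model; the three-dimensional values here are made up purely for illustration:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity: 1.0 means same direction, 0.0 means orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings" standing in for model output.
collection = {
    "cats are pets": [0.9, 0.1, 0.0],
    "dogs are pets": [0.8, 0.2, 0.1],
    "stock prices":  [0.0, 0.1, 0.9],
}
query = [0.85, 0.15, 0.05]

# Retrieval: sort the collection by similarity to the query
# (equivalently, by ascending distance).
ranked = sorted(
    collection,
    key=lambda text: cosine_similarity(query, collection[text]),
    reverse=True,
)
```

A vector database performs the same sort, but over millions of records with specialized index structures rather than a linear scan.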

General Memory

General memory lets agents store and retrieve standalone knowledge entries that don't follow a conversational structure. Unlike conversation memory, general memory has no speaker and uses an optional key for partitioning. The same three unified step types are available: Write Memory, Search Memory, and Load Memory. See Agent Steps → Memory.

Key (Memory)

A string that partitions entries within a memory bank. For conversation memory banks, the key is required and scopes entries by conversation thread (e.g. {{metadata.user_id}} for per-user memory, project-{{metadata.project_id}} for per-project memory). For general memory banks, the key is optional and groups related entries (e.g. entity-{{metadata.entity_type}}). All search and load operations are automatically scoped to the specified key. See Agent Steps → Key Concepts.
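The `{{metadata.…}}` placeholders above are resolved against run metadata before the key is used. The resolver below is a hypothetical sketch to illustrate the substitution; it mirrors the template syntax from the docs but is not Seclai's implementation:

```python
import re

def resolve_key(template, metadata):
    """Replace {{metadata.<field>}} placeholders with values from metadata."""
    return re.sub(
        r"\{\{metadata\.(\w+)\}\}",
        lambda m: str(metadata[m.group(1)]),
        template,
    )

# A per-project key template resolved for one run.
key = resolve_key("project-{{metadata.project_id}}", {"project_id": "atlas"})
# All search and load operations would then be scoped to this key.
```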

Evaluation

An evaluation is the result of screening a piece of content against a single governance policy. Each evaluation records the confidence score, verdict (pass, flag, or block), and an AI-generated explanation. Evaluations can be reviewed and resolved in the Governance review queue.

Governance

Governance is the subsystem that automatically screens agent outputs and content source items against configurable safety, privacy, and compliance policies. When content triggers a policy, it is flagged or blocked for human review. See Governance.

Governance Policy

A governance policy pairs a policy document (the rule text the AI evaluator checks for) with thresholds (flag and block) and a scope (account-wide, per-agent, per-step, or per-source). Policies can use built-in sample templates or fully custom text.

Insight

An insight step uses an LLM equipped with progressive-disclosure tools to analyze input content that may be too large for the model's context window. Instead of receiving the full input in the prompt, the model is given tools to check the content size, read byte-range slices, and search for patterns — allowing it to scan large documents, feeds, or data dumps incrementally and produce a summary, structured extraction, or other analysis. See Agent Steps → Insight.
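The progressive-disclosure idea can be sketched as a small tool surface over the raw content. The class and method names below are illustrative assumptions, not Seclai's actual tool API:

```python
import re

class ContentTools:
    """Hypothetical tools a model could call instead of receiving the
    full content in its prompt."""

    def __init__(self, content: bytes):
        self._content = content

    def size(self) -> int:
        # Let the model check how large the input is before reading.
        return len(self._content)

    def read_slice(self, start: int, end: int) -> bytes:
        # Read only a byte range, keeping each read small.
        return self._content[start:end]

    def search(self, pattern: str) -> list:
        # Return byte offsets of matches so the model can jump to them.
        return [m.start() for m in re.finditer(pattern.encode(), self._content)]

# The model might first call size(), then search for a marker, then read
# small slices around each match instead of loading the whole document.
tools = ContentTools(b"ok ok ERROR ok ERROR")
offsets = tools.search("ERROR")
```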

Knowledge Base

A knowledge base is composed of one or more content sources, and can be searched for similarity. Agents can also be triggered when content is added or updated in a knowledge base.

Adding a content source to a knowledge base is instantaneous regardless of the amount of content in the content source. The same content source can be associated with multiple knowledge bases at the same time, allowing different knowledge bases to serve different purposes with overlapping content, without importing that content more than once. See Knowledge Bases for details.

LLM (Large Language Model)

Large Language Models are AI models that have been trained on a wide variety of information such that given an input, they will generate output that correspondingly answers questions, summarizes information, or creates new content. In addition to text input and output, some models are also capable of receiving and producing content in image, audio or video format. See Agent Steps → Prompt Call for how LLMs are configured in agent workflows.

MCP (Model Context Protocol)

The Model Context Protocol is an open standard and open-source framework introduced by Anthropic in November 2024 to standardize the way artificial intelligence systems like large language models integrate and share data with external tools, systems, and data sources. Seclai exposes an MCP server that AI coding assistants can use to manage agents, knowledge bases, memory banks, and more. See MCP Server for setup instructions.

Memory Bank

A memory bank gives agents persistent memory across runs. Each bank stores embedded entries that agents can write to, search, and load. There are two types: Conversation (dialogue turns with a speaker, partitioned by key) and General (standalone knowledge, optionally partitioned by key). Both types use the same unified step types — Write Memory, Search Memory, and Load Memory. Banks control their own embedding dimensions, compaction, and retention settings. See Memory Banks for full documentation.

Organization

Organizations allow agents, content sources and knowledge bases to be shared, managed, and used by a set of team members who can be added and removed from the organization by members with administrative rights. See Organizations for details.

Prompt Call

A prompt call sends input to an LLM model and returns the response for further processing or display. See Agent Steps → Prompt Call.

RAG (Retrieval Augmented Generation)

Retrieval augmented generation is a technique that supplements the information an LLM was trained on with additional context, such as information that did not exist or was not available when the model was trained, for example current news or proprietary data. Adding such context to a prompt call is an effective way to prevent the model from generating a response with outdated information or hallucinations.

The steps involved are:

  1. Add information as vector embeddings in a database.
  2. Retrieve content similar to a given input from the database.
  3. Optionally rerank the retrieved content.
  4. Pass the retrieved information to an LLM to combine the information into a coherent response.

See Agent Steps → Retrieval for how retrieval is configured and Knowledge Bases for managing the underlying data.
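The four steps can be sketched end to end with toy stand-ins. The `embed`, `retrieve`, `rerank`, and `generate` functions below are illustrative placeholders, not Seclai's API; a real pipeline would use an embedding model, a vector database, and an LLM call:

```python
def embed(text):
    # Toy "embedding": a bag of words. Step 1 in a real system stores
    # proper vector embeddings in a database.
    return set(text.lower().split())

def retrieve(query, documents, k=2):
    # Step 2: fetch the documents most similar to the query.
    scored = sorted(documents,
                    key=lambda d: len(embed(query) & embed(d)),
                    reverse=True)
    return scored[:k]

def rerank(query, candidates):
    # Step 3 (optional): re-score the initial candidates more carefully.
    return sorted(candidates,
                  key=lambda d: len(embed(query) & embed(d)),
                  reverse=True)

def generate(query, context):
    # Step 4: in a real pipeline this is a prompt call to an LLM with
    # the retrieved context included in the prompt.
    return f"Answering {query!r} using context: {context}"

docs = [
    "the cat sat on the mat",
    "stocks rose sharply today",
    "a cat chased the dog",
]
query = "where did the cat sit"
context = rerank(query, retrieve(query, docs))
answer = generate(query, context)
```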

Reranking

Reranking improves the relevance and quality of retrieval results by re-evaluating and reordering the initial results based on their relevance to the search input. See Agent Steps → Retrieval for how reranking is configured.

Retention

Retention controls how long memory bank entries are kept before automatic deletion. Options include preset durations (1 week, 1 month, 3 months, 1 year), a custom number of days, or indefinite retention. See Memory Banks → Retention.

RSS (Really Simple Syndication)

RSS is a data format that makes it easy to publish information about frequently updated content, such as news sites, blogs, and podcasts. See Sources for how to add RSS feeds.

Screening Point

A screening point is a location in the processing pipeline where governance evaluations are triggered. The four screening points are: source content (during content import), agent input (before processing), step output (after an agent step completes), and policy test (manual ad-hoc testing).

Solution

A solution is a high-level container that groups related agents, knowledge bases, memory banks, and content sources into a cohesive unit. Solutions make it easy to manage all the resources that work together to address a particular use case, and they support built-in AI assistants that can propose configuration changes following a propose-then-accept workflow. See Solutions for details.

Temperature

Temperature controls the randomness of an LLM's response. Given the same input, a temperature of 0.0 is expected to generate the most predictable output and 1.0 the most varied, with intermediate values such as 0.3 closer in randomness to 0.0 than to 1.0. See Agent Steps → Prompt Call for how temperature is configured.
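Under the hood, temperature typically scales the model's output logits before sampling, which is why low values sharpen the distribution and high values flatten it. A minimal sketch of that mechanism (a temperature of exactly 0.0 is usually special-cased as greedy argmax, since dividing by zero is undefined):

```python
import math

def softmax_with_temperature(logits, temperature):
    """Softmax over logits scaled by 1/temperature: lower temperature
    sharpens the distribution (more predictable), higher flattens it
    (more varied)."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
cold = softmax_with_temperature(logits, 0.1)  # near-deterministic
warm = softmax_with_temperature(logits, 1.0)  # more spread out
```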

Token

Large language models convert input into tokens through a process called tokenization. Some LLMs share the same tokenization algorithm, but tokenizers tend to differ from provider to provider, and even between different generations of models from the same provider. Tokenizers typically break longer words into subword tokens. See Dashboard → Credit Usage for how token consumption maps to credits.
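Subword splitting can be illustrated with a greedy longest-match tokenizer. This is a deliberate simplification of schemes like byte-pair encoding; the vocabulary here is made up, and real tokenizers are learned from data and vary by model:

```python
def tokenize(word, vocab):
    """Greedily take the longest vocabulary entry at each position,
    falling back to single characters for unknown spans."""
    tokens = []
    i = 0
    while i < len(word):
        for j in range(len(word), i, -1):
            piece = word[i:j]
            if piece in vocab or j == i + 1:
                tokens.append(piece)
                i = j
                break
    return tokens

vocab = {"token", "ization", "un", "break", "able"}
tokens = tokenize("tokenization", vocab)  # two tokens for one word
```

This is why token counts differ between models: a word that is one token in one vocabulary may be several in another.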

Tool Call

A growing number of LLMs support a list of optional tools that the model may call while generating a response, such as searching the web, editing a file, or even running code.

Consider a prompt that asks what the latest Premier League results are. Without tool calls, the LLM will either say it doesn't know, or hallucinate an answer. If we pass in an option that enables the web search tool, the LLM will likely try to search for the answer and generate a summary based on the search result instead. See Agent Steps → Tools and Tools for available tools.

Vector

In math, a vector is a geometric object that points from point A to point B. In the context of RAG, point A is always implied to be the origin (0, ..., 0), while a given array of numbers always represents point B.

A vector database is a data store that can efficiently hold many millions of vector records, yet quickly return the records with the closest distance to a given vector.

See Knowledge Bases for how vector databases power retrieval in Seclai.

Verdict

A verdict is the outcome of a governance evaluation: pass (content proceeds normally), flag (content proceeds but is queued for human review), or block (content is withheld until resolved). The verdict is determined by comparing the evaluator's confidence score against the policy's flag and block thresholds.
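The threshold comparison can be sketched as follows. This is a hypothetical reading of how a confidence score could map to a verdict; the exact comparison rules in the product may differ:

```python
def verdict(confidence, flag_threshold, block_threshold):
    """Map an evaluator's confidence score to a verdict using the
    policy's flag and block thresholds (illustrative logic only)."""
    if confidence >= block_threshold:
        return "block"  # content withheld until resolved
    if confidence >= flag_threshold:
        return "flag"   # content proceeds but is queued for review
    return "pass"       # content proceeds normally

verdict(0.95, flag_threshold=0.6, block_threshold=0.9)
```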