Memory Steps
For common fields, string substitutions, metadata filters, caching, and execution order, see the Agent Steps Overview.
Memory
Memory gives agents the ability to remember information across runs. Unlike retrieval (which searches external knowledge bases), memory is a system-managed memory store that agents write to and read from automatically. Each memory bank controls its own embedding, compaction, and retention settings — create and manage them from the Memory Banks page.
How It Works
Each memory bank has its own content source that stores embedded entries — chunked, vectorized, and ready for similarity search — just like any other knowledge base content. The source is provisioned automatically when the bank is created.
Each memory entry is partitioned by a key (e.g. {{metadata.user_id}} for per-user memory, or a custom key for per-project or per-session scoping). This lets a single memory store serve many independent groupings without interference.
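As an illustrative sketch (a toy in-memory model, not the real embedded store), key partitioning means that writes and reads under different keys never mix:

```python
# Illustrative sketch only: a toy in-memory store showing how a key
# partitions entries. The real memory bank chunks and embeds entries;
# this models only the scoping behavior described above.
from collections import defaultdict

store = defaultdict(list)  # key -> list of entries


def add_memory(key, content):
    store[key].append(content)


def load_memory(key):
    # Reads are scoped to the given key; other keys don't interfere.
    return list(store[key])


add_memory("user-42", "prefers metric units")
add_memory("user-99", "prefers imperial units")

assert load_memory("user-42") == ["prefers metric units"]
```

The same mechanics apply whether the key is a user ID, a project ID, or any custom string: each key gets its own independent slice of the bank.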
There are two memory bank types — but the same three unified step types work with both:
- Conversation banks — Designed for chat-style use cases. Entries are tagged with a speaker ("user" or "assistant"). The speaker field is required when writing to a conversation bank.
- General banks — Designed for facts, preferences, entity data, and standalone knowledge. No speaker concept — the speaker field is not used when writing to a general bank.
The bank type is determined automatically from the memory_bank_id — you don't need to choose different step types for different bank types.
The Three Memory Steps
| Step | Purpose | Best For |
|---|---|---|
| Add Memory | Write a new entry to a memory bank | Storing facts, preferences, decisions, summaries |
| Search Memory | Vector similarity search across entries | Recalling relevant past context before a prompt call |
| Load Memory | Load all entries chronologically | Summarization, timeline review, full-context prompting |
Use Search when you want the most semantically relevant entries for a query. Use Load when you need the complete history in order (e.g. to summarize or compact it).
Key Concepts
- Key — A string that partitions memory entries. Use {{metadata.user_id}} for per-user memory, project-{{metadata.project_id}} for per-project memory, or any custom key. All search and load operations are automatically scoped to the specified key.
- Speaker (conversation banks only) — Each entry records who said what ("user" or "assistant"). Required when writing to a conversation bank; omitted for general banks.
- Metadata — Optional key-value pairs attached to memory entries. Metadata values are stored alongside the entry and can be used for filtering in search steps.
- Memory banks — Create and manage memory banks from the Memory Banks page or via the Public API and MCP tools. Each bank controls its embedding model, compaction prompt, and retention settings.
Common Patterns
Remember and Recall — The most common pattern. Search memory at the start of a workflow to inject past context, generate a response, then extract and store new facts for the next run.
Step 1: Search Memory (key: "{{metadata.user_id}}", query: "{{input}}")
Step 2: Prompt Call — "Past context:\n{{input}}\n\nUser question: {{agent.input}}"
Step 3: Prompt Call — "Extract new facts from this exchange"
Step 4: Add Memory (key: "{{metadata.user_id}}", speaker: "assistant")
Step 5: Streaming Result
Progressive Summarization — Periodically load the full history, summarize it, and store the summary as a compacted entry. This keeps memory manageable as conversations grow.
Step 1: Load Memory (key: "{{metadata.user_id}}", order: "oldest_first")
Step 2: Prompt Call — "Summarize this conversation history into key points"
Step 3: Add Memory (key: "{{metadata.user_id}}", speaker: "assistant")
Preference Learning — Extract user preferences from conversations and persist them for future personalization.
Step 1: Insight — "Extract any user preferences from: {{input}}"
Step 2: Add Memory (key: "{{metadata.user_id}}")
Combined pattern (personalised chatbot — uses BOTH bank types):
The Add Memory step on a conversation bank already returns the conversation history as a messages array (see Conversation Bank Output Format), so there's no need for a Load Memory step after it — the history flows directly into the Prompt Call.
Step 1: Search Memory (general bank, key: "{{metadata.user_id}}") — recall preferences
Step 2: Add Memory (conversation bank, key: "{{metadata.user_id}}", speaker: "user") — save user msg → returns history
Step 3: Prompt Call — generate a personalised response using preferences + history
Step 4a: Streaming Result
Step 4b: Add Memory (conversation bank, key: "{{metadata.user_id}}", speaker: "assistant") — save reply
Step 4c: Add Memory (general bank, key: "{{metadata.user_id}}") — store new preferences
Memory vs. Retrieval
| | Memory | Retrieval |
|---|---|---|
| Data source | Agent-written entries (automatic) | External content from sources (RSS, websites, uploads) |
| Setup | None — content source provisioned at bank creation | Requires source + knowledge base configuration |
| Scoping | Partitioned by key | Filtered by knowledge base and metadata |
| Write method | Add Memory step | Source connections ingest content |
| Read method | Search Memory or Load Memory steps | Retrieval step |
| Best for | Per-user/session context, preferences, facts | Shared organizational knowledge, documentation |
Add Memory Step
Writes content to a memory bank. Each memory entry is embedded for vector search and partitioned by key, allowing agents to remember information across runs (e.g. user preferences, past decisions, accumulated facts). For conversation-type banks, the speaker field is also required.
Fields
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| memory_bank_id | UUID | Yes | — | The memory bank to write to. Select one from the Memory Banks page. |
| key | string | Yes | — | A key that partitions memory entries (e.g. per user, per session). Supports substitutions — use {{metadata.user_id}} for per-user memory. |
| speaker | string | No | null | Identifies who is "speaking" — "user" or "assistant". Required for conversation banks, omitted for general banks. Supports substitutions. |
| content | string | No | null | The content to store in memory. If omitted, the step's input is used (equivalent to {{input}}). Supports substitutions. |
| entry_metadata | object | No | null | Optional metadata object to attach to the entry. Stored as key-value pairs and available for filtering. |
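A hedged sketch of what an Add Memory configuration could look like, expressed as a plain dictionary. The field names come from the table above; the "type" discriminator and the overall wrapper shape are assumptions for illustration only, not the platform's actual schema:

```python
# Hypothetical step configuration using the field names from the table
# above. The "type" key and the dict wrapper are assumptions for
# illustration; only the documented fields are taken from the docs.
add_memory_step = {
    "type": "add_memory",                  # assumed discriminator
    "memory_bank_id": "11111111-1111-1111-1111-111111111111",
    "key": "{{metadata.user_id}}",         # per-user partition
    "speaker": "user",                     # required for conversation banks
    "content": "{{agent.input}}",          # defaults to {{input}} if omitted
    "entry_metadata": {"channel": "web"},  # optional, filterable later
}

# Per the table, only memory_bank_id and key are always required.
required = {"memory_bank_id", "key"}
assert required <= add_memory_step.keys()
```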
Behavior
- The memory bank's content source is provisioned when the bank is created — no manual setup required.
- Memory entries are stored as content versions, chunked, and embedded for vector similarity search.
- The key is stored as metadata on each entry, allowing the search step to filter by key automatically.
- For conversation banks, the speaker field is stored as metadata, enabling you to distinguish who said what when loading or searching conversation history.
- The step's input is passed through as-is to child steps (the memory write is a side effect).
- Agent ID and agent run ID are automatically included in the entry's metadata for traceability.
Conversation Bank Output Format
When writing to a conversation bank, the Add Memory step returns the updated conversation history formatted as a JSON messages array — up to 100 entries in chronological order. The actual entries depend on the bank's compaction settings: compaction consolidates older entries into summaries, so the history you get back is the compacted version rather than raw individual messages:
[
{ "role": "user", "content": "What's the weather like?" },
{ "role": "assistant", "content": "It's sunny and 22°C today." },
{ "role": "user", "content": "Should I bring a jacket?" }
]
This format is directly compatible with a downstream Prompt Call that has use_step_input_as_messages: true (see Input as Messages). Each message is sent as a separate element in the LLM API call, which enables prompt caching on supported providers — the model caches the stable prefix of older messages and only processes the newest ones, reducing latency and cost on every turn.
Tip: For general banks, the step is a pass-through — the written content is returned as-is to child steps.
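To make the hand-off concrete, here is a minimal sketch of what a downstream prompt call does with this array when use_step_input_as_messages is enabled. The call_llm function is a hypothetical stand-in for the provider API:

```python
# Sketch of how the returned messages array maps onto a chat-style LLM
# call. call_llm is a hypothetical stand-in for whatever provider API
# the prompt call uses; only the messages shape comes from the docs.
import json

step_input = '''[
  {"role": "user", "content": "What's the weather like?"},
  {"role": "assistant", "content": "It's sunny and 22°C today."},
  {"role": "user", "content": "Should I bring a jacket?"}
]'''

messages = json.loads(step_input)


def call_llm(system, messages):
    # Each message is sent as a separate element, so the stable prefix
    # of older messages can be cached by supporting providers.
    return {"system": system, "messages": messages}


request = call_llm("You are a helpful assistant.", messages)
assert [m["role"] for m in request["messages"]] == ["user", "assistant", "user"]
```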
Use Case Examples
Save a conversation turn (conversation bank):
Step 1: Add Memory (key: "{{metadata.user_id}}", speaker: "user", content: "{{agent.input}}")
Chatbot pattern — add user message, respond, save assistant reply:
The most common conversational pattern. The Add Memory step saves the user's message and returns the full conversation history as a messages array. A Prompt Call with use_step_input_as_messages: true sends the history directly to the model (enabling prompt caching). The response is then delivered and saved in parallel.
Step 1: Add Memory (conversation bank, key: "{{metadata.user_id}}", speaker: "user", content: "{{agent.input}}")
Step 2: Prompt Call (use_step_input_as_messages: true, system_template: "You are a helpful assistant.")
Step 3a: Streaming Result
Step 3b: Add Memory (conversation bank, key: "{{metadata.user_id}}", speaker: "assistant")
The Add Memory step in step 1 returns the conversation history as:
[
{ "role": "user", "content": "previous message..." },
{ "role": "assistant", "content": "previous reply..." },
{ "role": "user", "content": "current message" }
]
This flows directly into the Prompt Call, which sends each message individually to the model. Older messages form a stable prefix that providers can cache — only the newest message needs full processing.
Store user preferences (general bank):
Step 1: Prompt Call — "Extract any user preferences from: {{input}}"
Step 2: Add Memory (key: "{{metadata.user_id}}", content: "{{input}}")
Accumulate facts across runs:
Step 1: Insight — Extract key facts from the input content
Step 2: Add Memory (key: "project-{{metadata.project_id}}", entry_metadata: {"type": "fact"})
Search Memory Step
Searches a memory bank using vector similarity search. Retrieves memory entries previously written by the Add Memory step, optionally scoped to a key. This allows agents to recall past context — such as user preferences, previous decisions, or accumulated knowledge — and use it in the current run.
Fields
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| memory_bank_id | UUID | Yes | — | The memory bank to search. Select one from the Memory Banks page. |
| key | string | No | null | Optional key to scope the search (must match the key used when writing). For cross-key searches, omit this field. Supports substitutions. |
| query | string | No | null | The query used to search memory entries. If not specified, the output from the previous step is used. Supports substitutions. |
| top_n | integer | No | 10 | The number of top memory entries to retrieve (1–100). |
| added_after | string | No | null | Time filter for entries added after this time. Supports time substitutions ({{1 week ago}}, {{yesterday}}). |
| added_before | string | No | null | Time filter for entries added before this time. Supports time substitutions. |
| filter | object | No | null | Optional metadata filter to further narrow results (e.g. by tags). Uses the same MongoDB-style filter syntax as retrieval steps. |
| content_type | enum | No | application/json | Output format: application/json, text/html, text/plain, or application/xml (XML requests are coerced to JSON, matching retrieval). |
Behavior
- When a key is provided, the search is scoped by injecting {"key": {"$eq": "<key>"}} as a metadata filter. Any user-supplied filter is combined with this using $and.
- Output format matches the retrieval step: JSON returns {"matches": [...]} with content_title, content, content_url, content_publish_date, etc. For application/xml, the response body is still JSON (coerced to match retrieval behavior).
- If no memory entries exist yet (first run), the search returns an empty matches array.
- The memory bank's content source is provisioned when the bank is created.
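The filter-injection rule can be sketched as a small helper. This mirrors only the behavior stated above (key injected as an $eq filter, combined with any user-supplied filter via $and); the platform's actual implementation may differ:

```python
# Sketch of the stated filter-combination rule: the key becomes
# {"key": {"$eq": key}} and is $and-ed with any user-supplied filter.
def effective_filter(key=None, user_filter=None):
    key_filter = {"key": {"$eq": key}} if key is not None else None
    clauses = [f for f in (key_filter, user_filter) if f]
    if len(clauses) == 2:
        return {"$and": clauses}
    return clauses[0] if clauses else None


combined = effective_filter(
    key="project-123",
    user_filter={"category": {"$eq": "decision"}},
)
assert combined == {"$and": [
    {"key": {"$eq": "project-123"}},
    {"category": {"$eq": "decision"}},
]}
```

With no key (a cross-key search) the user filter passes through unchanged, and with no filter at all the search is unscoped.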
Use Case Examples
Recall user preferences before generating a response:
Step 1: Search Memory (key: "{{metadata.user_id}}", query: "user preferences")
Step 2: Prompt Call — "Using these remembered preferences: {{input}}\n\nAnswer: {{agent.input}}"
Search memories with metadata filtering:
Step 1: Search Memory (
key: "project-{{metadata.project_id}}",
query: "{{input}}",
filter: {"category": {"$eq": "decision"}},
top_n: 20
)
Time-bounded memory recall:
Step 1: Search Memory (
key: "{{metadata.user_id}}",
query: "{{input}}",
added_after: "{{1 week ago}}",
top_n: 5
)
Cross-key search (no key):
Step 1: Search Memory (query: "{{input}}", top_n: 10)
Step 2: Prompt Call — "Based on this knowledge: {{input}}\n\nAnswer: {{agent.input}}"
Building a Memory-Augmented Agent
Combine the write and search steps to create agents with persistent memory:
| Pattern | Description |
|---|---|
| Remember & Recall | Search memory at the start, generate a response, then write new facts back to memory |
| Preference Learning | Extract user preferences from conversations and store them for future personalization |
| Progressive Summarization | Periodically summarize accumulated memories and store the summary as a new higher-level memory entry |
| Context Enrichment | Search memory to add relevant past context before a prompt call or retrieval step |
Example: Full memory-augmented Q&A agent:
Step 1: Search Memory (key: "{{metadata.user_id}}", query: "{{input}}")
Step 2: Prompt Call — "Past context:\n{{input}}\n\nUser question: {{agent.input}}\n\nAnswer the question, referencing past context where relevant."
Step 3: Prompt Call — "Extract any new facts or preferences from this exchange:\nQ: {{agent.input}}\nA: {{input}}"
Step 4: Add Memory (key: "{{metadata.user_id}}", content: "{{input}}")
Step 5: Streaming Result
This agent searches memory for relevant past context, generates a response, extracts new facts from the exchange, and stores them for future runs.
Load Memory Step
Loads all memory entries for a key in chronological order. Unlike the Search Memory step (which performs vector similarity search), this step retrieves the full list of entries ordered by creation time. Use this when you need the complete entry set rather than the most semantically relevant matches.
Fields
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| memory_bank_id | UUID | Yes | — | The memory bank to load from. Select one from the Memory Banks page. |
| key | string | Yes | — | The key used to scope the memory load (must match the key used when writing). Supports substitutions. |
| order | enum | No | newest_first | newest_first returns most recent entries first; oldest_first returns earliest entries first. |
| limit | integer | No | 100 | Maximum number of entries to load (1–1000). |
| content_type | enum | No | application/json | Output format: application/json, text/html, text/plain, or application/xml. |
Behavior
- No vector search is performed — entries are retrieved by creation date in the specified order.
- The key is matched exactly against entry metadata.
- JSON output returns {"entries": [...]} with each entry containing content and created_at.
- Text output returns entries separated by --- dividers.
- If no entries exist for the given key, an empty result is returned.
- This is a composite step — it supports child steps that receive the loaded entries as input.
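A minimal sketch of the two output shapes described above. The entry fields are limited to those named in the bullets (content, created_at), and the exact whitespace around the text divider is an assumption:

```python
# Sketch of the two load output shapes: JSON as {"entries": [...]} and
# plain text with "---" dividers. The exact divider whitespace is an
# assumption; the field names come from the behavior notes above.
entries = [
    {"content": "Kickoff notes", "created_at": "2024-01-02T09:00:00Z"},
    {"content": "Decision: ship weekly", "created_at": "2024-01-09T09:00:00Z"},
]

json_output = {"entries": entries}

text_output = "\n---\n".join(e["content"] for e in entries)

assert text_output == "Kickoff notes\n---\nDecision: ship weekly"
```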
Conversation Bank Output Format
When loading from a conversation bank, the Load Memory step auto-detects conversation entries and formats the output as a JSON messages array — the same format used by Add Memory:
[
{ "role": "user", "content": "Hello" },
{ "role": "assistant", "content": "Hi! How can I help?" },
{ "role": "user", "content": "Tell me about prompt caching" }
]
This is directly compatible with a downstream Prompt Call that has use_step_input_as_messages: true (see Input as Messages). Use order: "oldest_first" so the oldest messages form a stable prefix that providers can cache — using newest_first would change the prefix on every turn, defeating the cache.
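A small sketch of why ordering matters for the cache. Providers cache a prompt prefix only when it is identical across turns; with oldest_first a new turn only appends, while newest_first rewrites the front of the list:

```python
# Sketch of prefix stability under the two orderings. With
# oldest_first, a new turn only appends, so the previous history is an
# exact prefix of the new one (cache hit). With newest_first, the new
# turn lands at the front and invalidates the cached prefix.
history = ["msg1", "msg2", "msg3"]
new_turn = "msg4"

oldest_first_before = history
oldest_first_after = history + [new_turn]
# The old list is an exact prefix of the new one -> cacheable.
assert oldest_first_after[:len(oldest_first_before)] == oldest_first_before

newest_first_before = list(reversed(history))
newest_first_after = [new_turn] + newest_first_before
# The first element changed -> the cached prefix no longer matches.
assert newest_first_after[0] != newest_first_before[0]
```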
Tip: For general banks, the output is a JSON array of [{"content", "title"}] objects.
Use Case Examples
Load full history for summarisation:
Step 1: Load Memory (key: "{{metadata.user_id}}", order: "oldest_first")
Step 2: Prompt Call — "Summarise the following conversation history: {{input}}"
Step 3: Add Memory (key: "{{metadata.user_id}}", speaker: "assistant")
Load recent entries for context:
Step 1: Load Memory (key: "{{metadata.user_id}}", order: "newest_first", limit: 10)
Step 2: Prompt Call — "Based on recent context: {{input}}\n\nAnswer: {{agent.input}}"
Export entries as plain text:
Step 1: Load Memory (
key: "project-{{metadata.project_id}}",
order: "oldest_first",
limit: 1000,
content_type: "text/plain"
)
Load all preferences before generating a personalised response:
Step 1: Load Memory (key: "{{metadata.user_id}}", order: "newest_first")
Step 2: Prompt Call — "Given these user preferences: {{input}}\n\nAnswer: {{agent.input}}"
Load vs Search: When to Use Each
| Use Case | Step to Use | Why |
|---|---|---|
| Find relevant past context | Search Memory | Vector similarity finds the most relevant entries |
| Summarise all interactions | Load Memory | Chronological order gives the full timeline |
| Answer a specific question | Search Memory | Semantic search finds matching context |
| Pre-load preferences or persona | Load Memory | Complete group contents needed, not just top match |
| Export or review history | Load Memory | Ordered, complete list of all entries |
| Inject the last N entries | Load Memory | Newest-first with a limit gives recent entries |