
Memory Steps

For common fields, string substitutions, metadata filters, caching, and execution order, see the Agent Steps Overview.


Memory

Memory gives agents the ability to remember information across runs. Unlike retrieval (which searches external knowledge bases), memory is a system-managed store that agents write to and read from automatically. Each memory bank controls its own embedding, compaction, and retention settings — create and manage them from the Memory Banks page.

How It Works

Each memory bank has its own content source that stores embedded entries — chunked, vectorized, and ready for similarity search — just like any other knowledge base content. The source is provisioned automatically when the bank is created.

Each memory entry is partitioned by a key (e.g. {{metadata.user_id}} for per-user memory, or a custom key for per-project or per-session scoping). This lets a single memory store serve many independent groupings without interference.
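Conceptually, a key works like a partition label: writes go into the key's group, and reads only ever see that group. A minimal sketch (the `MemoryBankSketch` class is hypothetical, not the platform's implementation — real banks also chunk and embed entries for vector search):

```python
from collections import defaultdict

class MemoryBankSketch:
    """Illustrative model of key-based partitioning only."""

    def __init__(self):
        self._entries = defaultdict(list)  # key -> entries for that key

    def add(self, key, content):
        self._entries[key].append(content)

    def load(self, key):
        # Reads are scoped to one key; other keys never interfere.
        return list(self._entries[key])

bank = MemoryBankSketch()
bank.add("user-42", "prefers metric units")
bank.add("user-99", "prefers imperial units")
```

With this scoping, `bank.load("user-42")` sees only user 42's entries, which is why one memory store can serve many users without cross-contamination.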

There are two memory bank types, but the same three memory steps work with both:

  • Conversation banks — Designed for chat-style use cases. Entries are tagged with a speaker ("user" or "assistant"). The speaker field is required when writing to a conversation bank.
  • General banks — Designed for facts, preferences, entity data, and standalone knowledge. No speaker concept — the speaker field is not used when writing to a general bank.

The bank type is determined automatically from the memory_bank_id — you don't need to choose different step types for different bank types.

The Three Memory Steps

Step          | Purpose                                  | Best For
Add Memory    | Write a new entry to a memory bank       | Storing facts, preferences, decisions, summaries
Search Memory | Vector similarity search across entries  | Recalling relevant past context before a prompt call
Load Memory   | Load all entries chronologically         | Summarization, timeline review, full-context prompting

Use Search when you want the most semantically relevant entries for a query. Use Load when you need the complete history in order (e.g. to summarize or compact it).
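The difference can be sketched in a few lines — `search` ranks by relevance (here a crude term-overlap stand-in for real vector similarity), while `load` simply orders by creation time (entries and function names are illustrative):

```python
# Hypothetical entries; created_at stands in for the stored timestamp.
entries = [
    {"content": "likes hiking", "created_at": 1},
    {"content": "prefers email over phone", "created_at": 2},
    {"content": "asked about hiking boots", "created_at": 3},
]

def search(query_terms, top_n=2):
    # Search Memory: rank by relevance, keep the top n.
    score = lambda e: sum(t in e["content"] for t in query_terms)
    return sorted(entries, key=score, reverse=True)[:top_n]

def load(order="newest_first"):
    # Load Memory: chronological order, no relevance ranking.
    return sorted(entries, key=lambda e: e["created_at"],
                  reverse=(order == "newest_first"))
```

Searching for "hiking" surfaces the two hiking-related entries regardless of when they were written; loading returns every entry in time order.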

Key Concepts

  • Key — A string that partitions memory entries. Use {{metadata.user_id}} for per-user memory, project-{{metadata.project_id}} for per-project memory, or any custom key. All search and load operations are automatically scoped to the specified key.
  • Speaker (conversation banks only) — Each entry records who said what ("user" or "assistant"). Required when writing to a conversation bank; omitted for general banks.
  • Metadata — Optional key-value pairs attached to memory entries. Metadata values are stored alongside the entry and can be used for filtering in search steps.
  • Memory banks — Create and manage memory banks from the Memory Banks page or via the Public API and MCP tools. Each bank controls its embedding model, compaction prompt, and retention settings.

Common Patterns

Remember and Recall — The most common pattern. Search memory at the start of a workflow to inject past context, generate a response, then extract and store new facts for the next run.

Step 1: Search Memory (key: "{{metadata.user_id}}", query: "{{input}}")
  Step 2: Prompt Call — "Past context:\n{{input}}\n\nUser question: {{agent.input}}"
    Step 3: Prompt Call — "Extract new facts from this exchange"
      Step 4: Add Memory (key: "{{metadata.user_id}}", speaker: "assistant")
    Step 5: Streaming Result
Figure 1. Remember and Recall — the agent searches memory for context, generates a response, extracts new facts in parallel, and stores them back.

Progressive Summarization — Periodically load the full history, summarize it, and store the summary as a compacted entry. This keeps memory manageable as conversations grow.

Step 1: Load Memory (key: "{{metadata.user_id}}", order: "oldest_first")
  Step 2: Prompt Call — "Summarize this conversation history into key points"
    Step 3: Add Memory (key: "{{metadata.user_id}}", speaker: "assistant")
Figure 2. Progressive Summarization — the agent loads full history, summarizes it, and stores the summary as a new entry.

Preference Learning — Extract user preferences from conversations and persist them for future personalization.

Step 1: Insight — "Extract any user preferences from: {{input}}"
  Step 2: Add Memory (key: "{{metadata.user_id}}")
Figure 3. Preference Learning — the agent extracts user preferences and stores them in memory for future runs.

Combined pattern (personalised chatbot — uses BOTH bank types):

The Add Memory step on a conversation bank already returns the conversation history as a messages array (see Conversation Bank Output Format), so there's no need for a Load Memory step after it — the history flows directly into the Prompt Call.

Step 1: Search Memory (general bank, key: "{{metadata.user_id}}") — recall preferences
  Step 2: Add Memory (conversation bank, key: "{{metadata.user_id}}", speaker: "user") — save user msg → returns history
    Step 3: Prompt Call — generate a personalised response using preferences + history
      Step 4a: Streaming Result
      Step 4b: Add Memory (conversation bank, key: "{{metadata.user_id}}", speaker: "assistant") — save reply
      Step 4c: Add Memory (general bank, key: "{{metadata.user_id}}") — store new preferences
Figure 4. Combined pattern — the agent recalls preferences from a general bank, saves the user message to a conversation bank (which returns the history), generates a personalised response, then streams and persists in parallel.

Memory vs. Retrieval

             | Memory                                             | Retrieval
Data source  | Agent-written entries (automatic)                  | External content from sources (RSS, websites, uploads)
Setup        | None — content source provisioned at bank creation | Requires source + knowledge base configuration
Scoping      | Partitioned by key                                 | Filtered by knowledge base and metadata
Write method | Add Memory step                                    | Source connections ingest content
Read method  | Search Memory or Load Memory steps                 | Retrieval step
Best for     | Per-user/session context, preferences, facts       | Shared organizational knowledge, documentation

Add Memory Step

Writes content to a memory bank. Each memory entry is embedded for vector search and partitioned by key, allowing agents to remember information across runs (e.g. user preferences, past decisions, accumulated facts). For conversation-type banks, the speaker field is also required.

Fields

Field          | Type   | Required | Default | Description
memory_bank_id | UUID   | Yes      |         | The memory bank to write to. Select one from the Memory Banks page.
key            | string | Yes      |         | A key that partitions memory entries (e.g. per user, per session). Supports substitutions — use {{metadata.user_id}} for per-user memory.
speaker        | string | No       | null    | Identifies who is "speaking" — "user" or "assistant". Required for conversation banks, omitted for general banks. Supports substitutions.
content        | string | No       | null    | The content to store in memory. If omitted, the step's input is used (equivalent to {{input}}). Supports substitutions.
entry_metadata | object | No       | null    | Optional metadata object to attach to the entry. Stored as key-value pairs and available for filtering.

Behavior

  • The memory bank's content source is provisioned when the bank is created — no manual setup required.
  • Memory entries are stored as content versions, chunked, and embedded for vector similarity search.
  • The key is stored as metadata on each entry, allowing the search step to filter by key automatically.
  • For conversation banks, the speaker field is stored as metadata, enabling you to distinguish who said what when loading or searching conversation history.
  • The step's input is passed through as-is to child steps (the memory write is a side effect).
  • Agent ID and agent run ID are automatically included in the entry's metadata for traceability.
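Taken together, the metadata stored alongside an entry might look like this — a sketch only, with illustrative field names and IDs, not the exact storage schema:

```python
def build_entry_metadata(key, speaker=None, entry_metadata=None,
                         agent_id="agent-123", agent_run_id="run-456"):
    # The key and agent identifiers are always stored; speaker only
    # applies to conversation banks. (All names here are illustrative.)
    meta = {"key": key, "agent_id": agent_id, "agent_run_id": agent_run_id}
    if speaker is not None:
        meta["speaker"] = speaker
    if entry_metadata:
        meta.update(entry_metadata)  # user-supplied pairs, filterable later
    return meta
```

Because the key lands in metadata like everything else, search steps can scope to it with an ordinary metadata filter.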

Conversation Bank Output Format

When writing to a conversation bank, the Add Memory step returns the updated conversation history formatted as a JSON messages array — up to 100 entries in chronological order. The actual entries depend on the bank's compaction settings: compaction consolidates older entries into summaries, so the history you get back is the compacted version rather than raw individual messages:

[
  { "role": "user", "content": "What's the weather like?" },
  { "role": "assistant", "content": "It's sunny and 22°C today." },
  { "role": "user", "content": "Should I bring a jacket?" }
]

This format is directly compatible with a downstream Prompt Call that has use_step_input_as_messages: true (see Input as Messages). Each message is sent as a separate element in the LLM API call, which enables prompt caching on supported providers — the model caches the stable prefix of older messages and only processes the newest ones, reducing latency and cost on every turn.
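The caching benefit comes from the history being append-only: each turn's messages array starts with the previous turn's array verbatim. A small sketch:

```python
# Two consecutive turns of the messages array returned by Add Memory.
turn_1 = [
    {"role": "user", "content": "What's the weather like?"},
    {"role": "assistant", "content": "It's sunny and 22°C today."},
]
turn_2 = turn_1 + [{"role": "user", "content": "Should I bring a jacket?"}]

# The stable prefix is exactly what a provider can cache: only the
# final message is new work on the second request.
assert turn_2[:len(turn_1)] == turn_1
new_messages = turn_2[len(turn_1):]
```

On each turn, only `new_messages` needs full processing; everything before it can be served from the provider's prompt cache.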

Tip: For general banks, the step is a pass-through — the written content is returned as-is to child steps.

Use Case Examples

Save a conversation turn (conversation bank):

Step 1: Add Memory (key: "{{metadata.user_id}}", speaker: "user", content: "{{agent.input}}")
Figure 5. Save conversation turn — the agent immediately stores the user's message in a conversation memory bank.

Chatbot pattern — add user message, respond, save assistant reply:

The most common conversational pattern. The Add Memory step saves the user's message and returns the full conversation history as a messages array. A Prompt Call with use_step_input_as_messages: true sends the history directly to the model (enabling prompt caching). The response is then delivered and saved in parallel.

Step 1: Add Memory (conversation bank, key: "{{metadata.user_id}}", speaker: "user", content: "{{agent.input}}")
  Step 2: Prompt Call (use_step_input_as_messages: true, system_template: "You are a helpful assistant.")
    Step 3a: Streaming Result
    Step 3b: Add Memory (conversation bank, key: "{{metadata.user_id}}", speaker: "assistant")

The Add Memory step in step 1 returns the conversation history as:

[
  { "role": "user", "content": "previous message..." },
  { "role": "assistant", "content": "previous reply..." },
  { "role": "user", "content": "current message" }
]

This flows directly into the Prompt Call, which sends each message individually to the model. Older messages form a stable prefix that providers can cache — only the newest message needs full processing.

Figure 6. Chatbot pattern — the agent saves the user message, sends the conversation history as messages to the model (enabling prompt caching), then streams the response and saves it in parallel.

Store user preferences (general bank):

Step 1: Prompt Call — "Extract any user preferences from: {{input}}"
Step 2: Add Memory (key: "{{metadata.user_id}}", content: "{{input}}")
Figure 7. Extract preferences → Add Memory — the agent extracts user preferences and stores them in a general memory bank.

Accumulate facts across runs:

Step 1: Insight — Extract key facts from the input content
Step 2: Add Memory (key: "project-{{metadata.project_id}}", entry_metadata: {"type": "fact"})
Figure 8. Insight → Add Memory (facts) — the agent extracts key facts and accumulates them in memory across runs.

Search Memory Step

Searches a memory bank using vector similarity search. Retrieves memory entries previously written by the Add Memory step, optionally scoped to a key. This allows agents to recall past context — such as user preferences, previous decisions, or accumulated knowledge — and use it in the current run.

Fields

Field          | Type    | Required | Default          | Description
memory_bank_id | UUID    | Yes      |                  | The memory bank to search. Select one from the Memory Banks page.
key            | string  | No       | null             | Optional key to scope the search (must match the key used when writing). For cross-key searches, omit this field. Supports substitutions.
query          | string  | No       | null             | The query used to search memory entries. If not specified, the output from the previous step is used. Supports substitutions.
top_n          | integer | No       | 10               | The number of top memory entries to retrieve (1–100).
added_after    | string  | No       | null             | Time filter for entries added after this time. Supports time substitutions ({{1 week ago}}, {{yesterday}}).
added_before   | string  | No       | null             | Time filter for entries added before this time. Supports time substitutions.
filter         | object  | No       | null             | Optional metadata filter to further narrow results (e.g. by tags). Uses the same MongoDB-style filter syntax as retrieval steps.
content_type   | enum    | No       | application/json | Output format: application/json, text/html, text/plain, or application/xml (XML requests are coerced to JSON, matching retrieval).

Behavior

  • When a key is provided, the search is scoped by injecting {"key": {"$eq": "<key>"}} as a metadata filter. Any user-supplied filter is combined with this using $and.
  • Output format matches the retrieval step: JSON returns {"matches": [...]} with content_title, content, content_url, content_publish_date, etc. For application/xml, the response body is still JSON (coerced to match retrieval behavior).
  • If no memory entries exist yet (first run), the search returns an empty matches array.
  • The memory bank's content source is provisioned when the bank is created.
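The key-scoping rule can be sketched as follows (MongoDB-style operators as described above; the function name is illustrative):

```python
def build_search_filter(key=None, user_filter=None):
    # With a key, inject an exact-match clause and AND it with any
    # user-supplied metadata filter; otherwise pass filters through.
    key_clause = {"key": {"$eq": key}} if key is not None else None
    if key_clause and user_filter:
        return {"$and": [key_clause, user_filter]}
    return key_clause or user_filter
```

So a search with `key: "u1"` and `filter: {"category": {"$eq": "decision"}}` effectively runs with `{"$and": [{"key": {"$eq": "u1"}}, {"category": {"$eq": "decision"}}]}`.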

Use Case Examples

Recall user preferences before generating a response:

Step 1: Search Memory (key: "{{metadata.user_id}}", query: "user preferences")
  Step 2: Prompt Call — "Using these remembered preferences: {{input}}\n\nAnswer: {{agent.input}}"
Figure 9. Search Memory → Prompt Call — the agent recalls user preferences from memory and uses them to personalise the response.

Search memories with metadata filtering:

Step 1: Search Memory (
  key: "project-{{metadata.project_id}}",
  query: "{{input}}",
  filter: {"category": {"$eq": "decision"}},
  top_n: 20
)
Figure 10. Search Memory (filtered) — the agent searches memory with a metadata filter to find only decision-type entries.

Time-bounded memory recall:

Step 1: Search Memory (
  key: "{{metadata.user_id}}",
  query: "{{input}}",
  added_after: "{{1 week ago}}",
  top_n: 5
)
Figure 11. Search Memory (time-bounded) — the agent searches only recent memory entries from the past week.

Cross-key search (no key):

Step 1: Search Memory (query: "{{input}}", top_n: 10)
  Step 2: Prompt Call — "Based on this knowledge: {{input}}\n\nAnswer: {{agent.input}}"
Figure 12. Search Memory (cross-key) — the agent searches across all memory keys to find globally relevant context.

Building a Memory-Augmented Agent

Combine the write and search steps to create agents with persistent memory:

Pattern                   | Description
Remember & Recall         | Search memory at the start, generate a response, then write new facts back to memory
Preference Learning       | Extract user preferences from conversations and store them for future personalization
Progressive Summarization | Periodically summarize accumulated memories and store the summary as a new higher-level memory entry
Context Enrichment        | Search memory to add relevant past context before a prompt call or retrieval step

Example: Full memory-augmented Q&A agent:

Step 1: Search Memory (key: "{{metadata.user_id}}", query: "{{input}}")
  Step 2: Prompt Call — "Past context:\n{{input}}\n\nUser question: {{agent.input}}\n\nAnswer the question, referencing past context where relevant."
    Step 3: Prompt Call — "Extract any new facts or preferences from this exchange:\nQ: {{agent.input}}\nA: {{input}}"
      Step 4: Add Memory (key: "{{metadata.user_id}}", content: "{{input}}")
    Step 5: Streaming Result

This agent: searches memory for relevant past context, generates a response, extracts new facts from the exchange, and stores them for future runs.

Figure 13. Memory-augmented Q&A: the agent recalls context from memory, generates a response, extracts new facts, stores them back, and delivers the answer.

Load Memory Step

Loads all memory entries for a key in chronological order. Unlike the Search Memory step (which performs vector similarity search), this step retrieves the full list of entries ordered by creation time. Use this when you need the complete entry set rather than the most semantically relevant matches.

Fields

Field          | Type    | Required | Default          | Description
memory_bank_id | UUID    | Yes      |                  | The memory bank to load from. Select one from the Memory Banks page.
key            | string  | Yes      |                  | The key used to scope the memory load (must match the key used when writing). Supports substitutions.
order          | enum    | No       | newest_first     | newest_first returns the most recent entries first; oldest_first returns the earliest entries first.
limit          | integer | No       | 100              | Maximum number of entries to load (1–1000).
content_type   | enum    | No       | application/json | Output format: application/json, text/html, text/plain, or application/xml.

Behavior

  • No vector search is performed — entries are retrieved by creation date in the specified order.
  • The key is matched exactly against entry metadata.
  • JSON output returns {"entries": [...]} with each entry containing content and created_at.
  • Text output returns entries separated by --- dividers.
  • If no entries exist for the given key, an empty result is returned.
  • This is a composite step — it supports child steps that receive the loaded entries as input.
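A sketch of the two output shapes described above, with entries reduced to the documented content and created_at fields (the sample data is illustrative):

```python
import json

entries = [
    {"content": "Kickoff meeting notes", "created_at": "2024-01-02"},
    {"content": "Decided on weekly syncs", "created_at": "2024-01-09"},
]

def as_json(entries):
    # application/json: a single {"entries": [...]} object
    return json.dumps({"entries": entries})

def as_text(entries):
    # text/plain: entry contents separated by --- dividers
    return "\n---\n".join(e["content"] for e in entries)
```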

Conversation Bank Output Format

When loading from a conversation bank, the Load Memory step auto-detects conversation entries and formats the output as a JSON messages array — the same format used by Add Memory:

[
  { "role": "user", "content": "Hello" },
  { "role": "assistant", "content": "Hi! How can I help?" },
  { "role": "user", "content": "Tell me about prompt caching" }
]

This is directly compatible with a downstream Prompt Call that has use_step_input_as_messages: true (see Input as Messages). Use order: "oldest_first" so the oldest messages form a stable prefix that providers can cache — using newest_first would change the prefix on every turn, defeating the cache.
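Why oldest_first matters for caching, in miniature: with append-only history, oldest-first output keeps the previous turn as a verbatim prefix, while newest-first changes the very first element on every turn (messages abbreviated to strings for the sketch):

```python
def history(order, entries):
    # entries are stored oldest to newest; Load Memory returns them
    # in the requested order.
    return list(reversed(entries)) if order == "newest_first" else list(entries)

turn_1 = ["m1", "m2"]
turn_2 = ["m1", "m2", "m3"]

# oldest_first: the previous output is a prefix of the new one (cacheable).
assert history("oldest_first", turn_2)[:2] == history("oldest_first", turn_1)
# newest_first: the first element differs on every turn (cache-busting).
assert history("newest_first", turn_1)[0] != history("newest_first", turn_2)[0]
```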

Tip: For general banks, the output is a JSON array of [{"content", "title"}] objects.

Use Case Examples

Load full history for summarisation:

Step 1: Load Memory (key: "{{metadata.user_id}}", order: "oldest_first")
  Step 2: Prompt Call — "Summarise the following conversation history: {{input}}"
    Step 3: Add Memory (key: "{{metadata.user_id}}", speaker: "assistant")
Figure 14. Load Memory → Summarize → Add Memory — the agent loads full conversation history, summarizes it, and stores the summary.

Load recent entries for context:

Step 1: Load Memory (key: "{{metadata.user_id}}", order: "newest_first", limit: 10)
  Step 2: Prompt Call — "Based on recent context: {{input}}\n\nAnswer: {{agent.input}}"
Figure 15. Load Memory (recent) → Prompt Call — the agent loads the 10 most recent entries and uses them as context.

Export entries as plain text:

Step 1: Load Memory (
  key: "project-{{metadata.project_id}}",
  order: "oldest_first",
  limit: 1000,
  content_type: "text/plain"
)
Figure 16. Load Memory (export) — the agent loads all entries as plain text for export or downstream processing.

Load all preferences before generating a personalised response:

Step 1: Load Memory (key: "{{metadata.user_id}}", order: "newest_first")
  Step 2: Prompt Call — "Given these user preferences: {{input}}\n\nAnswer: {{agent.input}}"
Figure 17. Load Memory (preferences) → Prompt Call — the agent loads all stored preferences to personalise the response.

Search vs. Load

Use Case                        | Step to Use   | Why
Find relevant past context      | Search Memory | Vector similarity finds the most relevant entries
Summarise all interactions      | Load Memory   | Chronological order gives the full timeline
Answer a specific question      | Search Memory | Semantic search finds matching context
Pre-load preferences or persona | Load Memory   | The complete entry set is needed, not just the top matches
Export or review history        | Load Memory   | Ordered, complete list of all entries
Inject the last N entries       | Load Memory   | Newest-first with a limit gives recent entries

← Back to Agent Steps Overview