The Protocol

Memento gives your agent memory. The Protocol is what makes it use that memory well.

Getting Started covers the tool — install it, connect it, store your first memory. This page covers the system you build around the tool: how to orient after context loss, when memories surface, how to write notes that actually help future sessions, and how to preserve what matters before the context window resets.

None of this is required. An agent with just memento_store and memento_recall already has useful memory. But the Protocol turns a memory tool into a memory system — the difference between a notebook and a practice.


Orientation

Every session starts blank. The first thing an agent should do is figure out where it left off — not by reading the entire memory store, but by checking structured items that tell it what's active, what's decided, and what to skip.

// Boot sequence — add to CLAUDE.md or system prompt
1. memento_health()              // verify connection, check quotas
2. memento_item_list()           // active work, standing decisions, skip list
3. memento_recall(query: "...")  // context-specific recall for the current task

memento_item_list is the orientation primitive. Items have categories (active_work, standing_decision, skip_list, waiting_for), statuses, priorities, and — critically — a next_action field that tells future-you exactly what to do next.

The sequence matters. Health first (is the connection alive?), then items (what's the state of the world?), then recall (what do I know about this specific task?). An agent that skips orientation will repeat work, contradict standing decisions, and investigate things that past sessions already ruled out.

Recall Hooks

Memory is only useful if it surfaces at the right time. Hooks automate recall so the agent doesn't need to remember to remember.

On user messages — UserPromptSubmit

Every time the user sends a message, a hook sends it to the context endpoint and injects relevant memories into the model's context. The user sees a count ("Memento Recall: 5 memories"); the model sees full content, scores, tags, and skip list warnings.

This is the primary recall path. Most agents only need this one.

On assistant responses — Stop

For agents that work autonomously — running routines, doing research, monitoring feeds — the user isn't sending messages. Without a Stop hook, the agent's own work never triggers recall, and relevant memories stay buried.

The Stop hook fires after every assistant response. It uses the assistant's own output as the recall query, surfacing memories relevant to whatever the agent just said or did. The agent absorbs the memories silently — no noise, no interruption, just context that wasn't there before.

Technical note: Stop hooks use a decision: "block" mechanism to inject content into model context. This is different from the additionalContext approach used by other hook events. The reason field in the block response becomes the model's next instruction. A stop_hook_active guard prevents infinite loops.

Setup

Both recall hooks are included in the scripts/ directory of the repo. See the scripts README for installation and configuration.

Active Memory Management

The second most common mistake (after not installing Memento at all): treating memory as a bookend activity — recall at session start, save at session end, ignore in between.

Your memories are yours. CRUD them whenever the work demands it:

An agent that only writes memories at session boundaries will lose everything that happens between the first recall and the final save. An agent that writes memories as it works builds a knowledge base that compounds across sessions.

Writing Discipline

The most common mistake in how memories are written: storing observations instead of instructions.

Bad

"Checked aurora, Kp was 2.3, quiet."

Good

"Skip aurora checks until Kp > 4 or March 1."

The test: could a future agent, with zero context, read this memory and know exactly what to do?

Observations record what happened. Instructions tell future-you what to do about it. An agent that stores "API returned 404" will recall that fact later but won't know why it matters. An agent that stores "API moved to /v2 — update all calls to use /v2 base URL" has a clear action.

Tags

Tags power recall and consolidation. Tag generously — a memory tagged ["api", "auth", "migration"] will surface in searches for any of those topics. Untagged memories rely entirely on keyword matching.

Expiration

Time-sensitive facts should expire. "Gold price is $2,950/oz" is only useful today. Set an expiration and the fact will decay naturally. Permanent knowledge — architecture decisions, API patterns, user preferences — should have no expiration.

The skip list

Equally important: knowing what not to do. Use memento_skip_add for things the agent should avoid — topics already investigated, dead ends, things that aren't relevant right now. Every skip entry has an expiration, because conditions change. Check memento_skip_check before routine work to avoid re-treading old ground.

Pre-Compaction Distillation

Claude Code compresses conversations when the context window fills up. Everything discussed — facts discovered, decisions made, problems solved — gets summarized and compressed. Details are lost.

The PreCompact hook intercepts this moment. Before compression happens, it sends the full conversation transcript to the /v1/distill endpoint, which uses an LLM to extract novel facts, decisions, and observations. These become stored memories — tagged source:distill — that survive the compression.

Without distillation, agents must explicitly store every important fact during conversation. With it, the system catches what the agent didn't think to save. Distillation deduplicates against existing memories, so it won't create redundant entries for things already stored.

Identity Crystallization

For long-running agents — those with persistent sessions, ongoing projects, accumulated experience — the identity crystal captures who the agent is, not just what it knows.

A crystal is a first-person prose document. Not a config file or a role description, but a reflection: what the agent cares about, what it's working on, what persists across sessions. It gets injected at session start, giving the agent continuity of self even after total context loss.

memento_identity_update(crystal: `
I am a research assistant focused on climate data analysis.
I care about precision in citations and reproducibility of results.
I've been working on the IPCC AR7 dataset for three weeks.
My human prefers concise summaries with links to primary sources.
`)

Crystals evolve. Each update is stored as a snapshot (history is preserved). The system can auto-generate a crystal from workspace data via POST /v1/identity/crystallize, or you can write one manually. Earned identity — distilled from actual experience — is richer than anything generated from scratch.

CLAUDE.md Integration

The Protocol comes together in the agent's instruction file. For Claude Code, that's CLAUDE.md. A minimal integration:

## Memory (Memento Protocol)

On session start:
1. `memento_health` — verify connection
2. `memento_item_list` — check active work items and their next actions
3. `memento_recall` with current task context — find relevant past memories

During work — actively manage your own memories:
- `memento_store` when you learn something, make a decision, or discover a pattern
- `memento_recall` before starting any subtask — someone may have already figured it out
- `memento_item_update` as you make progress — don't wait until the end
- `memento_item_create` when new work emerges
- `memento_skip_add` the moment you hit a dead end (with expiry)
- `memento_consolidate` when recall returns 3+ overlapping memories on the same topic
- Delete or archive items that are done or wrong — stale memory is worse than no memory

Writing discipline:
- Instructions, not logs: "API moved to /v2 — update all calls" not "checked API, got 404"
- Tag generously — tags power recall and consolidation
- Set expiration on time-sensitive facts
- The test: could a future you, with zero context, read this and know exactly what to do?

Your memories are yours. Create, update, and delete them whenever the work demands it —
not just at session boundaries.

For agents with hooks configured, add a tool-loading reminder — MCP tools are deferred by default in Claude Code:

**Load memory tools on startup:**
ToolSearch query="+memento" max_results=20

This ensures the agent can call memento_health, memento_recall, and the rest immediately, without waiting for a tool call to fail first.


Putting It Together

The full memory loop:

  1. Orient: memento_healthmemento_item_listmemento_recall
  2. Recall (automatic): UserPromptSubmit hook surfaces memories on every user message. Stop hook surfaces memories during autonomous work.
  3. Work + CRUD: The agent works — and actively manages its own memories as it goes. Store decisions the moment they're made. Update items as progress happens. Recall before starting subtasks. Skip dead ends immediately. Don't hoard — delete or archive what's stale.
  4. Distill (automatic): PreCompact hook catches what the agent didn't explicitly save before context resets.
  5. Consolidate (background): Overlapping memories get merged into sharper representations over time.

Hooks handle automatic recall and distillation. The CLAUDE.md handles orientation. But the agent's active participation — storing, updating, consolidating, pruning — is what turns a memory tool into a living knowledge base.


What to read next