Memory
Memory lets the agent recall results from earlier steps or from previous runs.
The problem it solves
On each loop iteration, the LLM sees only the current task description and the tool results from that session. Without memory, an agent that ran a task last week cannot use what it learned and starts from scratch every time.
Two situations where you need memory:
- Long multi-step tasks — the agent needs to recall findings from 10 steps ago
- Recurring agents — the agent runs daily and should remember past outcomes
Three backends
| Backend | Persists? | TTL | Setup | Use when |
|---|---|---|---|---|
| InMemoryStore | No (resets on exit) | No | Zero | Development, testing |
| ChromaMemory | Yes (on disk) | No | pip install 'gantrygraph[memory]' | Across multiple runs |
| MiniVecDbMemory | No (in-process) | Yes | pip install 'gantrygraph[minivecdb]' | Long loops where old info should expire |
InMemoryStore
from gantrygraph import GantryEngine
from gantrygraph.memory import InMemoryStore

agent = GantryEngine(
    llm=...,
    memory=InMemoryStore(),
    max_steps=30,
)
InMemoryStore ranks entries by trigram Jaccard similarity, so it needs no extra dependencies and no setup. All entries are lost when the process exits.
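To make the scoring concrete, here is a minimal sketch of trigram Jaccard similarity. It assumes lowercase character trigrams; the source does not say how InMemoryStore tokenises text, so the helper names `trigrams` and `jaccard` are illustrative and not part of gantrygraph.

```python
def trigrams(text: str) -> set[str]:
    """Return the set of lowercase character trigrams in text."""
    s = text.lower()
    # Strings shorter than three characters yield an empty set.
    return {s[i:i + 3] for i in range(len(s) - 2)}


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity of the two trigram sets, in the range [0, 1]."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta and not tb:
        return 0.0  # nothing to compare
    return len(ta & tb) / len(ta | tb)


if __name__ == "__main__":
    # Shared substrings raise the score; unrelated text scores near zero.
    print(jaccard("login failed", "login failure"))
    print(jaccard("login failed", "disk is full"))
```

Because the score depends only on shared character sequences, this kind of matching is lexical: it finds rewordings that share spelling, not paraphrases that share meaning.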
ChromaMemory
from gantrygraph import GantryEngine
from gantrygraph.memory import ChromaMemory

agent = GantryEngine(
    llm=...,
    memory=ChromaMemory(
        collection_name="my_agent",
        persist_directory="/var/lib/agent/memory",
    ),
)
Uses sentence-transformer embeddings with ChromaDB.
The first run downloads the model (~90 MB); subsequent runs use the cache.
Pass persist_directory=None for an in-memory ChromaDB (no disk writes).
MiniVecDbMemory
from gantrygraph import GantryEngine
from gantrygraph.memory import MiniVecDbMemory
from langchain_openai import OpenAIEmbeddings

# embed_fn is a callable that maps a string to its embedding vector
embed = OpenAIEmbeddings(model="text-embedding-3-small").embed_query

agent = GantryEngine(
    llm=...,
    memory=MiniVecDbMemory(
        embed_fn=embed,
        ttl_ms=300_000,  # entries expire after 5 minutes
    ),
)
Backed by a Rust HNSW engine (MiniVecDb) with 1-bit vector quantisation.
Each vector takes 48 bytes (one bit per dimension for a 384-dimensional embedding), 32× less RAM than ChromaDB's float32 vectors.
The ttl_ms parameter makes entries expire automatically: ideal for long
navigation loops where information from early steps becomes stale or
misleading later.
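The memory figures follow from simple arithmetic, sketched below. This assumes sign-based quantisation (one bit per dimension, set when the component is positive) and a 384-dimensional embedding; the source does not describe MiniVecDb's actual scheme, and `quantise_1bit` and `hamming` are illustrative names.

```python
import math


def quantise_1bit(vector: list[float]) -> bytes:
    """Pack one bit per dimension: 1 if the component is positive, else 0."""
    packed = bytearray(math.ceil(len(vector) / 8))
    for i, x in enumerate(vector):
        if x > 0:
            packed[i // 8] |= 1 << (i % 8)
    return bytes(packed)


def hamming(a: bytes, b: bytes) -> int:
    """Count the bits that differ between two packed vectors."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


DIMS = 384  # assumed embedding width
vec = [1.0 if i % 2 == 0 else -1.0 for i in range(DIMS)]
packed = quantise_1bit(vec)
float32_bytes = DIMS * 4  # 4 bytes per float32 component

print(len(packed), float32_bytes, float32_bytes // len(packed))  # 48 1536 32
```

The 32× ratio holds for any embedding width, since it is just 32 bits against 1 bit per dimension. The absolute size scales with the width: a 1,536-dimensional embedding would pack into 192 bytes under the same assumption.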
How it works
Attaching a memory store adds two automatic steps to the agent loop:
| Step | When | What happens |
|---|---|---|
| Recall | Before the first think step | The engine searches the store with the task as the query and injects the top 3 matches as context |
| Store | After the run completes | The engine saves "Task: {task}\nResult: {result}" for future retrieval |
The agent does not call memory tools explicitly; the engine runs both steps on its own.
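The two steps in the table can be pictured as a thin wrapper around the agent loop. The sketch below is not gantrygraph's implementation: `ToyStore`, `Hit`, `run_with_memory`, and `fake_think` are hypothetical names, the whole agent loop is reduced to one callable, and word-overlap scoring stands in for a real backend.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Hit:
    text: str
    score: float


class ToyStore:
    """Hypothetical stand-in for a memory backend (word-overlap scoring)."""

    def __init__(self) -> None:
        self._entries: list[str] = []

    async def add(self, text: str) -> None:
        self._entries.append(text)

    async def search(self, query: str, k: int = 3) -> list[Hit]:
        q = set(query.lower().split())
        hits = []
        for text in self._entries:
            words = set(text.lower().split())
            union = q | words
            score = len(q & words) / len(union) if union else 0.0
            if score > 0:
                hits.append(Hit(text, score))
        hits.sort(key=lambda h: h.score, reverse=True)
        return hits[:k]


async def run_with_memory(task: str, store: ToyStore, think) -> str:
    # Recall: search with the task as the query, before the first think step.
    recalled = await store.search(task, k=3)
    context = [hit.text for hit in recalled]
    # The agent loop itself is reduced to a single awaitable call here.
    result = await think(task, context)
    # Store: save the task and its result once the run completes.
    await store.add(f"Task: {task}\nResult: {result}")
    return result


async def fake_think(task: str, context: list[str]) -> str:
    return f"done with {len(context)} recalled entries"


async def main() -> list[str]:
    store = ToyStore()
    first = await run_with_memory("summarise auth errors", store, fake_think)
    second = await run_with_memory("summarise auth errors again", store, fake_think)
    return [first, second]


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The first run recalls nothing because the store is empty; the second run finds the entry saved by the first, which is the behaviour a recurring agent relies on.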
Searching memory directly
You can also query the store from your own code:
# Run inside an async function; memory is the store instance passed to the engine.
results = await memory.search("authentication errors", k=3)
for r in results:
    print(f"{r.score:.2f} {r.text}")
Each result has a score between 0 and 1; a higher score means a closer match to the query.
Custom backend
Subclass BaseMemory and implement three methods:
from gantrygraph.memory.base import BaseMemory, MemoryResult

class RedisMemory(BaseMemory):
    async def add(self, text: str, metadata: dict | None = None) -> None:
        ...  # store text in Redis

    async def search(self, query: str, k: int = 5) -> list[MemoryResult]:
        ...  # return k most relevant entries

    async def close(self) -> None:
        ...  # clean up connections
Pass it directly to GantryEngine(memory=RedisMemory(...)).
See also: Memory guide · API reference