Long-Term Memory

Persistent storage of information across agent sessions, enabling recall of facts, preferences, or past interactions beyond the context window.

What Is Long-Term Memory?

Long-term memory in AI agents refers to any information that persists beyond a single context window or session. It is stored externally (in databases, vector stores, or file systems) and retrieved on demand when relevant to a new interaction.

Without long-term memory, every conversation starts from zero. The agent has no awareness of past interactions, learned preferences, or accumulated knowledge. Long-term memory turns a stateless model into a stateful agent that can build on what it learned in earlier sessions.

Types of Long-Term Memory

Long-term memory in AI systems broadly encompasses two types:

Episodic long-term memory: Records of specific past events ("On March 10th, the user rejected option B").
Semantic long-term memory: Generalized facts and knowledge ("The user prefers Python and dislikes verbose logging").

In practice, both types coexist in an agent's memory architecture and complement each other.
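
To make the distinction concrete, here is a minimal sketch of how the two types might share one record shape. The MemoryItem class, its field names, and the by_kind helper are illustrative assumptions, not part of any particular library. The year in the example date is arbitrary, since the example above only gives a day and month.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class MemoryItem:
    """Hypothetical record shape shared by both memory types."""
    kind: str                            # "episodic" or "semantic"
    content: str
    occurred_on: Optional[date] = None   # episodic items are tied to a moment
    tags: List[str] = field(default_factory=list)


memories = [
    # Episodic: a specific event (the year here is an arbitrary placeholder).
    MemoryItem("episodic", "The user rejected option B",
               occurred_on=date(2024, 3, 10)),
    # Semantic: a generalized fact with no single timestamp.
    MemoryItem("semantic",
               "The user prefers Python and dislikes verbose logging",
               tags=["preference"]),
]


def by_kind(items: List[MemoryItem], kind: str) -> List[MemoryItem]:
    """Return only the items of the requested memory type."""
    return [m for m in items if m.kind == kind]
```

Keeping both types in one structure lets an agent retrieve them together while still filtering by type when a query is clearly about events or clearly about preferences.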

How Long-Term Memory Is Stored

The storage backend determines what kinds of queries are efficient:

Vector database (e.g., Pinecone, Weaviate, pgvector): Memory items are embedded and stored as vectors. Retrieval is by semantic similarity. Best for "find memories related to this topic."
Relational database: Structured memory with filtering by metadata. Best for "find all memories from last week tagged as user preference."
Document store (e.g., MongoDB): Semi-structured memory with flexible schemas. Good for mixed episodic/semantic stores.
Knowledge graph: Memory as entities and relationships. Best for relational, multi-hop queries.

KnowledgeSDK uses a hybrid approach: knowledge items store full-text content with metadata, and embeddings enable semantic search via /v1/search.
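
The trade-off between similarity search and metadata filtering can be sketched with a small in-memory store. This is not the KnowledgeSDK API or any vector database's client. The StoredMemory class, the search function, and the three-dimensional vectors are made-up stand-ins for real embeddings, used only to show the idea of filtering on metadata and then ranking by cosine similarity.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StoredMemory:
    text: str
    vector: List[float]   # in practice this would come from an embedding model
    metadata: Dict[str, str] = field(default_factory=dict)


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(store: List[StoredMemory], query_vector: List[float],
           top_k: int = 2, **filters: str) -> List[StoredMemory]:
    """Keep items whose metadata matches every filter, then rank by similarity."""
    candidates = [m for m in store
                  if all(m.metadata.get(k) == v for k, v in filters.items())]
    ranked = sorted(candidates,
                    key=lambda m: cosine(m.vector, query_vector),
                    reverse=True)
    return ranked[:top_k]


# Toy data: hand-written vectors, not real embeddings.
store = [
    StoredMemory("Team uses two-week sprints", [0.9, 0.1, 0.0],
                 {"category": "process"}),
    StoredMemory("CEO prefers bullet-point summaries", [0.1, 0.9, 0.0],
                 {"category": "preference"}),
    StoredMemory("User prefers Python", [0.0, 0.8, 0.3],
                 {"category": "preference"}),
]

top = search(store, [0.0, 1.0, 0.1], top_k=1, category="preference")
```

A dedicated vector database would replace the linear scan with an approximate nearest-neighbor index, and a relational backend would push the metadata filter into the query itself.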

The Write-Retrieve Cycle

Long-term memory operates on a write-retrieve cycle:

Writing:

The agent completes a session or encounters significant information.
A memory-writing step serializes the key information (summary, preference, fact).
The item is embedded and stored with metadata (timestamp, session ID, category).

Retrieving:

At the start of a new session, the agent queries long-term memory with the current context.
Relevant memories are retrieved and injected into working memory (the context window).
The agent reasons with both its parametric knowledge and the retrieved memories.
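
The cycle above can be sketched end to end in a few lines. To stay self-contained, this sketch swaps embedding similarity for simple keyword overlap, which is a deliberate simplification. The names MemoryRecord, write_memory, retrieve, and build_context are hypothetical, and the stored sentences are illustrative.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Set


@dataclass
class MemoryRecord:
    text: str
    session_id: str
    category: str
    created_at: datetime


def tokens(text: str) -> Set[str]:
    """Lowercased words with surrounding punctuation stripped."""
    return {w.strip(".,?!\"'").lower() for w in text.split()} - {""}


def write_memory(store: List[MemoryRecord], text: str,
                 session_id: str, category: str) -> MemoryRecord:
    """Write step: store the serialized item with its metadata."""
    record = MemoryRecord(text, session_id, category,
                          datetime.now(timezone.utc))
    store.append(record)
    return record


def retrieve(store: List[MemoryRecord], query: str,
             top_k: int = 3) -> List[MemoryRecord]:
    """Retrieve step: rank by keyword overlap (stand-in for embeddings)."""
    q = tokens(query)
    scored = [(len(q & tokens(r.text)), r) for r in store]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [r for _, r in scored[:top_k]]


def build_context(query: str, memories: List[MemoryRecord]) -> str:
    """Inject retrieved memories ahead of the new request."""
    lines = ["Relevant memories:"] + [f"- {m.text}" for m in memories]
    lines += ["", f"User request: {query}"]
    return "\n".join(lines)


store: List[MemoryRecord] = []
write_memory(store, "Engineering team uses two-week sprints for planning",
             "s1", "fact")
write_memory(store, "CEO prefers bullet-point summaries over prose",
             "s1", "preference")

query = "Can you draft the Q2 planning doc?"
context = build_context(query, retrieve(store, query))
```

Here only the sprint memory shares a keyword with the query, so only it is injected. The CEO preference is missed, which shows why keyword overlap alone is a weak retrieval signal and why semantic similarity is typically used for this step.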

Practical Example

A project management AI with long-term memory:

Remembers from six months ago that the engineering team uses two-week sprints.
Recalls that the CEO prefers bullet-point summaries over prose.
Knows that the product roadmap was last approved in Q1 and has been updated since.

When asked "can you draft the Q2 planning doc?", the agent retrieves these memories and produces a contextually accurate draft without being re-briefed.

Design Considerations

Retention policy: Not all memories should be kept forever. Stale facts need expiration or revision mechanisms.
Access control: Memories may be scoped to a user, team, or workspace to prevent data leakage.
Memory prioritization: When many memories are retrieved, ranking by recency, relevance, and confidence helps select the best subset for the context window.
Conflict resolution: Two stored facts may contradict each other. The agent needs a strategy for handling conflicts (prefer newer, prefer higher-confidence, surface both).
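
Three of these considerations (retention, conflict resolution, and prioritization) can be combined into one selection step. The sketch below is one possible arrangement, not a standard algorithm: the Candidate class, the one-year retention window, the 30-day recency half-life, and the 0.5/0.3/0.2 weights are all arbitrary assumptions, and the facts are invented sample data. Conflicts are resolved with the "prefer newer" strategy.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List


@dataclass
class Candidate:
    key: str             # what the fact is about, e.g. "sprint_length"
    value: str
    relevance: float     # 0..1, e.g. a similarity score from retrieval
    confidence: float    # 0..1
    updated_at: datetime


def drop_expired(items: List[Candidate], now: datetime,
                 max_age: timedelta) -> List[Candidate]:
    """Retention policy: discard memories older than max_age."""
    return [c for c in items if now - c.updated_at <= max_age]


def resolve_conflicts(items: List[Candidate]) -> List[Candidate]:
    """Conflict resolution: for each key, keep only the newest value."""
    newest: Dict[str, Candidate] = {}
    for c in items:
        current = newest.get(c.key)
        if current is None or c.updated_at > current.updated_at:
            newest[c.key] = c
    return list(newest.values())


def score(c: Candidate, now: datetime, half_life_days: float = 30.0) -> float:
    """Prioritization: weighted mix of relevance, recency, and confidence."""
    age_days = (now - c.updated_at).total_seconds() / 86400
    recency = 0.5 ** (age_days / half_life_days)   # halves every half-life
    return 0.5 * c.relevance + 0.3 * recency + 0.2 * c.confidence


def select(items: List[Candidate], now: datetime,
           max_age: timedelta = timedelta(days=365),
           top_k: int = 2) -> List[Candidate]:
    kept = resolve_conflicts(drop_expired(items, now, max_age))
    return sorted(kept, key=lambda c: score(c, now), reverse=True)[:top_k]


now = datetime(2025, 1, 1)   # fixed so the example is deterministic
items = [
    Candidate("sprint_length", "two weeks", 0.9, 0.8, now - timedelta(days=180)),
    Candidate("sprint_length", "three weeks", 0.9, 0.8, now - timedelta(days=10)),
    Candidate("summary_style", "bullet points", 0.6, 0.9, now - timedelta(days=60)),
    Candidate("office_location", "Berlin", 0.2, 0.5, now - timedelta(days=400)),
]
selected = select(items, now)
```

In this sample, the 400-day-old item is dropped by the retention window, the older sprint length loses to the newer one, and the two survivors are ranked by score. A real system might instead surface both conflicting values to the user, or prefer the higher-confidence one, depending on how costly a wrong answer is.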