CORE CONCEPTS
Memory & Context
Memory allows your agents to maintain state across conversations, remember user preferences, and access long-term knowledge. The SDK provides three memory layers: conversation history, session memory, and persistent memory.
Conversation History
By default, agents keep conversation history per conversation, identified by a conversation ID. Each turn is stored and included in subsequent requests that pass the same conversationId, so the agent has the full context.
Conversation history
// Conversation history is automatic with conversationId
const turn1 = await client.agents.execute(agent.id, {
  input: "My name is Sarah and I need help with billing",
});

const turn2 = await client.agents.execute(agent.id, {
  input: "I was charged twice for my subscription",
  conversationId: turn1.conversationId,
});
// Agent knows the user is Sarah and the topic is billing

// Retrieve conversation history
const history = await client.conversations.get(turn1.conversationId);
console.log(history.turns); // all messages in order
console.log(history.tokenCount); // total tokens used
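To make the threading behavior concrete, here is a minimal in-memory sketch of how turns could be grouped by conversation ID. It is an illustration only, not the SDK's implementation: `ConversationLog`, `Turn`, and `record` are hypothetical names, and the real service may store and replay turns differently.

```typescript
// Hypothetical model of conversation history; not the SDK's implementation.
type Turn = { role: "user" | "agent"; text: string };

class ConversationLog {
  private conversations = new Map<string, Turn[]>();
  private nextId = 1;

  // Records a user turn. Omitting conversationId starts a new conversation,
  // mirroring how the first execute() call above returns a fresh ID.
  record(input: string, conversationId?: string): { conversationId: string; context: Turn[] } {
    const id = conversationId ?? `conv_${this.nextId++}`;
    const turns = this.conversations.get(id) ?? [];
    turns.push({ role: "user", text: input });
    this.conversations.set(id, turns);
    // The context for this request is every turn stored so far, in order.
    return { conversationId: id, context: [...turns] };
  }
}

const log = new ConversationLog();
const first = log.record("My name is Sarah and I need help with billing");
const second = log.record("I was charged twice for my subscription", first.conversationId);
const unrelated = log.record("Hello");

console.log(second.context.length); // 2: both user turns are in context
console.log(unrelated.context.length); // 1: a new conversation has no prior turns
```

The point of the sketch is that context follows the ID: a request that omits conversationId starts from an empty history.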
Session Memory
Session memory stores structured data that persists for the duration of a conversation. Use it for user preferences, extracted data, and intermediate state that the agent needs to reference.
Session memory
const result = await client.agents.execute(agent.id, {
  input: "I prefer email communication and my timezone is PST",
  conversationId: "conv_abc123",
  memory: {
    session: {
      // Key-value pairs available to the agent
      userName: "Sarah Chen",
      accountId: "ACC-789",
      preferredContact: "email",
      timezone: "America/Los_Angeles",
    },
  },
});

// Session memory is included in all subsequent turns
const result2 = await client.agents.execute(agent.id, {
  input: "When will my refund arrive?",
  conversationId: "conv_abc123",
  // session memory carries forward automatically
});
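The carry-forward behavior can be pictured as a per-conversation key-value map that each turn reads and optionally updates. The sketch below assumes a shallow merge where later values overwrite earlier ones; the SDK's actual merge rules are not stated here, and `resolveSession` is a hypothetical helper.

```typescript
// Hypothetical model of session memory; the SDK's actual merge rules may differ.
type SessionMemory = Record<string, string>;

const sessions = new Map<string, SessionMemory>();

// Assumption: keys passed on a later turn overwrite earlier values (shallow merge),
// and turns that pass no memory reuse what is already stored.
function resolveSession(conversationId: string, update?: SessionMemory): SessionMemory {
  const merged: SessionMemory = { ...(sessions.get(conversationId) ?? {}), ...(update ?? {}) };
  sessions.set(conversationId, merged);
  return merged;
}

const turnOne = resolveSession("conv_abc123", { userName: "Sarah Chen", preferredContact: "email" });
const turnTwo = resolveSession("conv_abc123"); // no memory passed, values carry forward
const turnThree = resolveSession("conv_abc123", { preferredContact: "phone" });

console.log(turnTwo.userName); // "Sarah Chen"
console.log(turnThree.preferredContact); // "phone"
```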
Persistent Memory
Persistent memory survives across conversations and sessions. It acts as a knowledge base that the agent can read from and write to. This is ideal for storing user profiles, company knowledge, and learned preferences.
Persistent memory
// Create a memory store for the agent
const store = await client.memory.create({
  agentId: agent.id,
  name: "customer-profiles",
  type: "key-value", // or "vector" for semantic search
});

// Write to persistent memory
await client.memory.set(store.id, {
  key: "customer:sarah-chen",
  value: {
    name: "Sarah Chen",
    accountId: "ACC-789",
    preferredContact: "email",
    pastIssues: ["billing-2024-01", "shipping-2024-03"],
    satisfaction: "high",
  },
  ttl: 86400 * 90, // expire after 90 days
});

// Agent can read memory during execution
const result = await client.agents.execute(agent.id, {
  input: "Hi, this is Sarah again",
  memoryStores: [store.id], // agent can access this store
});
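The `ttl` value in the example is plain arithmetic: 86400 is the number of seconds in a day, so the expression reads as 90 days. The sketch below works that out and shows one way expiry could be checked. It assumes the TTL is expressed in seconds, which the expression suggests but the text does not state; `daysToTtl` and `isExpired` are hypothetical helpers, not SDK functions.

```typescript
// Assumption: ttl is expressed in seconds, as 86400 (seconds per day) * 90 suggests.
const SECONDS_PER_DAY = 86400;

function daysToTtl(days: number): number {
  return days * SECONDS_PER_DAY;
}

// Returns true once the entry's TTL has elapsed. Timestamps are epoch milliseconds.
function isExpired(writtenAtMs: number, ttlSeconds: number, nowMs: number): boolean {
  return nowMs >= writtenAtMs + ttlSeconds * 1000;
}

const ttl = daysToTtl(90);
console.log(ttl); // 7776000

const writtenAt = Date.UTC(2024, 0, 1); // 1 January 2024, 00:00 UTC
console.log(isExpired(writtenAt, ttl, Date.UTC(2024, 2, 30))); // false: 89 days later
console.log(isExpired(writtenAt, ttl, Date.UTC(2024, 2, 31))); // true: 90 days later
```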
Vector Memory (Semantic Search)
For large knowledge bases, use vector memory to enable semantic search. The agent can search for relevant information using natural language queries rather than exact key lookups.
Vector memory
// Create a vector memory store
const knowledgeBase = await client.memory.create({
  agentId: agent.id,
  name: "product-knowledge",
  type: "vector",
  embeddingModel: "text-embedding-3-small",
});

// Add documents
await client.memory.addDocuments(knowledgeBase.id, [
  {
    content: "The Pro plan includes unlimited channels, priority support...",
    metadata: { category: "pricing", plan: "pro" },
  },
  {
    content: "Returns must be initiated within 30 days of delivery...",
    metadata: { category: "policy", topic: "returns" },
  },
  // Add hundreds or thousands of documents
]);

// Agent automatically searches relevant context
const result = await client.agents.execute(agent.id, {
  input: "What is your return policy?",
  memoryStores: [knowledgeBase.id],
});
// Agent retrieves the returns policy document and uses it
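Semantic search generally works by embedding the query and the documents as vectors, then ranking documents by similarity. The toy sketch below uses cosine similarity over hand-made 3-dimensional vectors to show the idea. The vectors are invented for illustration (real embeddings come from the embedding model and have far more dimensions), and the ranking method used by the SDK is an assumption.

```typescript
// Toy illustration of vector search. The 3-dimensional embeddings are made up;
// real embeddings are produced by the embedding model.
type Doc = { content: string; embedding: number[] };

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Returns the topK documents most similar to the query, best match first.
function search(query: number[], docs: Doc[], topK: number): Doc[] {
  return [...docs]
    .sort((d1, d2) => cosineSimilarity(query, d2.embedding) - cosineSimilarity(query, d1.embedding))
    .slice(0, topK);
}

const docs: Doc[] = [
  { content: "The Pro plan includes unlimited channels...", embedding: [0.9, 0.1, 0.0] },
  { content: "Returns must be initiated within 30 days...", embedding: [0.1, 0.9, 0.2] },
];

const queryEmbedding = [0.2, 0.8, 0.1]; // stand-in for "What is your return policy?"
const best = search(queryEmbedding, docs, 1)[0];
console.log(best?.content); // the returns document ranks first
```

Because ranking depends on vector direction rather than shared keywords, a query phrased differently from the stored text can still match the right document.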
Context Window Management
The SDK automatically manages the context window to stay within model limits. When conversations grow long, older turns are summarized to preserve the most important information.
Context management
const agent = await client.agents.create({
  name: "Support Agent",
  model: "claude-sonnet-4-20250514",
  memory: {
    enabled: true,
    maxTurns: 50, // keep last 50 turns in full
    summarize: true, // summarize older turns
    summaryModel: "claude-3-5-haiku-20241022", // fast model for summaries
    contextBudget: 0.8, // use 80% of context for history
  },
});
Note
The contextBudget parameter controls how much of the model's context window is allocated to memory. The remaining space is reserved for the system prompt, tools, and the current user input.
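The settings above reduce to simple arithmetic. The sketch below assumes `contextBudget` is a fraction of the model's context window and `maxTurns` is the number of most recent turns kept verbatim. `historyTokenBudget` and `splitTurns` are hypothetical helpers, and the 200,000-token window is an illustrative figure, not a documented limit.

```typescript
// Assumption: contextBudget is a fraction of the model's context window,
// and maxTurns is the number of most recent turns kept verbatim.
function historyTokenBudget(contextWindowTokens: number, contextBudget: number): number {
  return Math.floor(contextWindowTokens * contextBudget);
}

// Splits turns into the older ones to summarize and the recent ones to keep in full.
function splitTurns<T>(turns: T[], maxTurns: number): { toSummarize: T[]; keptInFull: T[] } {
  const cut = Math.max(0, turns.length - maxTurns);
  return { toSummarize: turns.slice(0, cut), keptInFull: turns.slice(cut) };
}

// A 200,000-token window (illustrative) with contextBudget 0.8
console.log(historyTokenBudget(200000, 0.8)); // 160000

// A 60-turn conversation with maxTurns 50
const turns = Array.from({ length: 60 }, (_, i) => `turn ${i + 1}`);
const plan = splitTurns(turns, 50);
console.log(plan.toSummarize.length, plan.keptInFull.length); // 10 50
```

Under these assumptions, the remaining 40,000 tokens would be left for the system prompt, tools, and the current user input.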
Memory Events
Subscribe to memory events to track what your agent stores and retrieves:
Memory events
// Listen for memory operations
client.memory.on("write", (event) => {
  console.log("Memory written:", event.key, event.store);
});

client.memory.on("search", (event) => {
  console.log("Memory searched:", event.query, event.results.length);
});

client.memory.on("summarize", (event) => {
  console.log("Conversation summarized:", event.conversationId);
  console.log("Turns summarized:", event.turnCount);
  console.log("Tokens saved:", event.tokensSaved);
});
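Beyond logging, these events can feed simple metrics. The sketch below aggregates tokens saved per conversation from "summarize" events. The `SummarizeEvent` type is inferred only from the fields logged above and may not match the SDK's actual event type, and `SummaryStats` is a hypothetical helper.

```typescript
// Hypothetical event shape, based only on the fields logged in the example above.
type SummarizeEvent = { conversationId: string; turnCount: number; tokensSaved: number };

class SummaryStats {
  private saved = new Map<string, number>();

  // Arrow property so it can be passed directly as a handler,
  // e.g. client.memory.on("summarize", stats.record)
  record = (event: SummarizeEvent): void => {
    const current = this.saved.get(event.conversationId) ?? 0;
    this.saved.set(event.conversationId, current + event.tokensSaved);
  };

  tokensSavedFor(conversationId: string): number {
    return this.saved.get(conversationId) ?? 0;
  }

  totalTokensSaved(): number {
    let total = 0;
    this.saved.forEach((value) => {
      total += value;
    });
    return total;
  }
}

// Simulated events stand in for what the listener would receive.
const stats = new SummaryStats();
stats.record({ conversationId: "conv_abc123", turnCount: 10, tokensSaved: 1200 });
stats.record({ conversationId: "conv_abc123", turnCount: 5, tokensSaved: 300 });
stats.record({ conversationId: "conv_xyz789", turnCount: 8, tokensSaved: 500 });

console.log(stats.tokensSavedFor("conv_abc123")); // 1500
console.log(stats.totalTokensSaved()); // 2000
```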
Best Practices
- Use session memory for ephemeral state like extracted form fields and user preferences within a conversation.
- Use persistent memory for durable data like customer profiles and knowledge base articles.
- Set TTLs on persistent memory to automatically clean up stale data.
- Use vector memory for large knowledge bases with hundreds or thousands of documents.
- Monitor token usage to optimize context window allocation for your use case.