Agent Memory Management
Best practices for managing agent memory at scale. Consolidation, pruning, privacy, and cost optimization strategies.
Without proper memory management, your agent memory system will face these issues:
💸 Cost Explosion
Storage and embedding costs grow unbounded as users accumulate memory over months/years.
🐌 Slow Retrieval
Too many memories means slower searches and irrelevant results mixed with relevant ones.
🔒 Privacy Risk
Retaining user data indefinitely without deletion policies can violate GDPR and erodes user trust.
🔄 Contradictions
User preferences change over time, so old memories can contradict new ones.
Instead of storing 50 memories from one conversation, consolidate them into summaries.
// Consolidate memories from the same conversation
async function consolidateConversation(userId: string, conversationId: string) {
  // 1. Fetch all memories from the conversation
  const memories = await fetchMemoriesByConversation(userId, conversationId)
  if (memories.length < 10) return // Not worth consolidating

  // 2. Use an LLM to create a summary
  const summary = await llm.generate({
    system: "Summarize these conversation memories into key takeaways.",
    user: memories.map(m => m.content).join('\n')
  })

  // 3. Store the consolidated memory with high importance
  await storeMemory(userId, summary, {
    type: 'consolidated',
    importance: 8,
    sourceConversationId: conversationId,
    originalCount: memories.length
  })

  // 4. Delete the individual memories
  // (to retain originals for a grace period, defer this step instead)
  await deleteMemories(memories.map(m => m.id))
}
// Run nightly for conversations older than 7 days
schedule.daily(async () => {
  const oldConversations = await getConversationsOlderThan(7, 'days')
  // Await each consolidation so failures surface and load stays bounded
  for (const conv of oldConversations) {
    await consolidateConversation(conv.userId, conv.id)
  }
})
Best Practice: Consolidate memories after conversations end, or weekly for active users. Keep the original memories for 30 days before deleting them.
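One way to honor the 30-day retention of originals is to mark them as consolidated and delete them only after the grace period has passed. The following is a minimal in-memory sketch; the `MemoryRecord` shape, the `consolidatedAt` field, and both helper names are assumptions for illustration, not part of any specific store's API.

```typescript
// Hypothetical record shape; `consolidatedAt` is an assumed field that is set
// once a memory has been folded into a summary.
interface MemoryRecord {
  id: string
  content: string
  consolidatedAt?: number // epoch ms; undefined means not yet consolidated
}

const GRACE_PERIOD_MS = 30 * 24 * 60 * 60 * 1000 // 30 days

// Mark originals as consolidated instead of deleting them right away.
// Memories that were already marked keep their earlier timestamp.
function markConsolidated(memories: MemoryRecord[], now: number): MemoryRecord[] {
  return memories.map(m => ({
    id: m.id,
    content: m.content,
    consolidatedAt: m.consolidatedAt ?? now
  }))
}

// Return the ids of originals whose grace period has elapsed.
function selectExpiredOriginals(memories: MemoryRecord[], now: number): string[] {
  return memories
    .filter(m => m.consolidatedAt !== undefined && now - m.consolidatedAt >= GRACE_PERIOD_MS)
    .map(m => m.id)
}
```

A nightly job could pass the result of `selectExpiredOriginals` to whatever deletion routine the store provides.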
Delete memories that are old AND low-importance to prevent bloat.
// Pruning strategy based on age + importance
async function pruneMemories(userId: string) {
  const now = Date.now()
  const DAY_MS = 24 * 60 * 60 * 1000

  // Retention policies: once a memory is older than maxAge, it is deleted
  // unless its importance is at least minImportance
  const policies = [
    {
      maxAge: 7 * DAY_MS, // 7 days
      minImportance: 5 // Delete low-importance (below 5) quickly
    },
    {
      maxAge: 30 * DAY_MS, // 30 days
      minImportance: 7 // Delete medium-importance (5-6) next
    }
    // High-importance memories (7+) are kept 90+ days and are not
    // pruned automatically here
  ]

  for (const policy of policies) {
    await deleteMemories({
      userId,
      timestamp: { $lt: now - policy.maxAge },
      importance: { $lt: policy.minImportance }
    })
  }

  // Always keep consolidated memories longer
  // They're already compressed and valuable
}
// Run weekly
schedule.weekly(async () => {
  const activeUsers = await getActiveUsers()
  for (const user of activeUsers) {
    await pruneMemories(user.id)
  }
})
Retention Guidelines:
High importance (7+): Keep 90+ days
Medium importance (5-6): Keep 30 days
Low importance (below 5): Keep 7 days
Legal Requirement: GDPR requires you to erase user data on request without undue delay, and to respond within one month in most cases. Implement this from day one.
Required Features:
User Data Export
Let users download all their memories as JSON
Complete Data Deletion
Delete all memories, embeddings, and metadata for a user
Memory Transparency
Show users what memories the AI has stored about them
Selective Deletion
Let users delete individual memories they don't want stored
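As an illustration of the export and selective-deletion features listed above, here is a minimal sketch over an in-memory array. The `StoredMemory` shape and both function names are hypothetical; a real implementation would query the database and also remove the matching embeddings.

```typescript
// Hypothetical shape of a stored memory.
interface StoredMemory {
  id: string
  userId: string
  content: string
  createdAt: string // ISO 8601 timestamp
}

// User Data Export: serialize everything stored for one user as JSON.
function exportUserMemories(store: StoredMemory[], userId: string): string {
  const memories = store.filter(m => m.userId === userId)
  return JSON.stringify({ userId, exportedCount: memories.length, memories }, null, 2)
}

// Selective Deletion: remove one memory, but only if it belongs to the requester.
function deleteMemoryForUser(
  store: StoredMemory[],
  userId: string,
  memoryId: string
): StoredMemory[] {
  return store.filter(m => !(m.userId === userId && m.id === memoryId))
}
```

The ownership check in `deleteMemoryForUser` matters: without it, one user could delete another user's memory by guessing an id.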
// Complete user data deletion
async function deleteAllUserData(userId: string) {
  // 1. Delete from vector database
  await vectorDB.delete({
    filter: { userId: { $eq: userId } }
  })

  // 2. Delete from relational database
  await db.memories.deleteMany({ userId })
  await db.conversations.deleteMany({ userId })

  // 3. Log for compliance audit trail
  await db.deletionLog.create({
    userId,
    timestamp: Date.now(),
    reason: 'user_request'
  })

  console.log(`Deleted all data for user: ${userId}`)
}
Storage Costs
- ✓ Use cheaper storage tiers for old memories
- ✓ Compress embeddings (quantization)
- ✓ Archive cold memories to S3/GCS
- ✓ Set per-user memory limits (e.g., max 10K memories)
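To make the quantization item concrete, the sketch below applies simple symmetric int8 scalar quantization, which stores one byte per dimension instead of the four bytes of a float32. This is one common scheme among several, shown here as an assumption; many vector databases offer built-in quantization that should be preferred where available. The type and function names are illustrative.

```typescript
// Quantized form: int8 values plus the scale needed to approximate the originals.
interface QuantizedEmbedding {
  values: Int8Array
  scale: number // multiply each int8 value by this to recover an approximation
}

// Map each component into the range [-127, 127] relative to the largest magnitude.
function quantize(embedding: number[]): QuantizedEmbedding {
  let maxAbs = 0
  for (const v of embedding) maxAbs = Math.max(maxAbs, Math.abs(v))
  const scale = maxAbs === 0 ? 1 : maxAbs / 127 // avoid dividing by zero
  const values = new Int8Array(embedding.length)
  embedding.forEach((v, i) => {
    values[i] = Math.round(v / scale)
  })
  return { values, scale }
}

// Recover approximate float values; the error per component is at most scale / 2.
function dequantize(q: QuantizedEmbedding): number[] {
  return Array.from(q.values, v => v * q.scale)
}
```

Quantization trades a small loss in similarity precision for storage, so it is worth checking retrieval quality on real queries before adopting it.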
Embedding Costs
- ✓ Batch embed operations (100+ at once)
- ✓ Don't re-embed unchanged content
- ✓ Use smaller embedding models (e.g., text-embedding-3-small instead of text-embedding-3-large)
- ✓ Cache embeddings for common queries
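The batching and caching items can be combined in a small planning step that runs before any embedding API call. The sketch below assumes a hypothetical in-memory cache keyed by the exact content string; a production cache would more likely key on a content hash and live in a shared store.

```typescript
// Decide which texts still need embedding and group them into batches.
// `cache` maps content to a previously computed embedding (hypothetical cache).
function planEmbeddingBatches(
  texts: string[],
  cache: Map<string, number[]>,
  batchSize: number = 100
): string[][] {
  // Skip content that is already embedded, then de-duplicate the rest.
  const pending = Array.from(new Set(texts.filter(t => !cache.has(t))))

  // Slice the remaining texts into batches of at most `batchSize`.
  const batches: string[][] = []
  for (let i = 0; i < pending.length; i += batchSize) {
    batches.push(pending.slice(i, i + batchSize))
  }
  return batches
}
```

Each returned batch would then be sent as a single request to the embedding provider, and the results written back into the cache.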
Cost Benchmark: Expect $0.01-0.10 per user per month for memory storage + embeddings at moderate usage. Optimize aggressively for high-volume apps.
What if a user says "I love Python" in January but "I prefer TypeScript now" in June?
Strategy: Recency Weighting
Always bias toward recent memories when contradictions exist. When retrieving preferences, sort by timestamp DESC.
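Beyond a plain sort by timestamp, recency can be blended with semantic similarity so that a newer memory outranks an older one of similar relevance. The sketch below uses exponential decay with a 90-day half-life; the half-life, the `ScoredMemory` shape, and the function name are assumptions chosen for illustration.

```typescript
// Hypothetical shape of a retrieval candidate.
interface ScoredMemory {
  content: string
  timestamp: number // epoch ms when the memory was recorded
  similarity: number // 0..1, as returned by vector search
}

// Assumed tuning value: a memory's weight halves every 90 days.
const HALF_LIFE_MS = 90 * 24 * 60 * 60 * 1000

// Rank candidates by similarity multiplied by an exponential recency decay.
function rankWithRecency(memories: ScoredMemory[], now: number): ScoredMemory[] {
  const score = (m: ScoredMemory): number => {
    const age = Math.max(0, now - m.timestamp)
    return m.similarity * Math.pow(0.5, age / HALF_LIFE_MS)
  }
  // Copy before sorting so the caller's array is left untouched.
  return memories.slice().sort((a, b) => score(b) - score(a))
}
```

With this weighting, the June "I prefer TypeScript now" memory ranks above the January "I love Python" memory even if the older one has a slightly higher raw similarity.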
Strategy: Explicit Updates
When a user states a new preference, search for contradicting memories and mark them as "superseded" or delete them.
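A minimal sketch of the "superseded" approach follows. It assumes each preference memory carries a hypothetical `topic` key so that contradictions can be found by exact match; in practice, detecting a contradiction usually needs semantic search or an LLM judgment, which is not shown here.

```typescript
// Hypothetical shape; `topic` stands in for real contradiction detection.
interface PreferenceMemory {
  id: string
  topic: string // e.g. 'preferred_language'
  content: string
  timestamp: number
  supersededBy?: string // id of the memory that replaced this one
}

// Store a new preference and mark older active memories on the same topic as superseded.
function recordPreference(
  existing: PreferenceMemory[],
  incoming: PreferenceMemory
): PreferenceMemory[] {
  const updated = existing.map(m =>
    m.topic === incoming.topic && m.supersededBy === undefined
      ? { ...m, supersededBy: incoming.id }
      : m
  )
  return updated.concat([incoming])
}

// Only memories that have not been superseded should reach the prompt.
function activePreferences(memories: PreferenceMemory[]): PreferenceMemory[] {
  return memories.filter(m => m.supersededBy === undefined)
}
```

Marking instead of deleting keeps a history of how the preference changed, which can be useful for debugging and for user-facing transparency views.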
Strategy: Temporal Context
Include timestamps in memory text: "As of June 2025, user prefers TypeScript" so the AI knows it's current.
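The temporal prefix can be generated when a memory is written. This small sketch formats the month and year in UTC; the function name is illustrative.

```typescript
const MONTHS = [
  'January', 'February', 'March', 'April', 'May', 'June',
  'July', 'August', 'September', 'October', 'November', 'December'
]

// Prefix memory text with the month and year (UTC) in which it was recorded.
function withTemporalContext(content: string, recordedAt: Date): string {
  const month = MONTHS[recordedAt.getUTCMonth()]
  return `As of ${month} ${recordedAt.getUTCFullYear()}, ${content}`
}

// Example: produces "As of June 2025, user prefers TypeScript"
const example = withTemporalContext('user prefers TypeScript', new Date(Date.UTC(2025, 5, 15)))
```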