Agent Memory Management

Best practices for managing agent memory at scale. Consolidation, pruning, privacy, and cost optimization strategies.

Why Memory Management Matters

Without proper memory management, your agent memory system will face these issues:

💸 Cost Explosion

Storage and embedding costs grow unbounded as users accumulate memory over months/years.

🐌 Slow Retrieval

Too many memories means slower searches and irrelevant results mixed with relevant ones.

🔒 Privacy Risk

Retaining user data indefinitely without deletion policies violates GDPR and user trust.

🔄 Contradictions

User preferences change over time - old memories can contradict new preferences.

1. Memory Consolidation
Combine related memories to reduce noise and improve retrieval

Instead of storing 50 memories from one conversation, consolidate them into summaries.

// Consolidate memories from the same conversation
async function consolidateConversation(userId: string, conversationId: string) {
  // 1. Fetch all memories from conversation
  const memories = await fetchMemoriesByConversation(userId, conversationId)

  if (memories.length < 10) return // Not worth consolidating

  // 2. Use LLM to create summary
  const summary = await llm.generate({
    system: "Summarize these conversation memories into key takeaways.",
    user: memories.map(m => m.content).join('\n')
  })

  // 3. Store consolidated memory with high importance
  await storeMemory(userId, summary, {
    type: 'consolidated',
    importance: 8,
    sourceConversationId: conversationId,
    originalCount: memories.length
  })

  // 4. Delete the individual memories (or soft-delete and purge
  //    after the 30-day retention window noted below)
  await deleteMemories(memories.map(m => m.id))
}

// Run nightly for conversations older than 7 days.
// Use for...of, not forEach: forEach ignores async callbacks,
// so the awaits inside would never be waited on.
schedule.daily(async () => {
  const oldConversations = await getConversationsOlderThan(7, 'days')
  for (const conv of oldConversations) {
    await consolidateConversation(conv.userId, conv.id)
  }
})

Best Practice: Consolidate memories after conversations end or weekly for active users. Keep original memories for 30 days before deleting.

2. Memory Pruning
Remove old, low-value memories to keep the system performant

Delete memories that are old AND low-importance to prevent bloat.

// Pruning strategy based on age + importance
async function pruneMemories(userId: string) {
  const now = Date.now()

  // Define retention policies
  const policies = [
    {
      maxAge: 90 * 24 * 60 * 60 * 1000, // 90 days
      minImportance: 7 // Keep high-importance longer
    },
    {
      maxAge: 30 * 24 * 60 * 60 * 1000, // 30 days
      minImportance: 5 // Keep medium-importance
    },
    {
      maxAge: 7 * 24 * 60 * 60 * 1000,  // 7 days
      minImportance: 4 // Delete low-importance (1-3) quickly
    }
  ]

  for (const policy of policies) {
    await deleteMemories({
      userId,
      timestamp: { $lt: now - policy.maxAge },
      importance: { $lt: policy.minImportance }
    })
  }

  // Always keep consolidated memories longer
  // They're already compressed and valuable
}

// Run weekly (for...of so each prune is awaited; forEach would not wait)
schedule.weekly(async () => {
  const activeUsers = await getActiveUsers()
  for (const user of activeUsers) {
    await pruneMemories(user.id)
  }
})

Retention Guidelines:

  • High importance (7-10): keep 90+ days
  • Medium importance (4-6): keep 30 days
  • Low importance (1-3): keep 7 days

3. Privacy & Compliance
GDPR, data deletion, and user control

Legal Requirement: The GDPR's right to erasure (Article 17) requires you to delete user data on request without undue delay, and in any event within one month. Implement this from day one.

Required Features:

User Data Export

Let users download all their memories as JSON

Complete Data Deletion

Delete all memories, embeddings, and metadata for a user

Memory Transparency

Show users what memories the AI has stored about them

Selective Deletion

Let users delete individual memories they don't want stored

// Complete user data deletion
async function deleteAllUserData(userId: string) {
  // 1. Delete from vector database
  await vectorDB.delete({
    filter: { userId: { $eq: userId } }
  })

  // 2. Delete from relational database
  await db.memories.deleteMany({ userId })
  await db.conversations.deleteMany({ userId })

  // 3. Log for compliance audit trail
  await db.deletionLog.create({
    userId,
    timestamp: Date.now(),
    reason: 'user_request'
  })

  console.log(`Deleted all data for user: ${userId}`)
}
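
Data export can follow the same pattern. Below is a minimal sketch of the export payload builder; the `Memory` shape is an assumption (adjust field names to your schema), and fetching the rows is left to your existing storage layer:

```typescript
// Shape assumed for stored memories; adapt to your actual schema.
interface Memory {
  id: string
  content: string
  importance: number
  timestamp: number // epoch milliseconds
  type?: string
}

// Build a GDPR-style export payload from a user's memories.
// Fetching is intentionally out of scope: pass the rows in directly.
function buildUserExport(userId: string, memories: Memory[]) {
  return {
    userId,
    exportedAt: new Date().toISOString(),
    memoryCount: memories.length,
    memories: memories.map(m => ({
      id: m.id,
      content: m.content,
      importance: m.importance,
      createdAt: new Date(m.timestamp).toISOString(),
      type: m.type ?? 'raw'
    }))
  }
}

// Usage: res.send(JSON.stringify(buildUserExport(userId, rows), null, 2))
```

Returning ISO timestamps rather than raw epoch values keeps the export human-readable, which matters when users actually open the file.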
4. Cost Optimization
Reduce storage and embedding API costs

Storage Costs

  • Use cheaper storage tiers for old memories
  • Compress embeddings (quantization)
  • Archive cold memories to S3/GCS
  • Set per-user memory limits (e.g., max 10K memories)

Embedding Costs

  • Batch embed operations (100+ at once)
  • Don't re-embed unchanged content
  • Use smaller embedding models (e.g. text-embedding-3-small instead of text-embedding-3-large)
  • Cache embeddings for common queries
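
The batching and caching bullets above can be sketched together. `embedBatch` here is a hypothetical stand-in for your provider's batch embedding endpoint, not a real API:

```typescript
// One call embeds up to batchSize texts; the cache avoids re-embedding
// unchanged content. `embedBatch` is a placeholder for your provider's
// batch endpoint (one HTTP request per batch).
type Embedder = (texts: string[]) => Promise<number[][]>

async function embedWithCache(
  texts: string[],
  cache: Map<string, number[]>,
  embedBatch: Embedder,
  batchSize = 100
): Promise<number[][]> {
  // 1. Collect texts we have never embedded (deduplicated)
  const missing = [...new Set(texts.filter(t => !cache.has(t)))]

  // 2. Embed the missing texts in batches to cut per-request overhead
  for (let i = 0; i < missing.length; i += batchSize) {
    const batch = missing.slice(i, i + batchSize)
    const vectors = await embedBatch(batch)
    batch.forEach((t, j) => cache.set(t, vectors[j]))
  }

  // 3. Serve every input from the cache, preserving input order
  return texts.map(t => cache.get(t)!)
}
```

In production you would key the cache on a content hash rather than the raw text, and persist it, but the control flow is the same.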

Cost Benchmark: Expect $0.01-0.10 per user per month for memory storage + embeddings at moderate usage. Optimize aggressively for high-volume apps.

5. Handling Contradictions
User preferences change - prioritize recent over old

What if a user says "I love Python" in January but "I prefer TypeScript now" in June?

Strategy: Recency Weighting

Always bias toward recent memories when contradictions exist. When retrieving preferences, sort by timestamp DESC.

Strategy: Explicit Updates

When user states a new preference, search for contradicting memories and mark them as "superseded" or delete them.

Strategy: Temporal Context

Include timestamps in memory text: "As of June 2025, user prefers TypeScript" so the AI knows it's current.
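
The recency and explicit-update strategies can be combined in a single pass. This sketch assumes memories carry a `topic` tag (a hypothetical field, not part of the schema above) so that contradicting memories can be grouped:

```typescript
// `topic` is an assumed tagging field, e.g. 'preferred_language'.
interface PrefMemory {
  id: string
  topic: string
  content: string
  timestamp: number // epoch milliseconds
}

// For each topic, the newest memory wins; older memories on the same
// topic are collected so they can be marked "superseded" or deleted.
function resolveContradictions(memories: PrefMemory[]) {
  const current = new Map<string, PrefMemory>()
  const superseded: string[] = []

  // Sort a copy newest-first so the first memory seen per topic wins
  for (const m of [...memories].sort((a, b) => b.timestamp - a.timestamp)) {
    if (current.has(m.topic)) {
      superseded.push(m.id)    // older memory on an already-seen topic
    } else {
      current.set(m.topic, m)  // newest memory for this topic
    }
  }
  return { current: [...current.values()], superseded }
}
```

Running this at write time (when a new preference arrives) implements the "explicit updates" strategy; running it at read time implements recency weighting.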

Production Memory Management Checklist