Memory Guide
This guide explains how to implement long-term memory for agents using the semantic memory provided by the SDK's Memory module.
Overview
Semantic memory is a vector search-based memory system backed by Mem0 + Qdrant. It remembers user knowledge, preferences, and facts, and recalls relevant memories through similarity search.
Conversations with users
        |
        v
Semantic Memory (Mem0 + Qdrant)
    "User is proficient in Python"
    "Project X deadline is March"
    "Last time requested CSV output"
        |
        v
Injected into LLM context -> More appropriate responses
Note: the SemanticMemoryClient methods add / search / get_all / delete / delete_all are all synchronous, so call them without await. The only exception is cleanup(), which is asynchronous and must be awaited.
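Because the client methods are synchronous, calling them directly inside an async agent blocks the event loop for the duration of the call. One common way around this, sketched below, is to push the call to a worker thread with `asyncio.to_thread`. `FakeMemoryClient` is a hypothetical stand-in so the example runs without Mem0 or Qdrant, and whether the real client is safe to call from worker threads is an assumption you should verify.

```python
import asyncio


class FakeMemoryClient:
    """Hypothetical stand-in for SemanticMemoryClient (sync methods, async cleanup)."""

    def __init__(self) -> None:
        self.closed = False

    def search(self, query: str, user_id: str, limit: int = 5) -> dict:
        # In the real client this is a blocking call to the vector store.
        return {"results": [{"memory": "User is proficient in Python"}][:limit]}

    async def cleanup(self) -> None:
        self.closed = True


async def search_without_blocking(client, query: str, user_id: str) -> list[str]:
    # asyncio.to_thread runs the synchronous call in a worker thread,
    # so the event loop can keep serving other tasks in the meantime.
    result = await asyncio.to_thread(client.search, query=query, user_id=user_id, limit=5)
    return [r["memory"] for r in result.get("results", [])]


async def main() -> list[str]:
    client = FakeMemoryClient()
    try:
        return await search_without_blocking(client, "skills", "user-001")
    finally:
        await client.cleanup()  # the one method that must be awaited


memories = asyncio.run(main())
```

The `try`/`finally` keeps `cleanup()` on the shutdown path even when a search raises.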
Configuration
Configure the LLM, embedder, and vector store under the [memory] section of config.toml.
[memory.llm]
model = "azure_openai/gpt-4.1"
api_key = "${AZURE_OPENAI_API_KEY}"
base_url = "https://your-endpoint.openai.azure.com/"
api_version = "2024-12-01-preview"
[memory.embedder]
model = "azure_openai/text-embedding-3-large"
api_key = "${AZURE_OPENAI_API_KEY}"
base_url = "https://your-endpoint.openai.azure.com/"
api_version = "2024-12-01-preview"
[memory.vector_store]
provider = "qdrant"
[memory.vector_store.config]
url = "http://qdrant.internal:6333"
collection_name = "semantic_memory"
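The `${AZURE_OPENAI_API_KEY}` values above look like environment-variable placeholders. Assuming the SDK substitutes them from the environment at load time (this guide implies it but does not state it), a small pre-flight check can report unset variables before the client is created. The helper name `missing_env_vars` is hypothetical and not part of the SDK.

```python
import os
import re

# Matches ${NAME} placeholders such as ${AZURE_OPENAI_API_KEY}.
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def missing_env_vars(config_text: str, env=None) -> list[str]:
    """Return placeholder names in config_text that are unset or empty in env."""
    env = os.environ if env is None else env
    names = sorted(set(PLACEHOLDER.findall(config_text)))
    return [name for name in names if not env.get(name)]


sample = """
[memory.llm]
api_key = "${AZURE_OPENAI_API_KEY}"
[memory.embedder]
api_key = "${AZURE_OPENAI_API_KEY}"
"""

# With an empty environment the placeholder is reported once.
print(missing_env_vars(sample, env={}))
# With the variable set, nothing is missing.
print(missing_env_vars(sample, env={"AZURE_OPENAI_API_KEY": "secret"}))
```

In practice you would pass the contents of config.toml instead of `sample` and fail fast if the returned list is non-empty.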
Basic Operations
from agenticstar_platform.memory import SemanticMemoryClient, SemanticMemoryConfig
# Initialize
config = SemanticMemoryConfig.from_toml("config.toml")
memory = SemanticMemoryClient(config)
# Add memory (messages is a list of dicts in OpenAI format)
result = memory.add(
    messages=[
        {"role": "user", "content": "I'm proficient in Python and TypeScript"},
        {"role": "assistant", "content": "Understood. I'll record your skill set."},
    ],
    user_id="user-001",
    metadata={"source": "self_introduction"},
)
# Search memory (similarity-based)
result = memory.search(
    query="What is this user's skill set?",
    user_id="user-001",
    limit=5,
)
for r in result.get("results", []):
    print(r["memory"])
# Get all memories
result = memory.get_all(user_id="user-001")
# Delete a memory
result = memory.delete(memory_id="mem-xxx")
# Delete all memories for a user
result = memory.delete_all(user_id="user-001")
# Async cleanup (call on shutdown; await is only valid inside an async function)
await memory.cleanup()
LLM Provider Selection
Specify the provider using the provider/model_name format in the model field of LLMProviderConfig.
The three [memory.llm] tables below are alternatives. A TOML file may define a table only once, so keep only the one for your provider.
Azure OpenAI:
[memory.llm]
model = "azure_openai/gpt-4.1"
api_key = "${AZURE_OPENAI_API_KEY}"
base_url = "https://your-endpoint.openai.azure.com/"
api_version = "2024-12-01-preview"
Anthropic:
[memory.llm]
model = "anthropic/claude-sonnet-4-6"
api_key = "${ANTHROPIC_API_KEY}"
AWS Bedrock:
[memory.llm]
model = "aws_bedrock/anthropic.claude-sonnet-4-6-20250514-v1:0"
aws_region_name = "us-east-1"
aws_access_key_id = "${AWS_ACCESS_KEY_ID}"
aws_secret_access_key = "${AWS_SECRET_ACCESS_KEY}"
Usage Patterns in Agents
Pattern 1: Context Enhancement
Retrieve memories relevant to the current query and add them to the prompt before calling the LLM.
def build_context(user_id: str, query: str) -> str:
    """Build context from memory"""
    result = memory.search(query=query, user_id=user_id, limit=5)
    context = "## About This User\n"
    for r in result.get("results", []):
        context += f"- {r['memory']}\n"
    return context
# Inject context when calling LLM
context = build_context("user-001", user_message)
messages = [
    {"role": "system", "content": f"Please refer to the following user information:\n{context}"},
    {"role": "user", "content": user_message},
]
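The pattern above always emits a system message, even when the search returns nothing, and it puts no bound on how much memory text enters the prompt. The variant below is a sketch of one way to handle both cases. `FakeMemoryClient`, `build_messages`, and the `max_chars` budget are hypothetical names introduced for illustration; only the `{"results": [{"memory": ...}]}` result shape is taken from this guide.

```python
class FakeMemoryClient:
    """Hypothetical stand-in returning results in the {"results": [...]} shape."""

    def __init__(self, memories: list[str]) -> None:
        self._memories = memories

    def search(self, query: str, user_id: str, limit: int = 5) -> dict:
        return {"results": [{"memory": m} for m in self._memories[:limit]]}


def build_messages(client, user_id: str, user_message: str, max_chars: int = 1000) -> list[dict]:
    """Build chat messages, adding a system message only when memories exist."""
    result = client.search(query=user_message, user_id=user_id, limit=5)
    lines: list[str] = []
    used = 0
    for r in result.get("results", []):
        line = f"- {r['memory']}"
        if used + len(line) > max_chars:  # stop before the budget is exceeded
            break
        lines.append(line)
        used += len(line)

    messages: list[dict] = []
    if lines:
        context = "## About This User\n" + "\n".join(lines)
        messages.append(
            {"role": "system", "content": f"Please refer to the following user information:\n{context}"}
        )
    messages.append({"role": "user", "content": user_message})
    return messages


with_memory = build_messages(
    FakeMemoryClient(["User is proficient in Python"]), "user-001", "Help me parse a CSV"
)
without_memory = build_messages(FakeMemoryClient([]), "user-002", "Help me parse a CSV")
```

A character budget is a rough proxy for tokens; swap in a tokenizer-based count if your prompt limits are tight.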
Pattern 2: Automatic Post-Conversation Memory
Save important information to memory after a conversation ends.
def save_conversation_memory(user_id: str, conversation: list[dict]) -> None:
    """Save conversation to memory"""
    memory.add(
        messages=conversation,  # [{"role": "user", ...}, {"role": "assistant", ...}]
        user_id=user_id,
        metadata={"source": "conversation"},
    )
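A raw conversation log often contains system prompts, tool messages, or empty turns that are not worth remembering. The sketch below filters the log before saving and skips the call when no user turn remains. `RecordingMemoryClient` and `prepare_messages` are hypothetical helpers for illustration; how the real `add()` treats other roles is not covered in this guide.

```python
from typing import Optional


class RecordingMemoryClient:
    """Hypothetical stand-in that records add() calls instead of contacting Mem0."""

    def __init__(self) -> None:
        self.calls: list[dict] = []

    def add(self, messages: list[dict], user_id: str, metadata: Optional[dict] = None) -> dict:
        self.calls.append({"messages": messages, "user_id": user_id, "metadata": metadata})
        return {"results": []}


def prepare_messages(conversation: list[dict]) -> list[dict]:
    """Keep only user/assistant turns that have non-empty text content."""
    kept = []
    for msg in conversation:
        content = msg.get("content")
        if msg.get("role") in ("user", "assistant") and isinstance(content, str) and content.strip():
            kept.append({"role": msg["role"], "content": content.strip()})
    return kept


def save_conversation_memory(client, user_id: str, conversation: list[dict]) -> bool:
    """Save the filtered conversation; return False when nothing was worth saving."""
    messages = prepare_messages(conversation)
    if not any(m["role"] == "user" for m in messages):
        return False
    client.add(messages=messages, user_id=user_id, metadata={"source": "conversation"})
    return True


client = RecordingMemoryClient()
saved = save_conversation_memory(
    client,
    "user-001",
    [
        {"role": "system", "content": "You are a helpful agent."},
        {"role": "user", "content": "  Please export reports as CSV.  "},
        {"role": "assistant", "content": ""},
    ],
)
```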
Pattern 3: Per-User Personalization
Search for the user's stored preferences and use them to adjust a draft response.
def personalize_response(user_id: str, base_response: str) -> str:
    """Adjust response based on user preferences"""
    result = memory.search(
        query="User preferences and requests",
        user_id=user_id,
        limit=3,
    )
    memories = result.get("results", [])
    if not memories:
        return base_response
    adjustment_prompt = "Please adjust the response considering the following user preferences:\n"
    for m in memories:
        adjustment_prompt += f"- {m['memory']}\n"
    # adjust_with_llm is not defined in this guide; it stands for an application
    # function that sends the draft and the adjustment prompt to your LLM
    return adjust_with_llm(base_response, adjustment_prompt)
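The `adjust_with_llm` helper is left undefined in the pattern above. One possible shape for it, sketched here, takes the LLM call as an injected function so the logic can be tested without a provider. `make_adjuster`, `echo_llm`, and `failing_llm` are hypothetical names, and the prompt wording is only an example.

```python
from typing import Callable


def make_adjuster(call_llm: Callable[[list], str]) -> Callable[[str, str], str]:
    """Return an adjust_with_llm function bound to the given LLM callable."""

    def adjust_with_llm(base_response: str, adjustment_prompt: str) -> str:
        messages = [
            {"role": "system", "content": adjustment_prompt},
            {"role": "user", "content": f"Rewrite this response accordingly:\n{base_response}"},
        ]
        try:
            adjusted = call_llm(messages)
        except Exception:
            # Personalization is optional, so fall back to the unadjusted response.
            return base_response
        return adjusted.strip() or base_response

    return adjust_with_llm


def echo_llm(messages: list) -> str:
    """Hypothetical LLM stub: returns the draft with a marker prefix."""
    draft = messages[1]["content"].split("\n", 1)[1]
    return f"[adjusted] {draft}"


def failing_llm(messages: list) -> str:
    """Hypothetical LLM stub that simulates a provider outage."""
    raise RuntimeError("LLM unavailable")


adjust_with_llm = make_adjuster(echo_llm)
adjusted = adjust_with_llm("Here is the report.", "Preferences:\n- Prefers CSV output\n")
fallback = make_adjuster(failing_llm)("Here is the report.", "Preferences:\n")
```

Falling back to the base response on failure keeps a memory or LLM outage from blocking the reply itself.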
Utility Functions
The SDK provides utility functions for memory configuration conversion.
from agenticstar_platform.memory import (
    normalize_provider,
    parse_model_string,
    convert_llm_to_mem0,
    convert_embedder_to_mem0,
)
# Normalize provider name
provider = normalize_provider("Azure_OpenAI") # -> "azure_openai"
# Parse model string
provider, model = parse_model_string("azure_openai/gpt-4.1")
# -> ("azure_openai", "gpt-4.1")
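# Convert the LLM and embedder provider configs to Mem0 format
# (llm_provider_config and embedder_provider_config are your already-loaded config objects)
mem0_llm_config = convert_llm_to_mem0(llm_provider_config)
mem0_embedder_config = convert_embedder_to_mem0(embedder_provider_config)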
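To make the provider/model_name convention concrete, here is a minimal re-implementation that is consistent with the two documented results above. It is not the SDK's code: the `_sketch` functions are hypothetical, and the real utilities may handle aliases or validation differently. Splitting at the first slash only is an assumption, chosen so that model names containing dots or colons pass through unchanged.

```python
def normalize_provider_sketch(name: str) -> str:
    """Lowercase and trim a provider name, e.g. "Azure_OpenAI" -> "azure_openai"."""
    return name.strip().lower()


def parse_model_string_sketch(model: str) -> tuple:
    """Split "provider/model_name" at the first slash only."""
    provider, sep, model_name = model.partition("/")
    if not sep or not provider.strip() or not model_name.strip():
        raise ValueError(f"expected 'provider/model_name', got {model!r}")
    return normalize_provider_sketch(provider), model_name.strip()


azure = parse_model_string_sketch("azure_openai/gpt-4.1")
bedrock = parse_model_string_sketch("aws_bedrock/anthropic.claude-sonnet-4-6-20250514-v1:0")
```

A string without a provider prefix, such as `"gpt-4.1"`, raises `ValueError` in this sketch instead of guessing a default provider.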
Next Steps
- SDK API Reference (Memory Module): complete specifications for SemanticMemoryClient
- Security Guide: content validation before saving to memory