
Memory Guide

This guide explains how to implement long-term memory for agents using the semantic memory provided by the SDK's Memory module.

Overview

Semantic memory is a vector search-based memory system backed by Mem0 + Qdrant. It remembers user knowledge, preferences, and facts, and recalls relevant memories through similarity search.

Conversations with users
|
v
Semantic Memory (Mem0 + Qdrant)
"User is proficient in Python"
"Project X deadline is March"
"Last time requested CSV output"
|
v
Injected into LLM context -> More appropriate responses

All SemanticMemoryClient methods (add / search / get_all / delete / delete_all) are synchronous and are called without await. Only cleanup() is asynchronous and must be awaited.
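
Because these methods are synchronous, calling them directly inside a coroutine blocks the event loop for the duration of the call. One possible way around this, shown as a minimal sketch, is to run the call in a worker thread with asyncio.to_thread. StubMemoryClient is a hypothetical stand-in so the example runs without the SDK; this guide does not state whether the real client is safe to call from worker threads, so treat that as an assumption to verify.

```python
import asyncio


class StubMemoryClient:
    """Hypothetical stand-in for SemanticMemoryClient (synchronous search)."""

    def search(self, query: str, user_id: str, limit: int = 5) -> dict:
        # A real client would perform blocking embedding and vector-search work here.
        return {"results": [{"memory": "User is proficient in Python"}][:limit]}


async def search_without_blocking(client, query: str, user_id: str) -> list[str]:
    # asyncio.to_thread runs the blocking call in a worker thread, so the
    # event loop stays free to serve other coroutines in the meantime.
    result = await asyncio.to_thread(
        client.search, query=query, user_id=user_id, limit=5
    )
    return [r["memory"] for r in result.get("results", [])]


memories = asyncio.run(
    search_without_blocking(StubMemoryClient(), "skills", "user-001")
)
print(memories)
```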

Configuration

Configure the LLM, the embedder, and the vector store under the [memory] section of config.toml.

Toml (config.toml)
[memory.llm]
model = "azure_openai/gpt-4.1"
api_key = "${AZURE_OPENAI_API_KEY}"
base_url = "https://your-endpoint.openai.azure.com/"
api_version = "2024-12-01-preview"

[memory.embedder]
model = "azure_openai/text-embedding-3-large"
api_key = "${AZURE_OPENAI_API_KEY}"
base_url = "https://your-endpoint.openai.azure.com/"
api_version = "2024-12-01-preview"

[memory.vector_store]
provider = "qdrant"

[memory.vector_store.config]
url = "http://qdrant.internal:6333"
collection_name = "semantic_memory"
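
The ${AZURE_OPENAI_API_KEY} values above look like environment-variable placeholders. How the SDK resolves them is not described in this guide; assuming they are read from the process environment, a small pre-flight check can catch unset variables before the client is created. find_missing_env_vars is a hypothetical helper, not part of the SDK.

```python
import re

# Matches ${NAME} placeholders such as ${AZURE_OPENAI_API_KEY}.
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def find_missing_env_vars(config_text: str, env: dict[str, str]) -> list[str]:
    """Return placeholder names in config_text that are unset or empty in env."""
    names = sorted(set(PLACEHOLDER.findall(config_text)))
    return [name for name in names if not env.get(name)]


sample = '''
[memory.llm]
api_key = "${AZURE_OPENAI_API_KEY}"
[memory.embedder]
api_key = "${AZURE_OPENAI_API_KEY}"
'''

missing_when_unset = find_missing_env_vars(sample, {})
missing_when_set = find_missing_env_vars(sample, {"AZURE_OPENAI_API_KEY": "secret"})
print(missing_when_unset)
print(missing_when_set)
```

In practice you would pass the contents of config.toml and dict(os.environ) instead of the sample values.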

Basic Operations

Python
from agenticstar_platform.memory import SemanticMemoryClient, SemanticMemoryConfig

# Initialize
config = SemanticMemoryConfig.from_toml("config.toml")
memory = SemanticMemoryClient(config)

# Add memory (messages is a list of dicts in OpenAI format)
result = memory.add(
    messages=[
        {"role": "user", "content": "I'm proficient in Python and TypeScript"},
        {"role": "assistant", "content": "Understood. I'll record your skill set."},
    ],
    user_id="user-001",
    metadata={"source": "self_introduction"},
)

# Search memory (similarity-based)
result = memory.search(
    query="What is this user's skill set?",
    user_id="user-001",
    limit=5,
)
for r in result.get("results", []):
    print(r["memory"])

# Get all memories
result = memory.get_all(user_id="user-001")

# Delete a memory
result = memory.delete(memory_id="mem-xxx")

# Delete all memories for a user
result = memory.delete_all(user_id="user-001")

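# Async cleanup (call on shutdown). cleanup() must be awaited, so this line
# belongs inside an async function; from synchronous code, use
# asyncio.run(memory.cleanup()) instead.
await memory.cleanup()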
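
To make sure cleanup() runs even when the agent fails partway through, the calls can be wrapped in try/finally inside a coroutine. The following is a sketch under the assumption that cleanup() is an ordinary coroutine method; StubMemoryClient is a hypothetical stand-in used only so the example runs without the SDK.

```python
import asyncio


class StubMemoryClient:
    """Hypothetical stand-in: synchronous operations, asynchronous cleanup()."""

    def __init__(self) -> None:
        self.closed = False

    def get_all(self, user_id: str) -> dict:
        return {"results": []}

    async def cleanup(self) -> None:
        self.closed = True


async def run_agent(client) -> int:
    try:
        # Synchronous call: no await.
        result = client.get_all(user_id="user-001")
        return len(result.get("results", []))
    finally:
        # Runs even if the work above raises, so resources are released.
        await client.cleanup()


client = StubMemoryClient()
count = asyncio.run(run_agent(client))
print(count, client.closed)
```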

LLM Provider Selection

Set the model field of LLMProviderConfig to a string in provider/model_name format. The part before the slash names the provider and the part after it names the model.

Toml
[memory.llm]
model = "azure_openai/gpt-4.1"
api_key = "${AZURE_OPENAI_API_KEY}"
base_url = "https://your-endpoint.openai.azure.com/"
api_version = "2024-12-01-preview"

Usage Patterns in Agents

Pattern 1: Context Enhancement

Retrieve relevant memories and add them to the prompt before calling the LLM.

Python
def build_context(user_id: str, query: str) -> str:
    """Build context from memory"""
    result = memory.search(query=query, user_id=user_id, limit=5)

    context = "## About This User\n"
    for r in result.get("results", []):
        context += f"- {r['memory']}\n"

    return context

# Inject context when calling LLM
# (user_message is the incoming user message as a string)
context = build_context("user-001", user_message)
messages = [
    {"role": "system", "content": f"Please refer to the following user information:\n{context}"},
    {"role": "user", "content": user_message},
]
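
The context builder above returns a header even when no memories are found, and it places no limit on how much text is injected. As one possible refinement, the sketch below skips duplicate entries, stops at a character budget, and returns an empty string when nothing is found. format_memories and max_chars are hypothetical names; the only assumption about the search results is that each item has a "memory" key, as in the examples above.

```python
def format_memories(results: list[dict], max_chars: int = 500) -> str:
    """Render memory results as a bulleted block, skipping duplicates and capping length."""
    lines: list[str] = []
    seen: set[str] = set()
    used = 0
    for r in results:
        text = str(r.get("memory", "")).strip()
        if not text or text in seen:
            continue
        line = f"- {text}"
        if used + len(line) + 1 > max_chars:
            break  # stop once the character budget is exhausted
        lines.append(line)
        seen.add(text)
        used += len(line) + 1
    if not lines:
        return ""  # the caller can then omit the system message entirely
    return "## About This User\n" + "\n".join(lines) + "\n"


sample = [
    {"memory": "User is proficient in Python"},
    {"memory": "User is proficient in Python"},
    {"memory": "Last time requested CSV output"},
]
context_block = format_memories(sample)
print(context_block)
```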

Pattern 2: Automatic Post-Conversation Memory

Save important information to memory after a conversation ends.

Python
def save_conversation_memory(user_id: str, conversation: list[dict]) -> None:
    """Save conversation to memory"""
    memory.add(
        messages=conversation,  # [{"role": "user", ...}, {"role": "assistant", ...}]
        user_id=user_id,
        metadata={"source": "conversation"},
    )
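
Before passing a conversation to memory.add, it can help to drop turns that should not become memories, such as system prompts or empty messages. The sketch below is one possible pre-filter. prepare_for_memory, ALLOWED_ROLES, and the max_chars limit are hypothetical choices made for illustration and are not SDK features.

```python
ALLOWED_ROLES = {"user", "assistant"}


def prepare_for_memory(conversation: list[dict], max_chars: int = 2000) -> list[dict]:
    """Keep only user/assistant turns with non-empty text, truncated to max_chars."""
    cleaned = []
    for msg in conversation:
        role = msg.get("role")
        content = msg.get("content")
        if role not in ALLOWED_ROLES or not isinstance(content, str):
            continue
        content = content.strip()
        if content:
            cleaned.append({"role": role, "content": content[:max_chars]})
    return cleaned


conversation = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "  Please send reports as CSV.  "},
    {"role": "assistant", "content": ""},
]
cleaned = prepare_for_memory(conversation)
print(cleaned)
```

The cleaned list keeps the OpenAI message format, so it can be passed as the messages argument in place of the raw conversation.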

Pattern 3: Per-User Personalization

Python
def personalize_response(user_id: str, base_response: str) -> str:
    """Adjust response based on user preferences"""
    result = memory.search(
        query="User preferences and requests",
        user_id=user_id,
        limit=3,
    )
    memories = result.get("results", [])

    if not memories:
        return base_response

    adjustment_prompt = "Please adjust the response considering the following user preferences:\n"
    for m in memories:
        adjustment_prompt += f"- {m['memory']}\n"

    # adjust_with_llm is an application-defined helper, not part of the SDK
    return adjust_with_llm(base_response, adjustment_prompt)
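
adjust_with_llm is not defined in this guide. As a rough sketch of what such a helper might look like, the version below takes the model call as a third parameter, chat, which is a departure from the two-argument call above made so the example can run offline. The ChatFn signature and the prompt wording are assumptions.

```python
from typing import Callable

# Hypothetical signature: takes OpenAI-format messages, returns the reply text.
ChatFn = Callable[[list[dict]], str]


def adjust_with_llm(base_response: str, adjustment_prompt: str, chat: ChatFn) -> str:
    """Ask the model to rewrite base_response; fall back to it on an empty reply."""
    messages = [
        {"role": "system", "content": adjustment_prompt},
        {"role": "user", "content": f"Rewrite this response accordingly:\n{base_response}"},
    ]
    adjusted = chat(messages).strip()
    return adjusted or base_response


def fake_chat(messages: list[dict]) -> str:
    # Stand-in for a real LLM call: upper-cases the last line of the user turn.
    return messages[1]["content"].splitlines()[-1].upper()


adjusted = adjust_with_llm("here is your report", "Prefer upper case.", fake_chat)
print(adjusted)
```

To keep the two-argument call shown in the pattern, the chat parameter could be bound once with functools.partial.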

Utility Functions

The SDK provides utility functions for memory configuration conversion.

Python
from agenticstar_platform.memory import (
    normalize_provider,
    parse_model_string,
    convert_llm_to_mem0,
    convert_embedder_to_mem0,
)

# Normalize provider name
provider = normalize_provider("Azure_OpenAI") # -> "azure_openai"

# Parse model string
provider, model = parse_model_string("azure_openai/gpt-4.1")
# -> ("azure_openai", "gpt-4.1")

# Convert the SDK's LLM and embedder configs to Mem0 format
# (llm_provider_config and embedder_config are existing config objects)
mem0_llm_config = convert_llm_to_mem0(llm_provider_config)
mem0_embedder_config = convert_embedder_to_mem0(embedder_config)

Next Steps

SDK API Reference — Memory Module

Complete specifications for SemanticMemoryClient

View guide

Security Guide

Content validation before saving to memory

View guide