Memory Guide

This guide explains how to implement long-term memory for agents using the semantic memory provided by the SDK's Memory module.

Overview

Semantic memory is a vector search-based memory system backed by Mem0 + Qdrant. It remembers user knowledge, preferences, and facts, and recalls relevant memories through similarity search.

Conversations with users
       |
       v
Semantic Memory (Mem0 + Qdrant)
  "User is proficient in Python"
  "Project X deadline is March"
  "Last time requested CSV output"
       |
       v
Injected into LLM context -> More appropriate responses

The SemanticMemoryClient methods add / search / get_all / delete / delete_all are all synchronous, so no await is needed. The only exception is cleanup(), which is asynchronous and must be awaited.
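
Because these calls are synchronous, invoking them directly inside an async request handler would block the event loop while the call runs. The following is a minimal sketch of one way to avoid that with the standard library's asyncio.to_thread (Python 3.9+). StubMemoryClient is a hypothetical stand-in for SemanticMemoryClient so the example runs without the SDK; its return value copies the {"results": [...]} shape used in this guide.

```python
import asyncio


class StubMemoryClient:
    """Hypothetical stand-in for SemanticMemoryClient (not the SDK class)."""

    def search(self, query: str, user_id: str, limit: int = 5) -> dict:
        # A real client would make a blocking network call here.
        return {"results": [{"memory": "User is proficient in Python"}][:limit]}


async def recall(memory: StubMemoryClient, user_id: str, query: str) -> list[str]:
    # Run the synchronous search in a worker thread so the event loop stays free.
    result = await asyncio.to_thread(
        memory.search, query=query, user_id=user_id, limit=5
    )
    return [r["memory"] for r in result.get("results", [])]


memories = asyncio.run(
    recall(StubMemoryClient(), "user-001", "What is this user's skill set?")
)
print(memories)
```

Whether the real client can safely be called from several threads at once is not stated in this guide, so treat concurrent use as an assumption to verify.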

Configuration

Configure the LLM, the embedder, and the vector store under the [memory] section of config.toml.

Toml (config.toml)
[memory.llm]
model = "azure_openai/gpt-4.1"
api_key = "${AZURE_OPENAI_API_KEY}"
base_url = "https://your-endpoint.openai.azure.com/"
api_version = "2024-12-01-preview"

[memory.embedder]
model = "azure_openai/text-embedding-3-large"
api_key = "${AZURE_OPENAI_API_KEY}"
base_url = "https://your-endpoint.openai.azure.com/"
api_version = "2024-12-01-preview"

[memory.vector_store]
provider = "qdrant"

[memory.vector_store.config]
url = "http://qdrant.internal:6333"
collection_name = "semantic_memory"

Basic Operations

Python
from agenticstar_platform.memory import SemanticMemoryClient, SemanticMemoryConfig

# Initialize
config = SemanticMemoryConfig.from_toml("config.toml")
memory = SemanticMemoryClient(config)

# Add memory (messages is a list of dicts in OpenAI format)
result = memory.add(
    messages=[
        {"role": "user", "content": "I'm proficient in Python and TypeScript"},
        {"role": "assistant", "content": "Understood. I'll record your skill set."},
    ],
    user_id="user-001",
    metadata={"source": "self_introduction"},
)

# Search memory (similarity-based)
result = memory.search(
    query="What is this user's skill set?",
    user_id="user-001",
    limit=5,
)
for r in result.get("results", []):
    print(f"{r['memory']}")

# Get all memories
result = memory.get_all(user_id="user-001")

# Delete a memory
result = memory.delete(memory_id="mem-xxx")

# Delete all memories for a user
result = memory.delete_all(user_id="user-001")

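# Async cleanup (call on shutdown). await is only valid inside an async function;
# from synchronous code, use asyncio.run(memory.cleanup()) instead.
await memory.cleanup()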
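
Since cleanup() is the one asynchronous call and should run on shutdown, it can help to tie it to a scope so it runs even when an error occurs. The following is a minimal sketch using contextlib.asynccontextmanager. StubMemoryClient and memory_session are hypothetical names introduced for illustration; the stub replaces SemanticMemoryClient so the example runs without the SDK.

```python
import asyncio
from contextlib import asynccontextmanager


class StubMemoryClient:
    """Hypothetical stand-in for SemanticMemoryClient (not the SDK class)."""

    def __init__(self) -> None:
        self.closed = False

    def add(self, messages: list[dict], user_id: str, metadata=None) -> dict:
        return {"results": []}  # synchronous, like the real add()

    async def cleanup(self) -> None:
        self.closed = True  # asynchronous, like the real cleanup()


@asynccontextmanager
async def memory_session(client):
    """Yield the client and always await cleanup() on exit."""
    try:
        yield client
    finally:
        await client.cleanup()


async def main() -> bool:
    client = StubMemoryClient()
    try:
        async with memory_session(client) as memory:
            memory.add(
                messages=[{"role": "user", "content": "I prefer CSV output"}],
                user_id="user-001",
            )
            raise RuntimeError("simulated failure during the session")
    except RuntimeError:
        pass  # the error propagated, but cleanup() already ran
    return client.closed


closed_after_error = asyncio.run(main())
print(closed_after_error)
```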

LLM Provider Selection

Specify the provider using the provider/model_name format in the model field of LLMProviderConfig.

Azure OpenAI

Toml
[memory.llm]
model = "azure_openai/gpt-4.1"
api_key = "${AZURE_OPENAI_API_KEY}"
base_url = "https://your-endpoint.openai.azure.com/"
api_version = "2024-12-01-preview"

Anthropic

Toml
[memory.llm]
model = "anthropic/claude-sonnet-4-6"
api_key = "${ANTHROPIC_API_KEY}"

AWS Bedrock

Toml
[memory.llm]
model = "aws_bedrock/anthropic.claude-sonnet-4-6-20250514-v1:0"
aws_region_name = "us-east-1"
aws_access_key_id = "${AWS_ACCESS_KEY_ID}"
aws_secret_access_key = "${AWS_SECRET_ACCESS_KEY}"
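
Each provider in the examples above uses a different set of keys, so a small check can catch an incomplete [memory.llm] section early. The sketch below reads the key sets directly off those three examples; they are an assumption, and the SDK's own validation rules may differ (for instance, other credential mechanisms may be accepted). REQUIRED_KEYS and missing_llm_keys are hypothetical names.

```python
# Key sets copied from the three examples in this guide (an assumption,
# not the SDK's documented validation rules).
REQUIRED_KEYS = {
    "azure_openai": {"api_key", "base_url", "api_version"},
    "anthropic": {"api_key"},
    "aws_bedrock": {"aws_region_name", "aws_access_key_id", "aws_secret_access_key"},
}


def missing_llm_keys(llm_section: dict) -> list[str]:
    """Return the keys the example for this provider uses but the section lacks."""
    # The provider is the part of the model string before the first "/".
    provider, _, _ = llm_section.get("model", "").partition("/")
    if provider not in REQUIRED_KEYS:
        raise ValueError(f"Unknown provider prefix: {provider!r}")
    return sorted(REQUIRED_KEYS[provider] - set(llm_section))


incomplete = {"model": "azure_openai/gpt-4.1", "api_key": "placeholder"}
print(missing_llm_keys(incomplete))
```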

Usage Patterns in Agents

Pattern 1: Context Enhancement

Retrieve memories relevant to the current query and inject them into the prompt before calling the LLM.

Python
def build_context(user_id: str, query: str) -> str:
    """Build context from memory"""
    result = memory.search(query=query, user_id=user_id, limit=5)

    context = "## About This User\n"
    for r in result.get("results", []):
        context += f"- {r['memory']}\n"

    return context

# Inject context when calling LLM
context = build_context("user-001", user_message)
messages = [
    {"role": "system", "content": f"Please refer to the following user information:\n{context}"},
    {"role": "user", "content": user_message},
]
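
Two details are worth handling in practice: a user with no stored memories, and a result set large enough to crowd the prompt. The following is a minimal sketch of both, operating on plain strings so it runs without the SDK. format_context, build_messages, and the max_chars budget are hypothetical additions, not SDK features.

```python
HEADER = "## About This User"


def format_context(memories: list[str], max_chars: int = 500) -> str:
    """Format memories as a bullet list, or return "" when there are none."""
    if not memories:
        return ""
    lines = [HEADER]
    used = len(HEADER) + 1  # +1 for the newline after each line
    for text in memories:
        line = f"- {text}"
        if used + len(line) + 1 > max_chars:
            break  # stop before the context exceeds the budget
        lines.append(line)
        used += len(line) + 1
    return "".join(f"{line}\n" for line in lines)


def build_messages(user_message: str, context: str) -> list[dict]:
    """Add the system message only when there is context to inject."""
    messages = []
    if context:
        messages.append({
            "role": "system",
            "content": f"Please refer to the following user information:\n{context}",
        })
    messages.append({"role": "user", "content": user_message})
    return messages


context = format_context(["User is proficient in Python"])
messages = build_messages("Suggest a CSV parser", context)
print(messages)
```

Skipping the system message when nothing was recalled avoids sending the model an empty "About This User" section.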

Pattern 2: Automatic Post-Conversation Memory

Save important information to memory after a conversation ends.

Python
def save_conversation_memory(user_id: str, conversation: list[dict]):
    """Save conversation to memory"""
    memory.add(
        messages=conversation,  # [{"role": "user", ...}, {"role": "assistant", ...}]
        user_id=user_id,
        metadata={"source": "conversation"},
    )
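
Not every message in a conversation belongs in long-term memory. The sketch below filters a conversation before it is passed to memory.add. The rules it applies (keep only user and assistant turns, drop empty content, skip conversations with no user turn) are illustrative assumptions rather than SDK requirements, and prepare_for_memory is a hypothetical helper.

```python
ALLOWED_ROLES = {"user", "assistant"}


def prepare_for_memory(conversation: list[dict], min_user_turns: int = 1) -> list[dict]:
    """Return a cleaned copy of the conversation, or [] if nothing is worth saving."""
    cleaned = [
        {"role": m["role"], "content": m["content"].strip()}
        for m in conversation
        if m.get("role") in ALLOWED_ROLES
        and isinstance(m.get("content"), str)
        and m["content"].strip()
    ]
    user_turns = sum(1 for m in cleaned if m["role"] == "user")
    return cleaned if user_turns >= min_user_turns else []


conversation = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "  Please export reports as CSV.  "},
    {"role": "assistant", "content": ""},
    {"role": "assistant", "content": "Noted, I will use CSV."},
]
to_save = prepare_for_memory(conversation)
print(to_save)
```

A caller would then invoke save_conversation_memory only when the returned list is non-empty.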

Pattern 3: Per-User Personalization

Python
def personalize_response(user_id: str, base_response: str) -> str:
    """Adjust response based on user preferences"""
    result = memory.search(
        query="User preferences and requests",
        user_id=user_id,
        limit=3,
    )
    memories = result.get("results", [])

    if not memories:
        return base_response

    adjustment_prompt = "Please adjust the response considering the following user preferences:\n"
    for m in memories:
        adjustment_prompt += f"- {m['memory']}\n"

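    # adjust_with_llm is not defined in this guide; supply your own helper
    # that sends the base response and the adjustment prompt to an LLM.
    return adjust_with_llm(base_response, adjustment_prompt)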
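
The adjust_with_llm helper is called in the pattern above but not defined in this guide. The following is one hypothetical shape for it, not the SDK's implementation. To keep the sketch runnable without any provider, the LLM call is passed in as a complete callable, an extra parameter the original call does not have, and fake_complete stands in for a real model.

```python
from typing import Callable


def adjust_with_llm(
    base_response: str,
    adjustment_prompt: str,
    complete: Callable[[list[dict]], str],
) -> str:
    """Ask an LLM to rewrite base_response according to adjustment_prompt."""
    messages = [
        {"role": "system", "content": adjustment_prompt},
        {
            "role": "user",
            "content": f"Rewrite the following response accordingly:\n{base_response}",
        },
    ]
    adjusted = complete(messages)
    # Fall back to the original response if the model returns nothing usable.
    return adjusted.strip() or base_response


def fake_complete(messages: list[dict]) -> str:
    """Stand-in for a real LLM call: upper-cases the last line of the user message."""
    return messages[1]["content"].splitlines()[-1].upper()


adjusted = adjust_with_llm(
    "here is your report",
    "Please adjust the response considering the following user preferences:\n- Prefers CSV\n",
    fake_complete,
)
print(adjusted)
```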

Utility Functions

The SDK provides utility functions for memory configuration conversion.

Python
from agenticstar_platform.memory import (
    normalize_provider,
    parse_model_string,
    convert_llm_to_mem0,
    convert_embedder_to_mem0,
)

# Normalize provider name
provider = normalize_provider("Azure_OpenAI")  # -> "azure_openai"

# Parse model string
provider, model = parse_model_string("azure_openai/gpt-4.1")
# -> ("azure_openai", "gpt-4.1")

# Convert SDK config objects to Mem0 format
# (llm_provider_config and embedder_config are assumed to be loaded already)
llm_config = convert_llm_to_mem0(llm_provider_config)
mem0_embedder_config = convert_embedder_to_mem0(embedder_config)
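
To make the provider/model_name convention concrete, here is a hypothetical re-implementation that reproduces the two results shown in the comments above. The sketch_ functions are illustrative only; the SDK's actual normalize_provider and parse_model_string may handle more cases or behave differently.

```python
def sketch_normalize_provider(name: str) -> str:
    """Lower-case and trim a provider name, e.g. "Azure_OpenAI" -> "azure_openai"."""
    return name.strip().lower()


def sketch_parse_model_string(model: str) -> tuple[str, str]:
    """Split "provider/model_name" at the first slash only."""
    provider, sep, model_name = model.partition("/")
    if not sep or not provider or not model_name:
        raise ValueError(f"Expected 'provider/model_name', got {model!r}")
    return sketch_normalize_provider(provider), model_name


parsed = sketch_parse_model_string("azure_openai/gpt-4.1")
print(parsed)
```

Splitting at the first slash only leaves the rest of the string untouched, so model identifiers that contain dots or colons, such as the Bedrock example in this guide, come through unchanged.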

Next Steps

SDK API Reference — Memory Module

Complete specifications for SemanticMemoryClient


Security Guide

Content validation before saving to memory

