RAG Integration
Retrieval-Augmented Generation (RAG) allows agents to remember past conversations or access external knowledge bases.
SimpleRAG (SQLite)
The SimpleRAG middleware provides a lightweight, zero-dependency (other than standard libraries) solution using SQLite's FTS5 (Full-Text Search).
Features
- Keyword Search: Fast retrieval of past messages matching user queries.
- Context Injection: Automatically injects relevant history into the system prompt.
- Session Management: Groups messages by session IDs and generates titles.
Configuration
from agent_sdk.middleware import SimpleRAG
# Initialize
rag = SimpleRAG(
db_path="my_memory.db",
title_summary=True, # Use LLM to generate titles for sessions
summary_model="gpt-4o" # Optional: Use a cheaper model for titles (must be the same model type from client)
)
runner.use(rag)
ChromaRAG (Vector Database)
For more advanced, semantic understanding (e.g., knowing that "canine" relates to "dog"), use ChromaRAG.
Prerequisites
pip install chromadb
Configuration
from agent_sdk.middleware import ChromaRAG
# Initialize
chroma_rag = ChromaRAG(
persist_dir="./chroma_data",
collection_name="agent_memory"
)
runner.use(chroma_rag)
How it Works
- Before Run: The middleware takes the user's latest message, generates an embedding, and queries the database.
- Context: The most relevant documents are inserted into the agent's memory with a "system" role.
- After Run: The new conversation turn (User + AI response) is saved to the database.