A comprehensive solution accelerator demonstrating how to build stateful AI agents using Databricks Lakebase (PostgreSQL) and LangGraph for persistent conversation memory and state management.
This accelerator showcases how to build conversational AI agents that maintain context across multiple interactions using Databricks Lakebase as a checkpoint store. Unlike stateless LLM calls, these agents preserve conversation history and can resume from any point in time using thread IDs.
- Persistent Memory: Conversation state stored in Databricks Lakebase (PostgreSQL)
- Thread-based Sessions: Each conversation tracked with unique thread IDs
- Resumable Conversations: Pick up where you left off in any conversation
- Unity Catalog Integration: Leverage UC functions as agent tools
- Production-ready Deployment: Complete MLflow model registration and serving
- Interactive Chat Interface: Streamlit-based web application
The solution uses:
- Lakebase: Managed PostgreSQL for durable agent state storage
- LangGraph: State graph framework with PostgreSQL checkpointer
- MLflow: Model tracking, registration, and deployment
- Unity Catalog: Function toolkit for agent tools
- Databricks Model Serving: Production deployment platform
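To make the checkpointing idea concrete, here is a minimal sketch of a checkpoint store keyed by thread ID. It uses Python's built-in sqlite3 purely as a local stand-in for Lakebase (PostgreSQL). The table name, schema, and helper functions (save_checkpoint, load_latest) are hypothetical; the accelerator itself relies on LangGraph's PostgreSQL checkpointer rather than hand-written SQL.

```python
import json
import sqlite3

# Stand-in for Lakebase: an in-memory SQLite database (assumption for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE demo_checkpoints (
        thread_id TEXT NOT NULL,
        step      INTEGER NOT NULL,
        state     TEXT NOT NULL,          -- JSON-encoded conversation state
        PRIMARY KEY (thread_id, step)
    )
    """
)


def save_checkpoint(thread_id: str, messages: list) -> int:
    """Append a new checkpoint for the thread and return its step number."""
    row = conn.execute(
        "SELECT COALESCE(MAX(step), 0) FROM demo_checkpoints WHERE thread_id = ?",
        (thread_id,),
    ).fetchone()
    step = row[0] + 1
    conn.execute(
        "INSERT INTO demo_checkpoints (thread_id, step, state) VALUES (?, ?, ?)",
        (thread_id, step, json.dumps(messages)),
    )
    conn.commit()
    return step


def load_latest(thread_id: str) -> list:
    """Return the most recently saved messages for the thread, or [] if none exist."""
    row = conn.execute(
        "SELECT state FROM demo_checkpoints WHERE thread_id = ? ORDER BY step DESC LIMIT 1",
        (thread_id,),
    ).fetchone()
    return json.loads(row[0]) if row else []


# First turn: start a conversation on a thread.
history = load_latest("conversation-123")
history.append({"role": "user", "content": "Who committed the latest malware threat?"})
save_checkpoint("conversation-123", history)

# Later turn: the same thread ID restores the earlier context.
history = load_latest("conversation-123")
history.append({"role": "user", "content": "What was their IP address?"})
save_checkpoint("conversation-123", history)
```

Because every read and write is scoped by thread_id, a different thread ID starts from an empty history while a reused one picks up the stored state.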
lakebase-memory-accelerator/
├── README.md  # This file
├── 00-data-uc-function-setup.ipynb  # Unity Catalog functions setup
├── 01-lakebase-instance-setup.ipynb  # Lakebase PostgreSQL instance creation
├── 02-lakebase-langgraph-checkpointer-agent.ipynb  # Main agent implementation
├── agent.py  # LangGraph agent class implementation
├── checkpoints-example-query.dbquery.ipynb  # Example checkpoint queries
├── data/  # Sample datasets
│   ├── cyber_threat_detection.snappy.parquet  # Cybersecurity threat data
│   └── user_info.snappy.parquet  # User information data
├── databricks_apps/  # Streamlit web application
│   ├── LICENSE
│   ├── NOTICE
│   ├── README.md
│   └── streamlit-chatbot-app/
│       ├── app.py  # Streamlit chat interface
│       ├── app.yaml  # App configuration
│       ├── model_serving_utils.py  # Model serving utilities
│       └── requirements.txt  # Python dependencies
└── resources/  # Databricks bundle configurations
    ├── lakebase_instance.yml  # Example DABs Lakebase instance config
    ├── short_term_memory_agent_job.yml  # Example DABs Job deployment config
    └── short_term_memory_app.yml  # Example DABs App deployment config
- Databricks Workspace with Unity Catalog enabled
- Lakebase Instance - Create via SQL Warehouses → Lakebase Postgres → Create database instance
- Model Serving Permissions for agent deployment
- Secret Scope for storing credentials (default: dbdemos)
Run 00-data-uc-function-setup.ipynb to:
- Create sample datasets in Unity Catalog
- Set up Unity Catalog functions as agent tools
- Configure cybersecurity threat detection functions
Run 01-lakebase-instance-setup.ipynb to:
- Create Lakebase PostgreSQL instance
- Configure database roles and permissions
- Set up database catalog integration
Run 02-lakebase-langgraph-checkpointer-agent.ipynb to:
- Build the stateful LangGraph agent
- Configure PostgreSQL checkpointer
- Test agent locally with conversation threads
- Register model to Unity Catalog
- Deploy to Databricks Model Serving
Deploy the Streamlit chat interface:
- Configure thread ID management in sidebar
- Connect to deployed agent endpoint
- Enable persistent conversation sessions
The main agent implementation (agent.py) features:
- PostgreSQL Connection Pool: Efficient database connection management
- OAuth Token Refresh: Automatic credential rotation for Lakebase
- Conversation Checkpointing: State persistence after each agent step
- Tool Integration: Unity Catalog functions and vector search tools
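The OAuth token refresh feature can be pictured as a small cache that reuses a credential until shortly before it expires. The sketch below is an illustration under stated assumptions: TokenCache, fetch_token, and the 60-second safety margin are hypothetical names and values, not the code in agent.py. The real token source would be whatever Databricks API the agent calls to obtain Lakebase credentials.

```python
import time
from typing import Callable, Optional, Tuple


class TokenCache:
    """Caches a short-lived token and refreshes it shortly before expiry."""

    def __init__(
        self,
        fetch_token: Callable[[], Tuple[str, float]],
        refresh_margin_s: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        # fetch_token is a hypothetical callable returning (token, lifetime_seconds).
        self._fetch_token = fetch_token
        self._refresh_margin_s = refresh_margin_s
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        """Return a valid token, fetching a new one if the cached one is near expiry."""
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._refresh_margin_s:
            token, lifetime_s = self._fetch_token()
            self._token = token
            self._expires_at = now + lifetime_s
        return self._token


# Example with a fake token source and a controllable clock.
calls = []
fake_now = [0.0]


def fake_fetch() -> Tuple[str, float]:
    calls.append(1)
    return f"token-{len(calls)}", 3600.0  # pretend tokens live for one hour


cache = TokenCache(fake_fetch, clock=lambda: fake_now[0])
first = cache.get()    # no token cached yet, so this fetches one
second = cache.get()   # served from the cache
fake_now[0] = 3590.0   # move inside the 60-second refresh margin
third = cache.get()    # cached token is near expiry, so this fetches again
```

A connection pool could call cache.get() each time it opens a new connection, so a long-running process does not present an expired credential.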
config = {
    "llm_model_serving_endpoint_name": "databricks-claude-3-7-sonnet",
    "llm_prompt_template": "Cybersecurity assistant prompt...",
    "conn_db_name": "databricks_postgres",
    "conn_ssl_mode": "require",
    "conn_host": "your-lakebase-instance.database.cloud.databricks.com",
    "instance_name": "your-lakebase-instance-name"
}
- get_cyber_threat_info: Retrieve cybersecurity threat information
- get_user_info: Get user details from threat source IPs
- Optional: Vector Search retrieval tools
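The connection-related keys in the configuration (conn_host, conn_db_name, conn_ssl_mode) map naturally onto a PostgreSQL keyword/value connection string. The following is a rough sketch of that mapping, not the logic in agent.py: build_conninfo and quote_conninfo_value are hypothetical helpers, the port default of 5432 is an assumption, and the user name and password are placeholders that would come from the OAuth credential flow at runtime.

```python
def quote_conninfo_value(value: str) -> str:
    """Quote a value for a libpq keyword/value connection string."""
    # Plain values need no quoting; empty values or values containing
    # spaces, single quotes, or backslashes must be single-quoted.
    if value and not any(ch in value for ch in " '\\"):
        return value
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def build_conninfo(config: dict, user: str, password: str, port: int = 5432) -> str:
    """Assemble a connection string from the config keys plus runtime credentials."""
    parts = {
        "host": config["conn_host"],
        "port": str(port),
        "dbname": config["conn_db_name"],
        "sslmode": config["conn_ssl_mode"],
        "user": user,
        "password": password,
    }
    return " ".join(f"{key}={quote_conninfo_value(val)}" for key, val in parts.items())


config = {
    "conn_db_name": "databricks_postgres",
    "conn_ssl_mode": "require",
    "conn_host": "your-lakebase-instance.database.cloud.databricks.com",
}

# The user name and token below are placeholders, not real credentials.
conninfo = build_conninfo(config, user="someone@example.com", password="short lived token")
print(conninfo)
```

Keeping the password out of the static configuration and passing it in at call time fits the short-lived credentials that Lakebase uses.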
from agent import AGENT

response = AGENT.predict({
    "messages": [{"role": "user", "content": "Who committed the latest malware threat?"}],
    "custom_inputs": {"thread_id": "conversation-123"}
})

# Continue previous conversation using same thread_id
response = AGENT.predict({
    "messages": [{"role": "user", "content": "What was their IP address?"}],
    "custom_inputs": {"thread_id": "conversation-123"}  # Same thread ID
})
- Open the deployed Databricks App
- Configure thread ID in sidebar (auto-generated or custom)
- Start conversation with cybersecurity queries
- Agent maintains context across multiple messages
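Whether the thread ID is auto-generated or custom, the client has to send the same value on every turn of a conversation. One way to handle that is sketched below. build_request is a hypothetical helper, not part of the accelerator; the payload shape simply mirrors the AGENT.predict calls shown earlier, and using a UUID for auto-generated IDs is an assumption.

```python
import uuid
from typing import Optional


def build_request(user_message: str, thread_id: Optional[str] = None) -> dict:
    """Build a predict() payload, generating a thread ID when none is supplied."""
    if thread_id is None:
        thread_id = str(uuid.uuid4())
    return {
        "messages": [{"role": "user", "content": user_message}],
        "custom_inputs": {"thread_id": thread_id},
    }


# First message: no thread ID supplied, so one is generated.
first = build_request("Who committed the latest malware threat?")
thread_id = first["custom_inputs"]["thread_id"]

# Follow-up: reuse the generated ID so the agent can restore context.
follow_up = build_request("What was their IP address?", thread_id=thread_id)
```

In a Streamlit app, the generated ID would typically be kept in session state so it survives reruns of the script.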
- Automatic scaling and high availability
- Built-in authentication and authorization
- Integrated monitoring and logging
- Interactive web interface
- Custom thread ID management
- Real-time conversation experience
- Automated agent training/updates
- Batch processing capabilities
- Resource optimization
Use checkpoints-example-query.dbquery.ipynb to:
- Analyze conversation patterns
- Debug agent behavior
- Monitor checkpoint storage
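As an illustration of the kind of query that helps monitor checkpoint storage, the sketch below counts checkpoints per thread. It runs against an in-memory SQLite table so that it is self-contained. The two-column checkpoints table is a simplified assumption, not the full schema that the LangGraph PostgreSQL checkpointer creates, and the actual queries live in checkpoints-example-query.dbquery.ipynb.

```python
import sqlite3

# SQLite stands in for Lakebase here; the two-column schema is a simplified assumption.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE checkpoints (thread_id TEXT, checkpoint_id TEXT)")
conn.executemany(
    "INSERT INTO checkpoints VALUES (?, ?)",
    [
        ("conversation-123", "c1"),
        ("conversation-123", "c2"),
        ("conversation-123", "c3"),
        ("conversation-456", "c1"),
    ],
)

# Count checkpoints per thread to spot unusually long or fast-growing conversations.
rows = conn.execute(
    """
    SELECT thread_id, COUNT(*) AS checkpoint_count
    FROM checkpoints
    GROUP BY thread_id
    ORDER BY checkpoint_count DESC, thread_id
    """
).fetchall()

for thread_id, checkpoint_count in rows:
    print(f"{thread_id}: {checkpoint_count} checkpoints")
```

The same GROUP BY pattern should carry over to PostgreSQL, provided the real table exposes a thread_id column.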
- Model versioning and lineage
- Performance metrics
- Experiment comparison
- Unity Catalog Integration: Data governance and permissions
- OAuth Authentication: Secure Lakebase connections
- Secret Management: Encrypted credential storage
- Audit Logging: Complete conversation tracking
- Create Unity Catalog functions
- Add to uc_tool_names list in agent.py
- Update system prompt to include tool usage
- Update llm_prompt_template in configuration
- Adjust tool selection logic
- Customize conversation flow in LangGraph
- Adjust connection pool sizes (DB_POOL_MIN_SIZE, DB_POOL_MAX_SIZE)
- Configure model serving autoscaling
- Optimize checkpoint storage patterns
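When tuning the connection pool, it helps to validate the bounds before they reach the pool constructor. The sketch below assumes DB_POOL_MIN_SIZE and DB_POOL_MAX_SIZE are supplied as environment variables. In agent.py they may instead be module-level constants, and read_pool_sizes and its defaults of 1 and 10 are hypothetical.

```python
import os
from typing import Mapping, Tuple


def read_pool_sizes(
    env: Mapping[str, str] = os.environ,
    default_min: int = 1,
    default_max: int = 10,
) -> Tuple[int, int]:
    """Read and validate pool bounds; the defaults are illustrative, not agent.py's."""
    min_size = int(env.get("DB_POOL_MIN_SIZE", default_min))
    max_size = int(env.get("DB_POOL_MAX_SIZE", default_max))
    if min_size < 0:
        raise ValueError("DB_POOL_MIN_SIZE must not be negative")
    if max_size < 1 or max_size < min_size:
        raise ValueError("DB_POOL_MAX_SIZE must be at least 1 and >= DB_POOL_MIN_SIZE")
    return min_size, max_size


# Passing a plain dict instead of os.environ keeps the example deterministic.
sizes = read_pool_sizes({"DB_POOL_MIN_SIZE": "2", "DB_POOL_MAX_SIZE": "8"})
print(sizes)
```

Failing fast on an inverted or negative range surfaces a misconfiguration at startup rather than as a connection error under load.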
- Lakebase Connection: Ensure proper OAuth credentials and instance status
- Thread Management: Verify thread_id persistence in application state
- Tool Permissions: Check Unity Catalog function access rights
- Model Serving: Validate endpoint deployment and health
- MLflow experiment tracking for model behavior
- Lakebase query logs for connection issues
- Databricks job logs for deployment problems
- Production Hardening: Implement monitoring, alerting, and backup strategies
- Advanced Tools: Add vector search, external APIs, or custom functions
- Multi-tenant Support: Implement user-specific thread isolation
- Performance Optimization: Fine-tune connection pooling and caching
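For the multi-tenant item, one possible approach to user-specific thread isolation is to prefix each thread ID with the user it belongs to and check that prefix before resuming a conversation. This is only a sketch of the idea: the "user:conversation" format and both helper functions are hypothetical and are not part of the accelerator.

```python
def scoped_thread_id(user_id: str, conversation_id: str) -> str:
    """Combine user and conversation into one thread ID ('user:conversation' scheme)."""
    if not user_id or ":" in user_id:
        raise ValueError("user_id must be non-empty and must not contain ':'")
    if not conversation_id:
        raise ValueError("conversation_id must be non-empty")
    return f"{user_id}:{conversation_id}"


def owns_thread(user_id: str, thread_id: str) -> bool:
    """Return True only if the thread ID carries this user's prefix."""
    owner, sep, _ = thread_id.partition(":")
    return bool(sep) and owner == user_id


thread_id = scoped_thread_id("alice", "conversation-123")
print(thread_id, owns_thread("alice", thread_id), owns_thread("bob", thread_id))
```

For this to provide real isolation, user_id would need to come from the authenticated identity on the server side rather than from client-supplied input.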