Lakebase Memory Accelerator

A comprehensive solution accelerator demonstrating how to build stateful AI agents using Databricks Lakebase (PostgreSQL) and LangGraph for persistent conversation memory and state management.

Overview

This accelerator shows how to build conversational AI agents that maintain context across multiple interactions, using Databricks Lakebase as a checkpoint store. Unlike stateless LLM calls, these agents preserve conversation history, and any conversation can be resumed later from its saved checkpoints by supplying its thread ID.
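
To make the idea concrete, the sketch below models thread-based checkpointing with a plain in-memory dictionary. It is a conceptual illustration only: the class `InMemoryCheckpointStore` and the function `chat_turn` are hypothetical names, and in the accelerator this role is played by LangGraph's PostgreSQL checkpointer backed by Lakebase.

```python
from collections import defaultdict


class InMemoryCheckpointStore:
    """Hypothetical stand-in for a durable checkpoint store keyed by thread ID."""

    def __init__(self):
        # thread_id -> list of message snapshots, oldest first
        self._checkpoints = defaultdict(list)

    def save(self, thread_id, messages):
        # Store a copy so later mutation by the caller does not alter history.
        self._checkpoints[thread_id].append(list(messages))

    def latest(self, thread_id):
        # Return the most recent snapshot, or an empty history for a new thread.
        history = self._checkpoints.get(thread_id)
        return list(history[-1]) if history else []


def chat_turn(store, thread_id, user_message):
    """Load prior state for the thread, append the new message, checkpoint it."""
    messages = store.latest(thread_id)
    messages.append({"role": "user", "content": user_message})
    store.save(thread_id, messages)
    return messages


store = InMemoryCheckpointStore()
chat_turn(store, "conversation-123", "Who committed the latest malware threat?")
resumed = chat_turn(store, "conversation-123", "What was their IP address?")
other = chat_turn(store, "conversation-456", "Hello")
print(len(resumed), len(other))
```

The second call with `conversation-123` sees the first message because both share a thread ID, while `conversation-456` starts from an empty history.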

Key Features

Persistent Memory: Conversation state stored in Databricks Lakebase (PostgreSQL)
Thread-based Sessions: Each conversation tracked with unique thread IDs
Resumable Conversations: Pick up where you left off in any conversation
Unity Catalog Integration: Leverage UC functions as agent tools
Production-ready Deployment: Complete MLflow model registration and serving
Interactive Chat Interface: Streamlit-based web application

Architecture

The solution uses:

Lakebase: Managed PostgreSQL for durable agent state storage
LangGraph: State graph framework with PostgreSQL checkpointer
MLflow: Model tracking, registration, and deployment
Unity Catalog: Function toolkit for agent tools
Databricks Model Serving: Production deployment platform

Directory Structure

lakebase-memory-accelerator/
├── README.md                                    # This file
├── 00-data-uc-function-setup.ipynb            # Unity Catalog functions setup
├── 01-lakebase-instance-setup.ipynb           # Lakebase PostgreSQL instance creation
├── 02-lakebase-langgraph-checkpointer-agent.ipynb  # Main agent implementation
├── agent.py                                   # LangGraph agent class implementation
├── checkpoints-example-query.dbquery.ipynb   # Example checkpoint queries
├── data/                                      # Sample datasets
│   ├── cyber_threat_detection.snappy.parquet # Cybersecurity threat data
│   └── user_info.snappy.parquet              # User information data
├── databricks_apps/                          # Streamlit web application
│   ├── LICENSE
│   ├── NOTICE  
│   ├── README.md
│   └── streamlit-chatbot-app/
│       ├── app.py                            # Streamlit chat interface
│       ├── app.yaml                          # App configuration
│       ├── model_serving_utils.py            # Model serving utilities
│       └── requirements.txt                  # Python dependencies
└── resources/                                # Databricks bundle configurations
    ├── lakebase_instance.yml                 # Example DABs Lakebase instance config
    ├── short_term_memory_agent_job.yml       # Example DABs Job deployment config
    └── short_term_memory_app.yml             # Example DABs App deployment config

Getting Started

Prerequisites

Databricks Workspace with Unity Catalog enabled
Lakebase Instance - Create via SQL Warehouses → Lakebase Postgres → Create database instance
Model Serving Permissions for agent deployment
Secret Scope for storing credentials (default: dbdemos)

Setup Instructions

Step 1: Data and Functions Setup

Run 00-data-uc-function-setup.ipynb to:

Create sample datasets in Unity Catalog
Set up Unity Catalog functions as agent tools
Configure cybersecurity threat detection functions

Step 2: Lakebase Instance Setup

Run 01-lakebase-instance-setup.ipynb to:

Create Lakebase PostgreSQL instance
Configure database roles and permissions
Set up database catalog integration

Step 3: Agent Development and Deployment

Run 02-lakebase-langgraph-checkpointer-agent.ipynb to:

Build the stateful LangGraph agent
Configure PostgreSQL checkpointer
Test agent locally with conversation threads
Register model to Unity Catalog
Deploy to Databricks Model Serving

Step 4: Web Application Deployment

Deploy the Streamlit chat interface:

Configure thread ID management in sidebar
Connect to deployed agent endpoint
Enable persistent conversation sessions

Core Components

LangGraphChatAgent Class

The main agent implementation (agent.py) features:

PostgreSQL Connection Pool: Efficient database connection management
OAuth Token Refresh: Automatic credential rotation for Lakebase
Conversation Checkpointing: State persistence after each agent step
Tool Integration: Unity Catalog functions and vector search tools

Key Configuration

config = {
    "llm_model_serving_endpoint_name": "databricks-claude-3-7-sonnet",
    "llm_prompt_template": "Cybersecurity assistant prompt...",
    "conn_db_name": "databricks_postgres",
    "conn_ssl_mode": "require",
    "conn_host": "your-lakebase-instance.database.cloud.databricks.com",
    "instance_name": "your-lakebase-instance-name"
}

Available Tools

get_cyber_threat_info: Retrieve cybersecurity threat information
get_user_info: Look up user details for a threat's source IP address
Optional: Vector Search retrieval tools

Usage Examples

Basic Agent Interaction

from agent import AGENT

response = AGENT.predict({
    "messages": [{"role": "user", "content": "Who committed the latest malware threat?"}],
    "custom_inputs": {"thread_id": "conversation-123"}
})

Resuming Conversations

# Continue the previous conversation by reusing its thread_id
from agent import AGENT

response = AGENT.predict({
    "messages": [{"role": "user", "content": "What was their IP address?"}],
    "custom_inputs": {"thread_id": "conversation-123"}  # Same thread ID as before
})
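
Both calls above share the same request shape, so a small helper can build it and generate a thread ID when none is given. `build_request` is a hypothetical name, and using a random UUID for new threads is an assumption; any unique string works as a thread ID.

```python
import uuid


def build_request(user_message, thread_id=None):
    """Build a predict() payload, creating a new thread ID if none is supplied."""
    if thread_id is None:
        thread_id = str(uuid.uuid4())  # assumed ID scheme for new conversations
    return {
        "messages": [{"role": "user", "content": user_message}],
        "custom_inputs": {"thread_id": thread_id},
    }


# Start a new conversation, then reuse its thread ID for the follow-up.
first = build_request("Who committed the latest malware threat?")
thread_id = first["custom_inputs"]["thread_id"]
follow_up = build_request("What was their IP address?", thread_id=thread_id)
```

Each payload can be passed directly to `AGENT.predict(...)`.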

Streamlit App Usage

Open the deployed Databricks App
Configure thread ID in sidebar (auto-generated or custom)
Start conversation with cybersecurity queries
Agent maintains context across multiple messages

Deployment Options

1. Databricks Model Serving

Automatic scaling and high availability
Built-in authentication and authorization
Integrated monitoring and logging

2. Databricks Apps

Interactive web interface
Custom thread ID management
Real-time conversation experience

3. Job Scheduling

Automated agent training/updates
Batch processing capabilities
Resource optimization

Monitoring and Observability

Conversation Queries

Use checkpoints-example-query.dbquery.ipynb to:

Analyze conversation patterns
Debug agent behavior
Monitor checkpoint storage

MLflow Tracking

Model versioning and lineage
Performance metrics
Experiment comparison

Security and Governance

Unity Catalog Integration: Data governance and permissions
OAuth Authentication: Secure Lakebase connections
Secret Management: Encrypted credential storage
Audit Logging: Complete conversation tracking

Customization

Adding New Tools

Create Unity Catalog functions
Add to uc_tool_names list in agent.py
Update system prompt to include tool usage

Modifying Agent Behavior

Update llm_prompt_template in configuration
Adjust tool selection logic
Customize conversation flow in LangGraph

Scaling Configuration

Adjust connection pool sizes (DB_POOL_MIN_SIZE, DB_POOL_MAX_SIZE)
Configure model serving autoscaling
Optimize checkpoint storage patterns
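
One way to make the pool sizes adjustable without code changes is to read them from the environment and validate them. This sketch assumes `DB_POOL_MIN_SIZE` and `DB_POOL_MAX_SIZE` are exposed as environment variables; the function `read_pool_sizes` and the defaults of 1 and 10 are illustrative, not the accelerator's values.

```python
import os


def read_pool_sizes(env=None, default_min=1, default_max=10):
    """Read and validate connection pool bounds from a mapping such as os.environ."""
    if env is None:
        env = os.environ
    min_size = int(env.get("DB_POOL_MIN_SIZE", default_min))
    max_size = int(env.get("DB_POOL_MAX_SIZE", default_max))
    if min_size < 0 or max_size < 1 or min_size > max_size:
        raise ValueError(
            f"invalid pool sizes: min={min_size}, max={max_size}"
        )
    return min_size, max_size


defaults = read_pool_sizes({})
custom = read_pool_sizes({"DB_POOL_MIN_SIZE": "2", "DB_POOL_MAX_SIZE": "20"})
print(defaults, custom)
```

Failing fast on an inverted range is easier to diagnose than a pool that misbehaves at startup.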

Troubleshooting

Common Issues

Lakebase Connection: Ensure proper OAuth credentials and instance status
Thread Management: Verify thread_id persistence in application state
Tool Permissions: Check Unity Catalog function access rights
Model Serving: Validate endpoint deployment and health

Debug Resources

MLflow experiment tracking for model behavior
Lakebase query logs for connection issues
Databricks job logs for deployment problems

Next Steps

Production Hardening: Implement monitoring, alerting, and backup strategies
Advanced Tools: Add vector search, external APIs, or custom functions
Multi-tenant Support: Implement user-specific thread isolation
Performance Optimization: Fine-tune connection pooling and caching
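
For the multi-tenant item, one simple approach is to prefix each thread ID with the owning user's ID so that threads from different users can never collide. This is a sketch of one possible scheme, not something the accelerator implements; `scoped_thread_id` and `owns_thread` are hypothetical helpers.

```python
def scoped_thread_id(user_id, thread_id):
    """Combine a user ID and a thread ID into a single namespaced thread ID."""
    if not user_id or not thread_id:
        raise ValueError("user_id and thread_id must be non-empty")
    if ":" in user_id:
        # The separator must not appear in the user part, or ownership is ambiguous.
        raise ValueError("user_id must not contain ':'")
    return f"{user_id}:{thread_id}"


def owns_thread(user_id, scoped_id):
    """Return True when the namespaced thread ID belongs to the given user."""
    return scoped_id.startswith(f"{user_id}:")


alice_thread = scoped_thread_id("alice", "conversation-123")
bob_thread = scoped_thread_id("bob", "conversation-123")
print(alice_thread, bob_thread)
```

The namespaced value would be sent as `thread_id` in `custom_inputs`, and the application would check ownership before resuming a thread.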
