nprasann/state-policy-rag-starter

state-policy-rag-starter

state-policy-rag-starter is a starter repository for a retrieval-augmented generation workflow that helps state agencies answer policy questions using approved policy text and tightly scoped case data access.

What It Does

  • Ingests policy PDFs into a Qdrant vector store.
  • Exposes an MCP service for policy search and a strict SQL stored procedure allowlist.
  • Runs a RAG service that only answers from approved context and requires citations.
  • Uses Ollama for in-state model serving so policy and case data do not leave state-controlled infrastructure.
  • Provides a starter governance, deployment, and security package for State IT, Legal, and Procurement teams.
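The stored procedure allowlist described above can be pictured with a minimal sketch. The procedure names, the `authorize_procedure` helper, and the logger name are hypothetical illustrations, not the repository's actual implementation; the idea is only that a request is checked by exact name against a fixed set and every attempt is logged.

```python
import logging

logger = logging.getLogger("mcp_audit")  # hypothetical logger name

# Hypothetical procedure names; a real deployment would define its own list.
ALLOWED_PROCEDURES = frozenset({"usp_get_case_status", "usp_get_case_summary"})


def authorize_procedure(user: str, procedure: str) -> bool:
    """Return True only for an exact allowlist match, and audit every attempt."""
    allowed = procedure in ALLOWED_PROCEDURES
    logger.info("user=%s procedure=%s allowed=%s", user, procedure, allowed)
    return allowed
```

Matching on the exact procedure name, rather than inspecting SQL text, means anything not explicitly listed is denied by default.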

Why This Starter

  • Cost target: less than $15K for a starter deployment on a single state-managed VM plus implementation time.
  • Data stays in-state: documents, vectors, prompts, and generated answers stay on infrastructure operated by or for the agency.
  • Procurement-ready framing: see Security, Deployment, Architecture, and Hardware Setup.

Implemented Features

  • Core project scaffolding and Docker-based service configuration for local and pilot deployments
  • RAG service implementation with strict temperature control and citation enforcement
  • MCP server integration supporting policy search, SQL stored procedure allowlisting, and audit logging
  • Dedicated ingestion pipeline for extracting, chunking, embedding, and indexing policy documents
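As an illustration of the chunking step in the ingestion pipeline, here is a minimal sketch of overlapping word-based chunking. The `chunk_words` helper and its default sizes are assumptions made for this example; the repository's ingest script may split text differently.

```python
def chunk_words(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into chunks of up to chunk_size words that share `overlap` words."""
    if chunk_size <= 0 or not (0 <= overlap < chunk_size):
        raise ValueError("chunk_size must be positive and overlap smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current chunk already reaches the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from at least one chunk, at the cost of storing some text twice.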

Model Usage

This project uses two different model roles:

  • Embedding model: used for ingestion and MCP semantic search
  • Generation model: used for final answer generation in the RAG service

Current local development defaults:

  • Hugging Face embedding model: sentence-transformers/all-MiniLM-L6-v2
  • Ollama generation model: llama3:8b-instruct-q4_K_M

Why this separation matters:

  • the embedding model converts policy text and user queries into vectors for semantic retrieval
  • the Ollama model generates the final answer from the retrieved policy context
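  • the ingest embedding model and MCP search embedding model must match: models with different output sizes (384 dimensions for all-MiniLM-L6-v2, 1024 for BAAI/bge-m3) cause the vector collection to reject the embeddings with a dimension mismatch, and vectors from different models are not comparable even when their sizes happen to agree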
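A pre-flight check along these lines can catch a mismatch before any vectors are written. This is a sketch under stated assumptions: `find_mismatches` and `KNOWN_DIMENSIONS` are hypothetical names, and the table only covers the two models this README mentions.

```python
# Output sizes of the two embedding models named in this README.
KNOWN_DIMENSIONS = {
    "sentence-transformers/all-MiniLM-L6-v2": 384,
    "BAAI/bge-m3": 1024,
}


def find_mismatches(ingest_model: str, search_model: str, collection_size: int) -> list[str]:
    """Return human-readable problems; an empty list means the setup is consistent."""
    problems = []
    if ingest_model != search_model:
        problems.append(f"ingest uses {ingest_model!r} but search uses {search_model!r}")
    # dict.fromkeys removes duplicates while keeping order.
    for model in dict.fromkeys([ingest_model, search_model]):
        size = KNOWN_DIMENSIONS.get(model)
        if size is not None and size != collection_size:
            problems.append(
                f"{model!r} produces {size}-dimensional vectors, "
                f"but the collection expects {collection_size}"
            )
    return problems
```

Models missing from the table are skipped rather than guessed, so extend `KNOWN_DIMENSIONS` if you adopt another embedding model.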

For higher-quality production retrieval, the repository also supports larger embedding models such as BAAI/bge-m3, but the smaller MiniLM model provides a faster and more practical local developer experience.

Container Runtime Support

This starter supports:

  • Docker as the primary validated local path
  • Podman as an alternative local runtime path

Runtime switching note:

  • Do not run Docker and Podman copies of this stack at the same time on one machine because they compete for the same ports
  • If you switch runtimes, bring the current stack down first, then start the other one
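Before starting the other runtime, a quick check that the stack's ports are free can confirm the previous stack is really down. The sketch below assumes the ports used elsewhere in this README (6333 for Qdrant, 8080 for the MCP service, 8081 for the RAG service); the helper names are hypothetical, and other ports your compose file publishes would need to be added.

```python
import socket

# Ports taken from the quickstart commands in this README.
STACK_PORTS = {"qdrant": 6333, "mcp_server": 8080, "rag_service": 8081}


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Try to bind the port; a failed bind means something else already holds it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def busy_ports(ports: dict[str, int] = STACK_PORTS) -> dict[str, int]:
    """Return the subset of named ports that are currently in use."""
    return {name: port for name, port in ports.items() if not port_is_free(port)}
```

An empty result from `busy_ports()` suggests it is safe to bring up the other runtime; a non-empty one names the services that are still holding their ports.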

5-Minute Quickstart

macOS is the directly tested local path for this repository. Windows and Linux quickstart instructions below are AI-assisted guidance and should be validated in your environment before production use.

Mac

  1. Clone the repository and enter it.
git clone <your-fork-or-repo-url>
cd state-policy-rag-starter
  2. Create a local environment file.
cp .env.example .env
  3. For local development, set a Hugging Face read token in .env if model downloads are needed.
echo 'HF_TOKEN=your_huggingface_read_token' >> .env
  4. Start the stack.
docker-compose up --build

Optional faster bootstrap:

bash scripts/bootstrap_local.sh

Optional Podman bootstrap:

bash scripts/bootstrap_local_podman.sh

If you switch back to Docker after using Podman:

podman-compose down
docker-compose up --build
  5. In a second shell, install ingest dependencies if needed and ingest a first policy PDF.
python3 -m pip install -r ingest/requirements.txt
QDRANT_PORT=6333 EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2 \
python3 ingest/ingest.py \
  --file examples/sample_policy.pdf \
  --source_name "Sample Policy" \
  --section "General"
  6. Test semantic search.
curl -X POST http://localhost:8080/search_policies \
  -H "Content-Type: application/json" \
  -H "user: test.user@state.gov" \
  -d '{"query":"What does the policy require the State Agency to display on the website home page?"}'
  7. Test the RAG endpoint.
curl -X POST http://localhost:8081/ask \
  -H "Content-Type: application/json" \
  -H "user: test.user@state.gov" \
  -d '{"query":"What does policy say about termination of rights?"}'

If you hit setup or runtime issues during quickstart, see the Beginner Setup Guide, especially the troubleshooting section with copy-paste recovery commands.
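For scripted checks, the two curl calls above can be reproduced with the Python standard library. This is a minimal sketch: `build_query_request` and `post_query` are hypothetical helper names, and it assumes both endpoints return JSON, which this README does not state explicitly.

```python
import json
import urllib.request


def build_query_request(url: str, query: str, user: str) -> urllib.request.Request:
    """Build the same POST the curl examples send: a JSON body plus a `user` header."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", "user": user},
    )


def post_query(url: str, query: str, user: str, timeout: float = 60.0):
    """Send the request and decode the response, assuming the service replies with JSON."""
    request = build_query_request(url, query, user)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the stack running, `post_query("http://127.0.0.1:8081/ask", "What does policy say about termination of rights?", "test.user@state.gov")` mirrors the RAG check, and swapping in port 8080 with `/search_policies` mirrors the search check.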

Local networking note:

  • If localhost behaves inconsistently on macOS, use 127.0.0.1 for local health, search, and RAG checks instead.

Windows

These steps are AI-assisted guidance. Validate locally before wider rollout.

  1. Clone the repository and enter it in PowerShell.
git clone <your-fork-or-repo-url>
cd state-policy-rag-starter
  2. Create a local environment file.
Copy-Item .env.example .env
  3. Add a Hugging Face read token to .env if model downloads are needed.
Add-Content .env 'HF_TOKEN=your_huggingface_read_token'
  4. Start the stack.
docker-compose up --build
  5. In a second PowerShell window, create a virtual environment, install ingest dependencies, and ingest a policy PDF.
py -3 -m venv .venv
.venv\Scripts\Activate.ps1
python -m pip install -r ingest/requirements.txt
$env:QDRANT_PORT='6333'
$env:EMBEDDING_MODEL='sentence-transformers/all-MiniLM-L6-v2'
python ingest/ingest.py --file examples/sample_policy.pdf --source_name "Sample Policy" --section "General"
  6. Test semantic search.
Invoke-RestMethod -Method Post -Uri http://localhost:8080/search_policies -ContentType 'application/json' -Headers @{ user = 'test.user@state.gov' } -Body '{"query":"What does the policy require the State Agency to display on the website home page?"}'
  7. Test the RAG endpoint.
Invoke-RestMethod -Method Post -Uri http://localhost:8081/ask -ContentType 'application/json' -Headers @{ user = 'test.user@state.gov' } -Body '{"query":"What does policy say about termination of rights?"}'

Linux

These steps are AI-assisted guidance. Validate locally before wider rollout.

  1. Clone the repository and enter it.
git clone <your-fork-or-repo-url>
cd state-policy-rag-starter
  2. Create a local environment file.
cp .env.example .env
  3. Add a Hugging Face read token to .env if model downloads are needed.
echo 'HF_TOKEN=your_huggingface_read_token' >> .env
  4. Start the stack.
docker-compose up --build
  5. In a second shell, create a virtual environment, install ingest dependencies, and ingest a policy PDF.
python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install -r ingest/requirements.txt
QDRANT_PORT=6333 EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2 \
python3 ingest/ingest.py \
  --file examples/sample_policy.pdf \
  --source_name "Sample Policy" \
  --section "General"
  6. Test semantic search.
curl -X POST http://localhost:8080/search_policies \
  -H "Content-Type: application/json" \
  -H "user: test.user@state.gov" \
  -d '{"query":"What does the policy require the State Agency to display on the website home page?"}'
  7. Test the RAG endpoint.
curl -X POST http://localhost:8081/ask \
  -H "Content-Type: application/json" \
  -H "user: test.user@state.gov" \
  -d '{"query":"What does policy say about termination of rights?"}'

Architecture

flowchart LR
    A["Teams or Web Client"] --> B["rag_service<br/>FastAPI"]
    B --> C["mcp_server<br/>FastAPI"]
    C --> D["Qdrant<br/>policies collection"]
    C --> E["CaseDB<br/>allowed procedures only"]
    B --> F["Ollama<br/>local model serving"]
    G["Policy PDF Ingest"] --> H["sentence-transformers<br/>BAAI/bge-m3"]
    H --> D
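
The request path in the diagram (client to rag_service, rag_service to mcp_server for retrieval, then to Ollama for generation) can be sketched as plain Python. Everything named here is a hypothetical illustration: `answer`, `build_prompt`, the passage shape with `source` and `text` keys, and the refusal message are assumptions, and the real services communicate over HTTP rather than through injected callables.

```python
from typing import Callable

REFUSAL = "No approved policy text was found for this question."  # hypothetical wording


def build_prompt(question: str, passages: list[dict]) -> str:
    """Number each retrieved passage so the model can cite it as [n]."""
    numbered = "\n".join(
        f"[{i}] ({p['source']}) {p['text']}" for i, p in enumerate(passages, start=1)
    )
    return (
        "Answer only from the policy excerpts below and cite them as [n].\n\n"
        f"{numbered}\n\nQuestion: {question}"
    )


def answer(
    question: str,
    search: Callable[[str], list[dict]],
    generate: Callable[[str], str],
) -> dict:
    """Retrieve first, refuse when nothing is retrieved, otherwise generate with citations."""
    passages = search(question)
    if not passages:
        # Without approved context the model is never called.
        return {"answer": REFUSAL, "citations": []}
    text = generate(build_prompt(question, passages))
    return {"answer": text, "citations": [p["source"] for p in passages]}
```

Passing `search` and `generate` in as callables keeps the sketch testable with stubs; in a deployment they would wrap calls to the MCP search endpoint and to Ollama.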

Local Development Notes

  • docker-compose is the validated local command path for this repo
  • local ingest writes to host-exposed Qdrant on port 6333
  • the ingest embedding model and MCP search embedding model must match
  • if you change EMBEDDING_MODEL, delete and recreate the policies collection before re-ingesting
  • HF_TOKEN helps avoid slow or rate-limited Hugging Face downloads during first-time model setup
  • bash scripts/bootstrap_local.sh is the fastest way to warm the services, ingest a sample policy, and wait for /ready
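
For the note about changing EMBEDDING_MODEL, one way to delete and recreate the policies collection is through Qdrant's REST API. The sketch below only builds the two requests. The helper name is hypothetical, the Cosine distance metric is an assumption about how the collection is configured, and the ingest script may create the collection on its own, in which case only the delete is needed.

```python
import json
import urllib.request

QDRANT_URL = "http://127.0.0.1:6333"  # host-exposed Qdrant port from the notes above


def recreate_collection_requests(
    name: str,
    vector_size: int,
    distance: str = "Cosine",  # assumption: match whatever metric your ingest uses
    base_url: str = QDRANT_URL,
) -> list[urllib.request.Request]:
    """Build a DELETE then a PUT for /collections/{name}, to be sent in that order."""
    url = f"{base_url}/collections/{name}"
    delete = urllib.request.Request(url, method="DELETE")
    body = json.dumps({"vectors": {"size": vector_size, "distance": distance}}).encode("utf-8")
    create = urllib.request.Request(
        url, data=body, method="PUT", headers={"Content-Type": "application/json"}
    )
    return [delete, create]
```

To apply it, send each request in order with `urllib.request.urlopen`, for example using `recreate_collection_requests("policies", 384)` for all-MiniLM-L6-v2 or a size of 1024 for BAAI/bge-m3, and then re-run the ingest.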

Future Roadmap

  • Advanced Semantic Search: improve retrieval precision with re-ranking models layered on top of vector search
  • Automated Data Refresh: move from manual ingestion to a scheduled and repeatable pipeline
  • Expanded Policy Coverage: support additional policy formats such as HTML and DOCX alongside PDF
  • Enhanced UI/UX: develop a dedicated frontend for policy exploration and guided question workflows

Open Source And Feedback

This repository is meant to be open and practical.

  • Feel free to fork it for your own agency, internal prototype, or public-sector adaptation
  • Feedback is welcome from State IT, architects, legal teams, security reviewers, and builders working on responsible AI
  • Issues, suggestions, and improvements are all useful, especially around governance, deployment, and in-state operating models

Intended Outcome

This starter is designed for agencies that need a practical path to policy-grounded assistance without sending protected data to external hosted LLM services and without allowing open-ended SQL access.

If this repository is useful, please consider forking it, sharing feedback, and giving it a star on GitHub.
