Development

Prerequisites

Python 3.10, 3.11, or 3.12
Git
(Optional) Docker for container-based runs
(Optional) Ollama for the local LLM backend

Local setup

git clone https://github.com/ranafaraz/InsightRAG.git
cd InsightRAG

# Create and activate a virtual environment
python -m venv .venv
# Windows:
.venv\Scripts\activate
# macOS/Linux:
source .venv/bin/activate

# Install the offline dev stack (the one CI uses; no model downloads, no API keys)
pip install -e ".[dev]"

# Optional extras
pip install -e ".[local]"    # sentence-transformers + chromadb
pip install -e ".[ui]"       # Streamlit

Running the project

# CLI query (ingests docs/ first, then answers the question)
python -m rag.cli ask "What does InsightRAG use to rerank candidates?" --path docs/

# FastAPI service
uvicorn app.main:app --reload
# -> http://localhost:8000/docs

# Streamlit demo
streamlit run ui/streamlit_app.py

# Docker (API on port 8000)
docker compose up --build

Running tests

# Full offline test suite (27 tests, no downloads)
pytest -q

# One file
pytest tests/test_pipeline.py -q

# One test
pytest tests/test_pipeline.py::test_pipeline_refuses_unknown -q

# Lint
ruff check .
ruff check . --fix    # auto-fix import ordering etc.

Running the eval harness

# Regenerate eval_harness/RESULTS.md (offline, reproducible)
python -m eval_harness.harness

# CI quality gate (exits 1 if metrics regress below floors)
python -m eval_harness.gate

# With real models
pip install -e ".[local]"
EMBEDDING_BACKEND=sentence-transformers RERANK_BACKEND=cross-encoder python -m eval_harness.harness
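
As a rough illustration of what a quality gate of this kind does, the sketch below compares a metrics dictionary against per-metric floors and produces a non-zero exit code on regression. The metric names, floor values, and function names are assumptions made for the example; the real logic and thresholds live in eval_harness/gate.py.

```python
import sys

# Hypothetical floors for illustration; the real values live in eval_harness/gate.py.
FLOORS = {"recall_at_k": 0.80, "mrr": 0.70}


def find_regressions(metrics: dict[str, float], floors: dict[str, float]) -> list[str]:
    """Return one message per metric that is missing or below its floor."""
    failures = []
    for name, floor in floors.items():
        value = metrics.get(name)
        if value is None:
            failures.append(f"{name}: missing (floor {floor:.2f})")
        elif value < floor:
            failures.append(f"{name}: {value:.2f} < floor {floor:.2f}")
    return failures


def main(metrics: dict[str, float]) -> int:
    """Print each failure to stderr and return the process exit code (0 or 1)."""
    failures = find_regressions(metrics, FLOORS)
    for line in failures:
        print(f"FAIL {line}", file=sys.stderr)
    return 1 if failures else 0


# Illustrative numbers only, not real harness results.
# A script entry point would pass this value to sys.exit().
exit_code = main({"recall_at_k": 0.90, "mrr": 0.75})
```

Treating a missing metric as a failure is a design choice in this sketch: a gate that silently skips absent metrics could pass after a metric is accidentally dropped from the harness.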

Project structure

rag/
  config.py              Settings (env-driven, pydantic)
  pipeline.py            RAGPipeline orchestrator — read this first
  chunk.py               Document chunking
  retrieve.py            HybridRetriever (BM25 + dense fusion)
  rerank.py              Reranker wrapper
  generate.py            Generator (prompt builder, citation verifier, honest refusal)
  store.py               Vector store factory + in-memory store
  text_utils.py          Shared tokenization, lexical scoring, stopwords
  providers/
    embeddings.py        Embedder factory (hash | sentence-transformers)
    rerank.py            Reranker factory (lexical | cross-encoder)
    llm.py               LLM factory (stub | ollama | openai)

app/
  main.py                FastAPI app (/health, /ingest/text, /ingest/file, /chat)

guardrails/
  injection.py           Prompt-injection detection
  pii.py                 PII redaction (regex + optional Presidio)

ui/
  streamlit_app.py       Streamlit demo

eval_harness/
  harness.py             Metrics computation, writes RESULTS.md
  gate.py                CI quality gate (enforces metric floors)
  metrics.py             recall@k, MRR, faithfulness, RAGAS proxies
  data/                  Bundled benchmark (questions, documents, answers)

tests/                   pytest suite (27 tests, all offline)
docs/                    architecture.md, DECISIONS.md, demo GIF

How to add a new backend

Every component uses a small Protocol defined in rag/providers/. Adding a backend takes three steps:

1. Implement the protocol. For example, to add a new embedder:

# rag/providers/embeddings.py
class MyEmbedder:
    def embed(self, texts: list[str]) -> list[list[float]]:
        ...  # your implementation

2. Register it in the factory function. Add an elif branch in get_embedder():

def get_embedder(settings: Settings) -> Embedder:
    if settings.embedding_backend == "hash":
        return HashEmbedder()
    elif settings.embedding_backend == "my-backend":
        return MyEmbedder()
    elif settings.embedding_backend == "sentence-transformers":
        ...

3. Document the new env var value (update this wiki page and .env.example).

The offline default must continue to pass all tests and the eval gate. If your new backend requires extra dependencies, add them as a named optional extra in pyproject.toml (e.g., [my-backend]) and import them lazily so the base install is not affected.
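
The lazy-import advice can be sketched as follows. This is a minimal illustration, not the project's code: the Embedder protocol shape is inferred from the embed() signature shown in step 1, MyEmbedder and its dim parameter are hypothetical, and hashlib stands in for a real optional dependency so the example runs offline.

```python
import hashlib
from typing import Protocol, runtime_checkable


@runtime_checkable
class Embedder(Protocol):
    """Assumed shape of the embedder protocol: a single embed() method."""

    def embed(self, texts: list[str]) -> list[list[float]]: ...


class MyEmbedder:
    """Hypothetical backend that defers loading its dependency until first use."""

    def __init__(self, dim: int = 8) -> None:
        self.dim = dim  # must be <= 32, the length of a SHA-256 digest
        self._hasher = None  # populated on the first embed() call

    def _load(self):
        if self._hasher is None:
            # A real backend would import its optional dependency here
            # (inside the method, not at module top level), so the base
            # install never needs it. hashlib is only a stand-in.
            self._hasher = hashlib.sha256
        return self._hasher

    def embed(self, texts: list[str]) -> list[list[float]]:
        hasher = self._load()
        vectors = []
        for text in texts:
            digest = hasher(text.encode("utf-8")).digest()
            # Scale the first `dim` bytes into [0, 1] to form a toy vector.
            vectors.append([byte / 255.0 for byte in digest[: self.dim]])
        return vectors


embedder = MyEmbedder()
vectors = embedder.embed(["hello", "world"])
```

Because the class only needs to match the protocol structurally, it does not have to inherit from Embedder; the factory can return it as long as embed() has the expected signature.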

CI

GitHub Actions runs on every push and pull request:

Lint — ruff check . (fails fast on lint errors)
Tests — pytest -q on Python 3.10, 3.11, 3.12 with offline backends
Eval gate — python -m eval_harness.gate with EMBEDDING_BACKEND=hash RERANK_BACKEND=lexical LLM_BACKEND=stub VECTOR_STORE=memory

All three steps must pass. The eval gate is the last guard that catches quality regressions before merge.
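
To reproduce the three CI steps locally before pushing, something like the following sketch could be used. The commands and environment values are the ones listed above; the script itself, the run_steps name, and the choice to apply the offline environment to every step are assumptions for illustration, not part of the repository.

```python
import os
import subprocess
import sys

# Offline backend settings taken from the eval gate step above.
OFFLINE_ENV = {
    "EMBEDDING_BACKEND": "hash",
    "RERANK_BACKEND": "lexical",
    "LLM_BACKEND": "stub",
    "VECTOR_STORE": "memory",
}

# (label, command) pairs in the order CI runs them.
STEPS = [
    ("lint", ["ruff", "check", "."]),
    ("tests", [sys.executable, "-m", "pytest", "-q"]),
    ("eval gate", [sys.executable, "-m", "eval_harness.gate"]),
]


def run_steps(steps=STEPS, runner=subprocess.run) -> int:
    """Run each step in order; stop at the first failure and return its exit code."""
    env = {**os.environ, **OFFLINE_ENV}
    for name, cmd in steps:
        print(f"== {name}: {' '.join(cmd)}")
        result = runner(cmd, env=env)
        if result.returncode != 0:
            return result.returncode
    return 0
```

Calling sys.exit(run_steps()) from the repository root, inside the activated virtual environment, would mirror the pass/fail behaviour of the pipeline on a single Python version.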

Windows notes

The venv interpreter is at .venv\Scripts\python.exe
The Windows console often defaults to cp1252, so do not print() non-ASCII characters from scripts; write UTF-8 to files instead, or set PYTHONIOENCODING=utf-8
The eval package is named eval_harness (not eval, which would shadow the Python builtin)
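
The encoding advice above can be sketched like this. The helper names and the report file name are hypothetical; the only library calls used are Path.write_text with an explicit encoding and sys.stdout.reconfigure, which is available on standard text streams since Python 3.7.

```python
import sys
import tempfile
from pathlib import Path


def write_report(path: Path, text: str) -> None:
    """Write text as UTF-8 regardless of the console's code page."""
    path.write_text(text, encoding="utf-8")


def make_stdout_utf8() -> None:
    """Opt-in alternative: switch stdout to UTF-8 for the current process."""
    # Guarded because stdout may have been replaced by an object
    # without reconfigure() (for example under some test runners).
    if hasattr(sys.stdout, "reconfigure"):
        sys.stdout.reconfigure(encoding="utf-8")


# Hypothetical report file in a fresh temporary directory.
report = Path(tempfile.mkdtemp()) / "report.txt"
# "\u2265" is the greater-than-or-equal sign, which cp1252 cannot encode.
write_report(report, "recall@k \u2265 0.80")
```

Passing encoding="utf-8" explicitly matters on Windows, where the default file encoding otherwise follows the system locale.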
