Development
Rana Faraz edited this page Jun 23, 2026 · 1 revision
- Python 3.10, 3.11, or 3.12
- Git
- (Optional) Docker for container-based runs
- (Optional) Ollama for the local LLM backend
git clone https://github.com/ranafaraz/InsightRAG.git
cd InsightRAG
# Create and activate a virtual environment
python -m venv .venv
# Windows:
.venv\Scripts\activate
# macOS/Linux:
source .venv/bin/activate
# Install offline/dev stack (what CI uses — no downloads, no API keys)
pip install -e ".[dev]"
# Optional extras
pip install -e ".[local]" # sentence-transformers + chromadb
pip install -e ".[ui]" # Streamlit

# CLI query (ingests docs/ first, answers the question)
python -m rag.cli ask "What does InsightRAG use to rerank candidates?" --path docs/
# FastAPI service
uvicorn app.main:app --reload
# -> http://localhost:8000/docs
# Streamlit demo
streamlit run ui/streamlit_app.py
# Docker (API on port 8000)
docker compose up --build

# Full offline test suite (27 tests, no downloads)
pytest -q
# One file
pytest tests/test_pipeline.py -q
# One test
pytest tests/test_pipeline.py::test_pipeline_refuses_unknown -q
# Lint
ruff check .
ruff check . --fix # auto-fix import ordering etc.

# Regenerate eval_harness/RESULTS.md (offline, reproducible)
python -m eval_harness.harness
# CI quality gate (exits 1 if metrics regress below floors)
python -m eval_harness.gate
# With real models
pip install -e ".[local]"
EMBEDDING_BACKEND=sentence-transformers RERANK_BACKEND=cross-encoder python -m eval_harness.harness

rag/
    config.py            Settings (env-driven, pydantic)
    pipeline.py          RAGPipeline orchestrator (read this first)
    chunk.py             Document chunking
    retrieve.py          HybridRetriever (BM25 + dense fusion)
    rerank.py            Reranker wrapper
    generate.py          Generator (prompt builder, citation verifier, honest refusal)
    store.py             Vector store factory + in-memory store
    text_utils.py        Shared tokenization, lexical scoring, stopwords
    providers/
        embeddings.py    Embedder factory (hash | sentence-transformers)
        rerank.py        Reranker factory (lexical | cross-encoder)
        llm.py           LLM factory (stub | ollama | openai)
app/
    main.py              FastAPI app (/health, /ingest/text, /ingest/file, /chat)
guardrails/
    injection.py         Prompt-injection detection
    pii.py               PII redaction (regex + optional Presidio)
ui/
    streamlit_app.py     Streamlit demo
eval_harness/
    harness.py           Metrics computation, writes RESULTS.md
    gate.py              CI quality gate (enforces metric floors)
    metrics.py           recall@k, MRR, faithfulness, RAGAS proxies
    data/                Bundled benchmark (questions, documents, answers)
tests/                   pytest suite (27 tests, all offline)
docs/                    architecture.md, DECISIONS.md, demo GIF
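
The retrieval metrics listed for eval_harness/metrics.py (recall@k and MRR) have standard textbook definitions. The sketch below shows those definitions only; the function names and signatures are assumptions for illustration and may not match what metrics.py actually exposes.

```python
from typing import Sequence


def recall_at_k(retrieved: Sequence[str], relevant: set[str], k: int) -> float:
    """Fraction of the relevant IDs that appear among the top-k retrieved IDs."""
    if not relevant:
        return 0.0
    top_k = set(retrieved[:k])
    return len(top_k & relevant) / len(relevant)


def reciprocal_rank(retrieved: Sequence[str], relevant: set[str]) -> float:
    """1 / rank of the first relevant ID, or 0.0 if none was retrieved."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: Sequence[tuple[Sequence[str], set[str]]]) -> float:
    """Average reciprocal rank over (retrieved, relevant) pairs, one per question."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(ids, rel) for ids, rel in runs) / len(runs)
```

Both metrics only need ranked document IDs and a set of known-relevant IDs, which is why they can run fully offline against a bundled benchmark.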
Every component uses a small Protocol defined in rag/providers/. Adding a backend takes three steps:
1. Implement the protocol. For example, to add a new embedder:

    # rag/providers/embeddings.py
    class MyEmbedder:
        def embed(self, texts: list[str]) -> list[list[float]]:
            ...  # your implementation

2. Register in the factory function. Add an elif branch in get_embedder():

    def get_embedder(settings: Settings) -> Embedder:
        if settings.embedding_backend == "hash":
            return HashEmbedder()
        elif settings.embedding_backend == "my-backend":
            return MyEmbedder()
        elif settings.embedding_backend == "sentence-transformers":
            ...

3. Add the env var value to docs (update this wiki page and .env.example).
The offline default must continue to pass all tests and the eval gate. If your new backend requires extra dependencies, add them as a named optional in pyproject.toml (e.g., [my-backend]) and import them lazily so the base install is not affected.
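
A minimal, self-contained sketch of the lazy-import pattern described above. Everything here is an illustration under stated assumptions: `my_backend_sdk` is a hypothetical package, the `HashEmbedder` body is a toy stand-in for the real offline default, and `Settings` is a plain dataclass rather than the project's pydantic class.

```python
import hashlib
from dataclasses import dataclass
from typing import Protocol


class Embedder(Protocol):
    def embed(self, texts: list[str]) -> list[list[float]]: ...


class HashEmbedder:
    """Toy stand-in for the offline default: hashed bag-of-words counts."""

    def __init__(self, dim: int = 8) -> None:
        self.dim = dim

    def embed(self, texts: list[str]) -> list[list[float]]:
        vectors = []
        for text in texts:
            vec = [0.0] * self.dim
            for token in text.lower().split():
                digest = hashlib.sha256(token.encode("utf-8")).digest()
                vec[digest[0] % self.dim] += 1.0  # bucket chosen by first hash byte
            vectors.append(vec)
        return vectors


class MyEmbedder:
    """Backend whose heavy dependency is imported only when it is constructed."""

    def __init__(self) -> None:
        try:
            import my_backend_sdk  # hypothetical package from a [my-backend] extra
        except ImportError as exc:
            raise RuntimeError(
                'Missing optional dependency; try: pip install -e ".[my-backend]"'
            ) from exc
        self._sdk = my_backend_sdk

    def embed(self, texts: list[str]) -> list[list[float]]:
        return self._sdk.embed(texts)  # hypothetical SDK call


@dataclass
class Settings:
    embedding_backend: str = "hash"  # stand-in for the env-driven setting


def get_embedder(settings: Settings) -> Embedder:
    if settings.embedding_backend == "hash":
        return HashEmbedder()
    elif settings.embedding_backend == "my-backend":
        return MyEmbedder()
    raise ValueError(f"Unknown embedding backend: {settings.embedding_backend!r}")
```

Because the import sits inside `MyEmbedder.__init__`, importing the module and selecting the `hash` backend never touches the optional package, so the base install and the offline tests stay unaffected.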
GitHub Actions runs on every push and pull request:
- Lint: `ruff check .` (fails fast on lint errors)
- Tests: `pytest -q` on Python 3.10, 3.11, 3.12 with offline backends
- Eval gate: `python -m eval_harness.gate` with `EMBEDDING_BACKEND=hash RERANK_BACKEND=lexical LLM_BACKEND=stub VECTOR_STORE=memory`
All three steps must pass. The eval gate is the last guard that catches quality regressions before merge.
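
To make the gate's behavior concrete, here is a rough sketch of a floor check that returns exit code 1 on regression. The metric names and floor values are invented for illustration; this page does not show the project's actual floors or the internals of eval_harness/gate.py.

```python
import sys

# Hypothetical floors, chosen only for this example.
FLOORS = {"recall@5": 0.80, "mrr": 0.70, "faithfulness": 0.90}


def find_regressions(metrics: dict[str, float], floors: dict[str, float]) -> list[str]:
    """Return one message per metric that is missing or below its floor."""
    failures = []
    for name, floor in floors.items():
        value = metrics.get(name)
        if value is None:
            failures.append(f"{name}: missing (floor {floor:.2f})")
        elif value < floor:
            failures.append(f"{name}: {value:.2f} < floor {floor:.2f}")
    return failures


def main(metrics: dict[str, float]) -> int:
    """Print failures to stderr and return a process exit code (0 = pass, 1 = fail)."""
    failures = find_regressions(metrics, FLOORS)
    for line in failures:
        print(f"FAIL {line}", file=sys.stderr)
    return 1 if failures else 0
```

A script entry point would end with `sys.exit(main(computed_metrics))`, so that a non-zero code fails the CI step.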
- The venv interpreter is at `.venv\Scripts\python.exe`
- Windows console is cp1252: do not `print()` non-ASCII characters from scripts; write UTF-8 to files instead, or set `PYTHONIOENCODING=utf-8`
- The `eval` package is named `eval_harness` (not `eval`, which would shadow the Python builtin)
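
A small sketch of the two workarounds for the cp1252 console. The helper names are hypothetical and not part of the repository; the calls themselves (`Path.write_text` with an explicit encoding, and `reconfigure` on a text stream) are standard library.

```python
import sys
from pathlib import Path


def write_report(path: Path, text: str) -> None:
    # An explicit encoding avoids falling back to the locale default
    # (often cp1252 on Windows).
    path.write_text(text, encoding="utf-8")


def make_stdout_utf8() -> None:
    # reconfigure() is available on text streams since Python 3.7; the guard
    # covers cases where sys.stdout was replaced by an object without it.
    reconfigure = getattr(sys.stdout, "reconfigure", None)
    if reconfigure is not None:
        reconfigure(encoding="utf-8")
```

Writing to a file is the more predictable option; calling `make_stdout_utf8()` at the top of a script has roughly the same effect as setting `PYTHONIOENCODING=utf-8` for that process.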