rag-params-finder

Find your optimal RAG configuration — before you build your RAG application.

RAG parameter sweep experimentation tool — systematically evaluate embedding models, chunking strategies, and retrieval methods using MongoDB Atlas Vector Search. Supports Voyage AI (hosted) and local sentence-transformers (no API key needed).

Most RAG projects start with a guess: pick an embedding model, a chunking method, and a retrieval method (or a re-ranker), realise the combination is wrong, then refactor. That loop is slow and expensive.

rag-params-finder inverts it.

Give it your data and your questions. It runs every combination — embedding model × chunking method × retrieval method — stores the retrieval scores and shows you exactly which configuration performs best. Before you write a single line of your RAG application.

Why this matters

What you avoid	What you get instead
No LLM calls	Embedding only — 10–100× cheaper
No eval framework setup	One YAML config, one CLI command
No deployed RAG app needed	Just your data, your questions, your credentials
No guessing	Actual retrieval scores across every config, side by side
No throwaway experiments	Results persist — compare runs across sessions

What it sweeps

Embedding models: 13 Voyage models (voyage-4 series, domain, context, voyage-3 legacy) — see server/core/model_registry.py
Chunking methods: Fixed · Recursive · Token · Sentence · Semantic
Retrieval methods: Dense · Sparse · Hybrid
Questions: Persona-organised; user-provided, or generated as part of the golden-master generation process
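
To make the combinatorics concrete, here is a minimal sketch of how a set of sweep dimensions expands into individual runs. It is illustrative only: the function name `expand_sweep`, the dictionary keys, and the placeholder model `example-model-b` are assumptions for this example and are not the tool's actual YAML schema or internals.

```python
from itertools import product


def expand_sweep(dimensions: dict[str, list[str]]) -> list[dict[str, str]]:
    """Return one config dict per combination of the swept dimensions."""
    names = list(dimensions)
    value_lists = [dimensions[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


# Illustrative values only; "example-model-b" is a placeholder, and the
# key names are hypothetical rather than the tool's real config keys.
sweep = {
    "embedding_model": ["all-MiniLM-L6-v2", "example-model-b"],
    "chunking_method": ["fixed", "recursive", "token", "sentence", "semantic"],
    "retrieval_method": ["dense", "sparse", "hybrid"],
}

runs = expand_sweep(sweep)
print(len(runs))  # 2 models x 5 chunkers x 3 retrievers = 30
print(runs[0])
```

Adding one more value to any dimension multiplies the run count, so it helps to estimate the product before submitting a large sweep.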

One YAML. N experiments. Evidence-based decision. Ship the right config first.

Who is this for?

Pick the row that matches you — each links to a first step, not the whole README.

Persona	Start here	What you will do
New user — cloud accounts	Cloud Account Setup	Atlas + optional Voyage, then QUICKSTART
New user — first sweep	QUICKSTART	Install, run server + CLI, open dashboard
Operator — config & CLI	Configuration Reference	YAML sweeps, env vars, `rag-params-finder` commands
Operator — dashboard	Dashboard Guide	Live phases, Search Explorer, experiment controls
Operator — fixing errors	Troubleshooting	Indexes, Voyage limits, Docker, storage quota
Contributor — system design	Architecture	Modules, data flow, ADRs
Contributor — dev setup	Development Guide	Quality gates, slices, Docker, hooks
Agent / slice worker	AGENTS.md · CLAUDE.md	PROGRESS → current slice spec

All docs by topic: docs/README.md

📸 Screenshots

Screen	Description
Experiments list	All submitted sweeps with status badges and run counts
Experiment detail	Metric cards, live phase indicator dots, runs table
Search Explorer	Best-parameters card, ranked configs with score bars

🚀 Quick Start

See QUICKSTART.md for install, .env, server, dashboard, and first sweep commands (including optional Docker).

🗺️ Choose Your Path

I want to…	Start here
Set up MongoDB Atlas or Voyage AI accounts	Cloud Account Setup
Run my first experiment	Getting Started
Understand all config options	Configuration Reference
Learn all CLI commands	CLI Reference
Understand the dashboard	Dashboard Guide
Fix an error	Troubleshooting
Understand the system design	Architecture
Add a new model, chunker, or endpoint	Extending the System
Set up a development environment	Development Guide
Why these design choices?	ADR-001 · ADR-002 · ADR-003

⚡ Key Features

🎯 NEW in v0.11.0: Weighted Averaging & Tiebreaker Explanations

Weighted averaging (query-level fairness): Each query contributes equally, preventing queries with many results from dominating the average — configurable via TIEBREAKER_METRIC env var (docs)
Tiebreaker explanation UI: When multiple configs achieve 100% max score, the dashboard shows amber alerts, explanation panels, visual badges (⭐ "Best by tiebreaker", 🔀 "Tied"), and contextual annotations explaining why each config is ranked where it is
Detailed Results ↔ Hyperparameters mapping: Chunk size/overlap badges, query text display, and explanatory headers help users map individual results back to aggregated configs
Collapsible sweep dimensions panel: Shows unique values for all swept parameters + Cartesian product calculation (dashboard guide)
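
The difference between pooling every result and averaging per query first can be shown with a small sketch. This is an assumed reading of "query-level fairness" based on the description above, not the project's actual implementation; the function names and the sample scores are hypothetical.

```python
from statistics import mean


def pooled_average(scores_by_query: dict[str, list[float]]) -> float:
    """Average over every result; queries with more results weigh more."""
    all_scores = [s for scores in scores_by_query.values() for s in scores]
    return mean(all_scores)


def query_level_average(scores_by_query: dict[str, list[float]]) -> float:
    """Average each query first, then average the per-query means,
    so every query contributes equally regardless of result count."""
    per_query = [mean(scores) for scores in scores_by_query.values() if scores]
    return mean(per_query)


# Hypothetical scores: q1 returned four results, q2 returned one.
scores = {
    "q1": [0.5, 0.5, 0.5, 0.5],
    "q2": [1.0],
}

print(pooled_average(scores))       # q1 dominates because it has more results
print(query_level_average(scores))  # q1 and q2 count equally
```

With these sample values the pooled average is pulled toward q1, while the query-level average sits halfway between the two queries.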

Core Features

5 chunking methods: Fixed, Recursive, Token, Sentence, Semantic
3 retrieval methods: Dense (vector search), Sparse (BM25), Hybrid (Reciprocal Rank Fusion)
Voyage AI models: all embedding models registered in model_registry.py (voyage-4, voyage-3, domain, and context families), plus the rerankers rerank-2.5-lite and rerank-2.5 and the legacy rerank APIs
Local models (no API key): all-MiniLM-L6-v2 + cross-encoder/ms-marco-MiniLM-L-6-v2
Multi-format data loading: PDF, TXT, Markdown, CSV — files or directories
Cartesian sweep: one YAML config → N models × M methods × P sizes × Q overlaps runs
Live phase tracking: QUEUED → PARSING → CHUNKING → EMBEDDING → STORING → QUERYING → RERANKING → COMPLETE
Experiment management: Pause/resume long sweeps, cancel running experiments, delete with cascade cleanup, boot orphan reconciliation
Search index preflight: Validates required Atlas Search indexes and cluster quota before sweeps start; rejects with HTTP 422 when indexes are missing or quota exhausted
Atlas index CLI: `indexes list` and `indexes reset` for M0 quota troubleshooting
Vector DB stats: Cluster and per-experiment chunk/storage estimates; optional Atlas quota bar with tier, provider, and region when Admin API credentials are configured
Progress feedback: Byte-level network loading, circular progress with elapsed time and ETA, background polling with "Syncing..." badges
Scoped logging: Server and dashboard use [rag-params-finder] [Scope] operation — details format; set LOG_LEVEL=DEBUG for verbose server output
Pagination: All list views paginated (10 items per page for experiments/runs, 5 for configs); collapsible experiment rows
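
For readers unfamiliar with Reciprocal Rank Fusion, the technique named for Hybrid retrieval in the list above, the following is a minimal sketch of the standard formula: each document scores the sum of 1 / (k + rank) over the rankings it appears in. The function name is hypothetical, and k = 60 is the conventional default from the RRF literature; the constant this project uses is not stated in this README.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k = 60 is an assumed, conventional default.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a dense (vector) and a sparse (BM25) search.
dense_hits = ["a", "b", "c"]
sparse_hits = ["c", "a", "d"]

print(reciprocal_rank_fusion([dense_hits, sparse_hits]))
```

Because only ranks are used, the dense and sparse scores never need to be normalised onto a common scale before fusing.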

🧱 Built With

Backend: FastAPI · Python 3.12 · Pydantic · PyMongo · LangChain text splitters · pypdf · Typer · Rich · sentence-transformers · NLTK · tiktoken

Frontend: React 19 · TypeScript 5.8 · Vite 6 · Tailwind CSS

AI/ML: Voyage AI · sentence-transformers · MongoDB Atlas Vector Search

Dev tools: uv · ruff · mypy · pytest · GitHub Actions

📦 Releases & Versioning

This project follows Semantic Versioning:

MAJOR (x.0.0) — Breaking changes
MINOR (0.x.0) — New features, completed slices (backward compatible)
PATCH (0.0.x) — Bug fixes, polish, enhancements

Current version: v0.11.0 (CHANGELOG.md)

Release history: 15 releases on GitHub documenting development from v0.0.1 (initial skeleton) through v0.11.0 (weighted averaging)

For contributors: See Release Process for how to create new releases

🤝 Contributing

Contributions welcome — please open an issue first to discuss the change.

Running experiments does not require any extra tooling — the user guide path (Atlas, CLI, optional dashboard) is enough.

Contributors should follow the Development Guide, which covers setup (`bash scripts/install-git-hooks.sh`, which runs checks on commit and push), quality gates (`./scripts/quality-gates.sh`), the slice workflow, and release cadence: release when a slice or feature is user-visible, and see the Unreleased section of the CHANGELOG during development.

Optional AI-assisted development (Cursor / Claude Code with the code-review-graph knowledge graph) is documented in the development guide — it helps navigate this repo faster; it is not part of the RAG sweep runtime.

Priority areas: test suite with mock MongoDB fixtures, Search Explorer dashboard enhancements, SSE live updates.

Agent entry points: AGENTS.md · CLAUDE.md

📄 License

MIT — see LICENSE

🙏 Credits

Inspired by pre-rag-explorer-dashboard.
