Skip to content

neomatrix369/rag-params-finder

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

136 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

rag-params-finder

Python FastAPI Pydantic React TypeScript MongoDB Vite Tailwind CSS

CI Release License: MIT Last Commit Stars

Find your optimal RAG configuration — before you build your RAG application.

RAG parameter sweep experimentation tool — systematically evaluate embedding models, chunking strategies, and retrieval methods using MongoDB Atlas Vector Search. Supports Voyage AI (hosted) and local sentence-transformers (no API key needed).

Most RAG projects start with a guess: pick an embedding model, pick a chunking method, a retrieval method (or a re-ranker), realise it's wrong, refactor. That loop is slow and expensive.

rag-params-finder inverts it.

Give it your data and your questions. It runs every combination — embedding model × chunking method × retrieval method — stores the retrieval scores and shows you exactly which configuration performs best. Before you write a single line of your RAG application.

Jump to: Quickstart | Who is this for? | Screenshots | Key Features | Documentation | Contributing

Why this matters

What you avoid What you get instead
No LLM calls Embedding only — 10–100× cheaper
No eval framework setup One YAML config, one CLI command
No deployed RAG app needed Just your data, your questions, your credentials
No guessing Actual retrieval scores across every config, side by side
No throwaway experiments Results persist — compare runs across sessions

What it sweeps

  • Embedding models: 13 Voyage models (voyage-4 series, domain, context, voyage-3 legacy) — see server/core/model_registry.py
  • Chunking methods: Fixed · Recursive · Token · Sentence · Semantic
  • Retrieval methods: Dense · Sparse · Hybrid
  • Questions: Persona-organised — user provided or generated as part of golden master generation process

One YAML. N experiments. Evidence-based decision. Ship the right config first.

Who is this for?

Pick the row that matches you — each links to a first step, not the whole README.

Persona Start here What you will do
New user — cloud accounts Cloud Account Setup Atlas + optional Voyage, then QUICKSTART
New user — first sweep QUICKSTART Install, run server + CLI, open dashboard
Operator — config & CLI Configuration Reference YAML sweeps, env vars, rag-params-finder commands
Operator — dashboard Dashboard Guide Live phases, Search Explorer, experiment controls
Operator — fixing errors Troubleshooting Indexes, Voyage limits, Docker, storage quota
Contributor — system design Architecture Modules, data flow, ADRs
Contributor — dev setup Development Guide Quality gates, slices, Docker, hooks
Agent / slice worker AGENTS.md · CLAUDE.md PROGRESS → current slice spec

All docs by topic: docs/README.md


📸 Screenshots

Screen Description
Experiments list Experiments list — all submitted sweeps with status badges and run counts
Experiment detail Experiment detail — metric cards, live phase indicator dots, runs table
Search Explorer Search Explorer — best-parameters card, ranked configs with score bars

🚀 Quick Start

See QUICKSTART.md for install, .env, server, dashboard, and first sweep commands (including optional Docker).


🗺️ Choose Your Path

I want to… Start here
Set up MongoDB Atlas or Voyage AI accounts Cloud Account Setup
Run my first experiment Getting Started
Understand all config options Configuration Reference
Learn all CLI commands CLI Reference
Understand the dashboard Dashboard Guide
Fix an error Troubleshooting
Understand the system design Architecture
Add a new model, chunker, or endpoint Extending the System
Set up a development environment Development Guide
Why these design choices? ADR-001 · ADR-002 · ADR-003

⚡ Key Features

🎯 NEW in v0.11.0: Weighted Averaging & Tiebreaker Explanations

  • Weighted averaging (query-level fairness): Each query contributes equally, preventing queries with many results from dominating the average — configurable via TIEBREAKER_METRIC env var (docs)
  • Tiebreaker explanation UI: When multiple configs achieve 100% max score, the dashboard shows amber alerts, explanation panels, visual badges (⭐ "Best by tiebreaker", 🔀 "Tied"), and contextual annotations explaining WHY each config is ranked
  • Detailed Results ↔ Hyperparameters mapping: Chunk size/overlap badges, query text display, and explanatory headers help users map individual results back to aggregated configs
  • Collapsible sweep dimensions panel: Shows unique values for all swept parameters + Cartesian product calculation (dashboard guide)

Core Features

  • 5 chunking methods: Fixed, Recursive, Token, Sentence, Semantic
  • 3 retrieval methods: Dense (vector search), Sparse (BM25), Hybrid (Reciprocal Rank Fusion)
  • Voyage AI models: all registered embeddings in model_registry.py (voyage-4/3/domain/context) + rerankers rerank-2.5-lite, rerank-2.5, and legacy rerank APIs
  • Local models (no API key): all-MiniLM-L6-v2 + cross-encoder/ms-marco-MiniLM-L-6-v2
  • Multi-format data loading: PDF, TXT, Markdown, CSV — files or directories
  • Cartesian sweep: one YAML config → N models × M methods × P sizes × Q overlaps runs
  • Live phase tracking: QUEUED → PARSING → CHUNKING → EMBEDDING → STORING → QUERYING → RERANKING → COMPLETE
  • Experiment management: Pause/resume long sweeps, cancel running experiments, delete with cascade cleanup, boot orphan reconciliation
  • Search index preflight: Validates required Atlas Search indexes and cluster quota before sweeps start; rejects with HTTP 422 when indexes are missing or quota exhausted
  • Atlas index CLI: indexes list and indexes reset for M0 quota troubleshooting
  • Vector DB stats: Cluster and per-experiment chunk/storage estimates; optional Atlas quota bar with tier, provider, and region when Admin API credentials are configured
  • Progress feedback: Byte-level network loading, circular progress with elapsed time and ETA, background polling with "Syncing..." badges
  • Scoped logging: Server and dashboard use [rag-params-finder] [Scope] operation — details format; set LOG_LEVEL=DEBUG for verbose server output
  • Pagination: All list views paginated (10 items per page for experiments/runs, 5 for configs); collapsible experiment rows

🧱 Built With

Backend: FastAPI · Python 3.12 · Pydantic · PyMongo · LangChain text splitters · pypdf · Typer · Rich · sentence-transformers · NLTK · tiktoken

Frontend: React 19 · TypeScript 5.8 · Vite 6 · Tailwind CSS

AI/ML: Voyage AI · sentence-transformers · MongoDB Atlas Vector Search

Dev tools: uv · ruff · mypy · pytest · GitHub Actions


📦 Releases & Versioning

This project follows Semantic Versioning:

  • MAJOR (x.0.0) — Breaking changes
  • MINOR (0.x.0) — New features, completed slices (backward compatible)
  • PATCH (0.0.x) — Bug fixes, polish, enhancements

Current version: v0.11.0 (CHANGELOG.md)

Release history: 15 releases on GitHub documenting development from v0.0.1 (initial skeleton) through v0.11.0 (weighted averaging)

For contributors: See Release Process for how to create new releases


🤝 Contributing

Contributions welcome — please open an issue first to discuss the change.

Running experiments does not require any extra tooling — the user guide path (Atlas, CLI, optional dashboard) is enough.

Contributors use the Development Guide for setup (bash scripts/install-git-hooks.sh — checks on commit and push), quality gates (./scripts/quality-gates.sh), the slice workflow, and release cadence (release when a slice or feature is user-visible; see CHANGELOG Unreleased during development).

Optional AI-assisted development (Cursor / Claude Code with the code-review-graph knowledge graph) is documented in the development guide — it helps navigate this repo faster; it is not part of the RAG sweep runtime.

Priority areas: test suite with mock MongoDB fixtures, Search Explorer dashboard enhancements, SSE live updates.

Agent entry points: AGENTS.md · CLAUDE.md


📄 License

MIT — see LICENSE


🙏 Credits

Inspired by pre-rag-explorer-dashboard.

About

RAG parameter sweep experimentation tool — systematically evaluate vector databases, embedding models, chunking strategies, and retrieval methods using any Vector database. Supports both Voyage AI (hosted) and local sentence-transformers models.

Topics

Resources

License

Stars

Watchers

Forks

Packages

 
 
 

Contributors