A compact multimodal RAG playground focused on practical retrieval quality.
mm-rag-lab is a lightweight research repo for testing hybrid retrieval ideas quickly:
- lexical + semantic fusion
- small, interpretable ranker
- image-aware records through text captions/tags
The project is intentionally minimal so it can be used in coursework, ablations, and demos.
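The fusion idea above can be sketched in a few lines. This is a minimal, illustrative reciprocal-rank fusion (RRF) implementation; the function name `rrf` and the doc ids are hypothetical, not the repo's actual API.

```python
# Reciprocal-rank fusion: each ranked list contributes 1 / (k + rank)
# to a document's fused score, so documents that rank well in several
# lists rise to the top without any score calibration between systems.

def rrf(rankings, k=60):
    """rankings: list of ranked doc-id lists; returns (doc_id, score) pairs."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda kv: -kv[1])

# Hypothetical lexical and semantic result lists for one query:
lexical = ["doc-3", "doc-1", "doc-7"]
semantic = ["doc-1", "doc-7", "doc-3"]
fused = rrf([lexical, semantic])  # doc-1 wins: rank 2 + rank 1
```

The constant `k` (60 is the value from the original RRF paper) damps the influence of top ranks so a single first-place vote cannot dominate.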
- JSONL corpus format for text/image metadata
- Hashing-based semantic vectors (dependency-free)
- BM25-like lexical scoring
- Reciprocal-rank fusion for stable ranking
- CLI for indexing and querying
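The dependency-free semantic vectors in the feature list can be approximated with the hashing trick. The sketch below is an assumption about the approach, not the repo's implementation; `hash_embed` and `cosine` are illustrative names.

```python
import hashlib
import math

def hash_embed(text, dim=256):
    """Hashing-trick embedding: each token is hashed to a bucket with a
    signed weight, giving a fixed-size vector with no learned model."""
    vec = [0.0] * dim
    for token in text.lower().split():
        h = int(hashlib.md5(token.encode()).hexdigest(), 16)
        sign = 1.0 if (h >> 1) & 1 else -1.0
        vec[h % dim] += sign
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a, b):
    # Vectors are already L2-normalized, so the dot product is the cosine.
    return sum(x * y for x, y in zip(a, b))

q = hash_embed("retrieval fusion methods")
d = hash_embed("compare retrieval fusion approaches")
```

Hash collisions add noise, but with short documents and a few hundred buckets the vectors are good enough for interpretable baselines, which is the point of the repo.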
```sh
python -m venv .venv
source .venv/bin/activate
pip install -e .
```

Index a corpus:

```sh
python -m mm_rag_lab.cli index \
    --input examples/corpus.jsonl \
    --output examples/index.json
```
Query the index:

```sh
python -m mm_rag_lab.cli query \
    --index examples/index.json \
    --text "compare retrieval fusion methods"
```

Each line in corpus.jsonl is a standalone JSON object:

```json
{"id":"doc-1","text":"...","modality":"text","tags":["rag","benchmark"]}
{"id":"img-1","text":"chart of encoder-decoder pipeline","modality":"image","tags":["diagram","vlm"]}
```

Planned next steps:

- Add CLIP/BGE embedding backends
- Add reranker plugin API
- Add notebook with retrieval error analysis
The design follows current open-source practice in hybrid retrieval and standardized VLM evaluation pipelines, and it deliberately prioritizes transparent, inspectable baselines over heavy dependencies.