A compact multimodal RAG playground focused on practical retrieval quality.
mm-rag-lab is a lightweight research repo for testing hybrid retrieval ideas quickly:
- lexical + semantic fusion
- small, interpretable ranker
- image-aware records through text captions/tags
The project is intentionally minimal so it can be used in coursework, ablations, and demos.
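The fusion idea above can be sketched in a few lines. This is a minimal, illustrative reciprocal-rank fusion (RRF) implementation; the function name `rrf` and the doc ids are hypothetical, not the repo's actual API.

```python
# Reciprocal-rank fusion: each ranked list contributes 1 / (k + rank)
# to a document's fused score, so documents that rank well in several
# lists rise to the top without any score calibration between systems.

def rrf(rankings, k=60):
    """rankings: list of ranked doc-id lists; returns (doc_id, score) pairs."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda kv: -kv[1])

# Hypothetical lexical and semantic result lists for one query:
lexical = ["doc-3", "doc-1", "doc-7"]
semantic = ["doc-1", "doc-7", "doc-3"]
fused = rrf([lexical, semantic])  # doc-1 wins: rank 2 + rank 1
```

The constant `k` (60 is the value from the original RRF paper) damps the influence of top ranks so a single first-place vote cannot dominate.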
- JSONL corpus format for text/image metadata
- Hashing-based semantic vectors (dependency-free)
- BM25-like lexical scoring
- Reciprocal-rank fusion for stable ranking
- CLI for indexing and querying
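The dependency-free semantic vectors in the feature list can be approximated with the hashing trick. The sketch below is an assumption about the approach, not the repo's implementation; `hash_embed` and `cosine` are illustrative names.

```python
import hashlib
import math

def hash_embed(text, dim=256):
    """Hashing-trick embedding: each token is hashed to a bucket with a
    signed weight, giving a fixed-size vector with no learned model."""
    vec = [0.0] * dim
    for token in text.lower().split():
        h = int(hashlib.md5(token.encode()).hexdigest(), 16)
        sign = 1.0 if (h >> 1) & 1 else -1.0
        vec[h % dim] += sign
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a, b):
    # Vectors are already L2-normalized, so the dot product is the cosine.
    return sum(x * y for x, y in zip(a, b))

q = hash_embed("retrieval fusion methods")
d = hash_embed("compare retrieval fusion approaches")
```

Hash collisions add noise, but with short documents and a few hundred buckets the vectors are good enough for interpretable baselines, which is the point of the repo.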
```sh
python -m venv .venv
source .venv/bin/activate
pip install -e .
```

Index a corpus:

```sh
python -m mm_rag_lab.cli index \
    --input examples/corpus.jsonl \
    --output examples/index.json
```
Query the index:

```sh
python -m mm_rag_lab.cli query \
    --index examples/index.json \
    --text "compare retrieval fusion methods"
```

Each line in corpus.jsonl is a standalone JSON object:

```json
{"id":"doc-1","text":"...","modality":"text","tags":["rag","benchmark"]}
{"id":"img-1","text":"chart of encoder-decoder pipeline","modality":"image","tags":["diagram","vlm"]}
```

Planned next steps:

- Add CLIP/BGE embedding backends
- Add reranker plugin API
- Add notebook with retrieval error analysis
The design follows current open-source practice in hybrid retrieval and standardized VLM evaluation pipelines, and it deliberately prioritizes transparent, inspectable baselines over heavy dependencies.