RAG document Q&A

a web app that lets you upload documents and ask questions about them. it uses retrieval-augmented generation (RAG) to find relevant passages and generate answers with source citations.

what it does

- upload PDF, DOCX, TXT or MD files through the browser
- documents are chunked, embedded, and stored in a vector database
- questions are answered using hybrid search (dense + BM25) and a cross-encoder re-ranker
- answers include citations pointing back to the source document and page
- documents can be deleted from the store at any time
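
the chunking step can be pictured as a sliding window over the token stream. the sketch below is illustrative only and is not the code in ingest.py: `chunk_tokens` is a hypothetical helper, tokens are assumed to be whitespace-separated words (the project's real tokenizer may differ), and the window sizes are the `chunk_size` and `chunk_overlap` defaults listed under configuration.

```python
def chunk_tokens(tokens, chunk_size=512, chunk_overlap=64):
    """Split a token list into windows of chunk_size that share chunk_overlap tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reaches the end of the document
    return chunks


# assumption: whitespace splitting stands in for the real tokenizer
words = ("lorem ipsum " * 600).split()  # 1200 tokens
chunks = chunk_tokens(words)
print(len(chunks), [len(c) for c in chunks])
```

each chunk starts `chunk_size - chunk_overlap` tokens after the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk as long as it is shorter than the overlap.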

tech stack

| layer | tool |
| --- | --- |
| LLM | OpenAI GPT (via `openai` SDK) |
| embeddings | `sentence-transformers` (`all-MiniLM-L6-v2`) |
| vector database | ChromaDB (persistent, local) |
| sparse search | BM25 via `rank-bm25` |
| re-ranking | cross-encoder `ms-marco-MiniLM-L-6-v2` |
| PDF parsing | `pypdf` + `pymupdf` + `tesseract` (OCR fallback for scanned PDFs) |
| backend | FastAPI |
| frontend | plain HTML / CSS / JS (no framework) |
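
to give a feel for what the sparse search layer computes, here is a minimal pure-python sketch of Okapi BM25 scoring. it is not how this project calls `rank-bm25`, and its numbers will not necessarily match that library: `bm25_scores` is a hypothetical helper, `k1=1.5` and `b=0.75` are common textbook defaults, and the idf uses the "+1 inside the log" variant so scores never go negative.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against a tokenized query with Okapi BM25."""
    if not docs:
        return []
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # document frequency: number of documents that contain each term
    df = Counter(term for d in docs for term in set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # rarer terms get a larger idf weight
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # term frequency saturates with k1 and is normalized by document length
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "the invoice is due on the first of march".split(),
    "chunk overlap keeps context across boundaries".split(),
    "the invoice number appears on page two".split(),
]
scores = bm25_scores("invoice due".split(), docs)
print(scores)
```

the first document matches both query terms and scores highest, the third matches only "invoice", and the second matches nothing and scores zero. exact keyword hits like these are what the dense embedding search can miss, which is the usual motivation for combining the two.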

folder structure

rag-qa/
├── app/
│   ├── main.py            # FastAPI app factory
│   ├── models.py          # pydantic request/response schemas
│   ├── dependencies.py    # shared pipeline singleton
│   └── routes/
│       ├── documents.py   # upload, list, delete endpoints
│       ├── qa.py          # ask endpoint
│       └── health.py      # health check endpoint
├── static/
│   └── index.html         # single-page web UI
├── config.py              # all tuneable settings
├── ingest.py              # document parsing and chunking
├── vectorstore.py         # ChromaDB wrapper
├── retrieval.py           # hybrid search + re-ranking
├── rag.py                 # main RAG pipeline
├── server.py              # entry point
└── requirements.txt

how to run locally

1. install dependencies

pip install -r requirements.txt

tesseract is also required for OCR on scanned PDFs. install it with:

# macOS
brew install tesseract

# ubuntu / debian
sudo apt install tesseract-ocr

2. add your OpenAI API key

cp .env.example .env
# open .env and set OPENAI_API_KEY=sk-...

3. start the server

uvicorn server:app --host 0.0.0.0 --port 8000

4. open the app

go to http://localhost:8000 in your browser.

upload a document from the sidebar, then type a question.

configuration

all settings are in config.py. the most useful ones:

| setting | default | description |
| --- | --- | --- |
| `openai_model` | `gpt-4o-mini` | which GPT model to use |
| `chunk_size` | `512` | tokens per chunk |
| `chunk_overlap` | `64` | token overlap between chunks |
| `top_k_rerank` | `5` | number of chunks passed to the LLM |
| `hybrid_alpha` | `0.5` | blend between dense (1.0) and sparse (0.0) search |
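
one plausible reading of `hybrid_alpha` is a weighted sum of normalized dense and sparse scores. the sketch below assumes that reading; the actual fusion in retrieval.py may work differently. `min_max` and `hybrid_scores` are hypothetical helpers, and the chunk ids and scores are made up for illustration.

```python
def min_max(scores):
    """Rescale a {chunk_id: score} map to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {k: 0.0 for k in scores}  # all scores equal: no ranking signal
    return {k: (v - lo) / (hi - lo) for k, v in scores.items()}


def hybrid_scores(dense, sparse, alpha=0.5):
    """Blend scores: alpha=1.0 is dense only, alpha=0.0 is sparse only."""
    dense_n, sparse_n = min_max(dense), min_max(sparse)
    ids = set(dense_n) | set(sparse_n)
    return {
        i: alpha * dense_n.get(i, 0.0) + (1 - alpha) * sparse_n.get(i, 0.0)
        for i in ids
    }


# made-up scores: cosine similarities (dense) and BM25 scores (sparse)
dense = {"a": 0.75, "b": 0.5, "c": 0.25}
sparse = {"b": 12.0, "c": 4.0, "d": 8.0}

blended = hybrid_scores(dense, sparse, alpha=0.5)
ranked = sorted(blended, key=blended.get, reverse=True)
print(ranked)
```

normalizing first matters because cosine similarities and BM25 scores live on different scales; without it the larger BM25 values would dominate regardless of alpha. in this example chunk "b" ranks first because it does well in both searches, even though "a" has the best dense score.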

future improvements

- user authentication so multiple users can have separate document stores
- streaming responses so the answer appears word by word instead of all at once
- support for URLs and web pages as input sources
- multi-language support for non-english documents
- a document preview panel that highlights the cited passages
- conversation history so follow-up questions have context
- evaluation metrics to measure retrieval and answer quality
- docker setup for easier deployment
