An AI-powered Retrieval-Augmented Generation (RAG) system that lets you chat with your documents. Upload PDF, DOCX, or TXT files and ask questions. Answers are grounded in your document content with source references.
Try it live: eugen-goebel-smart-doc-qa-app-av3twb.streamlit.app. The hosted app runs in Demo Mode (no API key required), so you can test the full RAG retrieval flow. Add your own Anthropic key in the sidebar for AI-generated answers.
Demo Mode: clean landing view; runs without an API key using raw retrieval results

Question Answered: asking about 2025 revenue returns the most relevant chunk with source reference

Retrieved Chunks: similarity search surfaces multiple ranked matches across the document

┌─────────────┐    ┌──────────┐    ┌──────────────┐    ┌───────────┐
│  Document   │───▶│   Text   │───▶│    Vector    │───▶│  Stored   │
│   Upload    │    │ Chunker  │    │    Store     │    │  Chunks   │
│ (PDF/DOCX)  │    │ (split)  │    │  (ChromaDB)  │    │  (embed)  │
└─────────────┘    └──────────┘    └──────────────┘    └─────┬─────┘
                                                             │
┌─────────────┐    ┌──────────┐    ┌──────────────┐          │
│  Answer +   │◀───│   LLM    │◀───│   Relevant   │◀─────────┘
│   Sources   │    │   API    │    │    Chunks    │  (similarity
└─────────────┘    └──────────┘    └──────────────┘     search)
- Document Loading: Reads PDF, DOCX, or TXT files and extracts plain text
- Chunking: Splits text into overlapping ~500-character pieces
- Embedding & Storage: Each chunk is converted to a vector and stored in ChromaDB
- Retrieval: When you ask a question, the most relevant chunks are found via similarity search
- Generation: The LLM answers your question using only the retrieved context
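
The retrieval step above can be sketched in a few lines. This is a minimal illustration, not the project's code: the function names `tokenize` and `retrieve`, the chunk dictionaries, and the sample texts are all hypothetical, and Jaccard word overlap stands in for the embedding similarity that ChromaDB computes in the real app.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, chunks: list[dict], k: int = 2) -> list[dict]:
    """Return the k chunks that score highest against the question.

    Scoring is Jaccard word overlap, a toy stand-in for embedding similarity.
    """
    q_tokens = tokenize(question)
    scored = []
    for chunk in chunks:
        c_tokens = tokenize(chunk["text"])
        union = q_tokens | c_tokens
        score = len(q_tokens & c_tokens) / len(union) if union else 0.0
        # Copy the chunk so the caller's data is not mutated.
        scored.append({**chunk, "score": score})
    return sorted(scored, key=lambda item: item["score"], reverse=True)[:k]


# Made-up chunks; the "source" labels are hypothetical references.
chunks = [
    {"source": "report.txt#0", "text": "Revenue in 2025 is covered in this chunk."},
    {"source": "report.txt#1", "text": "The main competitors are listed in section four."},
    {"source": "report.txt#2", "text": "Strategic priorities for 2026 include expansion."},
]

top = retrieve("What was the revenue in 2025?", chunks, k=2)
for hit in top:
    print(hit["source"], round(hit["score"], 3))
```

Each result keeps its `source` label, which is what lets the final answer cite where a passage came from.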
# Clone and setup
git clone https://github.com/eugen-goebel/smart-doc-qa.git
cd smart-doc-qa
python3 -m venv venv && source venv/bin/activate
pip install -r requirements.txt
# Configure API key
cp .env.example .env
# Edit .env and add your Anthropic API key
# Run the app
streamlit run app.py

The app opens in your browser. Upload a document and start asking questions.
A sample company report is included at data/sample_company_report.txt. Upload it in the app and try questions like:
- "What was the company's revenue in 2025?"
- "Who are the main competitors?"
- "What are the strategic priorities for 2026?"
- "Tell me about the BMW case study"
smart-doc-qa/
├── app.py # Streamlit web interface
├── agents/
│ ├── document_loader.py # Reads PDF/DOCX/TXT files
│ ├── chunker.py # Splits text into overlapping chunks
│ ├── vectorstore.py # ChromaDB wrapper for similarity search
│ └── qa_agent.py # RAG pipeline: retrieve + generate
├── data/
│ └── sample_company_report.txt # Sample document for testing
├── tests/
│ ├── test_document_loader.py # 10 tests
│ ├── test_chunker.py # 15 tests
│ ├── test_vectorstore.py # 16 tests
│ └── test_qa_agent.py # 14 tests
├── requirements.txt
└── README.md
| Agent | Purpose | API Call? |
|---|---|---|
| DocumentLoader | Reads PDF, DOCX, TXT files and extracts text | No |
| TextChunker | Splits text into overlapping chunks for search | No |
| VectorStore | Stores chunks and finds relevant ones via similarity | No (local embeddings) |
| QAAgent | Sends relevant chunks + question to the LLM for answers | Yes (Anthropic API) |
Retrieval-Augmented Generation combines search with AI generation:
- Sending an entire document to the AI is expensive and limited by the context window
- RAG instead searches for the most relevant parts first, then sends only those to the AI
- This is typically faster and cheaper, and it keeps answers grounded in the retrieved text
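
To make the "send only the relevant parts" idea concrete, here is a minimal sketch of how retrieved chunks might be packed into a prompt. The function name `build_prompt`, the `max_chunks` parameter, and the instruction wording are assumptions for illustration; the prompt actually used in `qa_agent.py` may differ.

```python
def build_prompt(question: str, chunks: list[str], max_chunks: int = 3) -> str:
    """Assemble a prompt that contains only the top retrieved chunks.

    The chunks are assumed to be ordered from most to least relevant.
    """
    selected = chunks[:max_chunks]
    # Number each chunk so the answer can point back to its source.
    context = "\n\n".join(
        f"[{i}] {text}" for i, text in enumerate(selected, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Placeholder chunk texts, not taken from the sample report.
prompt = build_prompt(
    "Who are the main competitors?",
    ["first chunk", "second chunk", "third chunk", "fourth chunk"],
)
print(prompt)
```

Because only `max_chunks` passages are included, the prompt size stays bounded no matter how long the uploaded document is.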
Text is converted into lists of numbers (vectors) that capture meaning. Similar texts have similar vectors. ChromaDB uses the all-MiniLM-L6-v2 model to generate these embeddings locally, so no API key is needed.
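
As a small illustration of "similar texts have similar vectors", the sketch below compares toy vectors with cosine similarity, a common measure for embeddings. The three-dimensional vectors and their labels are invented for the example; real all-MiniLM-L6-v2 embeddings have 384 dimensions, and the distance metric the app's ChromaDB collection uses is not stated here.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction, so treat it as unrelated.
        return 0.0
    return dot / (norm_a * norm_b)


# Invented toy vectors: two related topics and one unrelated topic.
revenue = [0.9, 0.1, 0.0]
sales = [0.8, 0.2, 0.1]
weather = [0.0, 0.1, 0.9]

print(cosine_similarity(revenue, sales) > cosine_similarity(revenue, weather))
```

Values near 1 mean the vectors point in almost the same direction; values near 0 mean the texts are unrelated under this measure.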
Documents are split into overlapping pieces (~500 characters each). Because neighboring chunks share text at their edges, a sentence that straddles a chunk boundary still appears with its surrounding context in at least one chunk.
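
A fixed-size chunker with overlap might look like the sketch below. It is an assumption-laden illustration rather than the code in `chunker.py`: the function name `chunk_text` and the 50-character overlap are hypothetical (only the ~500-character chunk size comes from this README), and it splits on raw character offsets without regard to sentence boundaries.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into chunks of up to chunk_size characters.

    Consecutive chunks share `overlap` characters at their boundary.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


sample = "abcdefghij" * 120  # 1,200 characters of placeholder text
pieces = chunk_text(sample)
print(len(pieces), [len(p) for p in pieces])
```

With these defaults, the last 50 characters of one chunk are repeated as the first 50 characters of the next.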
| Component | Technology | Purpose |
|---|---|---|
| AI | Anthropic API | Answer generation from context |
| Vector DB | ChromaDB | Embedding storage and similarity search |
| Embeddings | all-MiniLM-L6-v2 | Local text-to-vector conversion |
| Data Models | Pydantic v2 | Type-safe data validation |
| Web UI | Streamlit | Interactive chat interface |
| PDF Reading | pypdf | PDF text extraction |
| DOCX Reading | python-docx | Word document text extraction |
| Testing | pytest | 55+ unit and integration tests |
# Run all tests
pytest tests/ -v
# Run tests for a specific agent
pytest tests/test_vectorstore.py -v

All tests run without an API key. The QA agent tests use mocked API responses.
This app is designed to deploy in one click on Streamlit Community Cloud (free tier).
Steps:
- Sign in at share.streamlit.io with your GitHub account.
- Click New app and pick this repository / branch / app.py.
- (Optional) In Advanced settings → Secrets, paste:
  ANTHROPIC_API_KEY = "sk-ant-..."
  See .streamlit/secrets.toml.example.
- Click Deploy. The app builds in ~2 minutes.
API key handling:
The app reads the key from three places, in this order:
- os.environ["ANTHROPIC_API_KEY"]: set via .env for local runs
- st.secrets["ANTHROPIC_API_KEY"]: set in Streamlit Cloud dashboard
- Manual entry in the sidebar, fallback for end users
If no key is provided, the app runs in Demo Mode: vector search still works, but the model-generated answer step is skipped and the raw retrieved chunks are shown instead.
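
The lookup order and the Demo Mode fallback could be expressed roughly as follows. This is a sketch under stated assumptions: `resolve_api_key` and `select_mode` are hypothetical names, and plain mappings stand in for `os.environ` and `st.secrets` so the example runs without Streamlit installed.

```python
from typing import Mapping, Optional

KEY_NAME = "ANTHROPIC_API_KEY"


def resolve_api_key(
    environ: Mapping[str, str],
    secrets: Mapping[str, str],
    sidebar_value: str = "",
) -> Optional[str]:
    """Return the first non-empty key: environment, then secrets, then sidebar."""
    for candidate in (environ.get(KEY_NAME), secrets.get(KEY_NAME), sidebar_value):
        if candidate and candidate.strip():
            return candidate.strip()
    return None


def select_mode(api_key: Optional[str]) -> str:
    """Return "ai" when a key is available, otherwise "demo"."""
    return "ai" if api_key else "demo"


# Placeholder values only; no real keys appear here.
key = resolve_api_key({}, {KEY_NAME: "key-from-secrets"}, "key-from-sidebar")
print(key, select_mode(key))
print(select_mode(resolve_api_key({}, {}, "")))
```

In the app itself, the first two arguments would presumably come from `os.environ` and a dictionary built from `st.secrets`, with the sidebar text input supplying the third.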
MIT