A secure, offline RAG (Retrieval-Augmented Generation) system for chatting with your private documents.
- Private AI: Runs LLM inference locally through Ollama, so no data leaves the machine.
- RAG Architecture: Showcases mastery of vector embeddings, document chunking, and similarity search.
- Production CLI: Built with Typer and Rich for a professional, interactive terminal experience.
- AI Core: Ollama (Llama 3 / Mistral)
- Vector DB: ChromaDB
- Embeddings: Sentence-Transformers (all-MiniLM-L6-v2)
- CLI Framework: Typer, Rich (for beautiful UI)
- Prerequisites: Ollama must be installed and running.
- Install Dependencies:
pip install -r requirements.txt
- Ingest Documents:
python main.py ingest docs/my_data.pdf
- Chat:
python main.py chat
- Semantic Search: Uses cosine similarity between the query embedding and the stored chunk embeddings to find relevant context in seconds.
- Memory: Context-aware chat history for follow-up questions.
- Multi-format Support: Ingests PDF, DOCX, and TXT files.
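
The semantic search and memory features above can be illustrated with a minimal, standard-library-only sketch. It is not the project's actual code: the names `cosine_similarity`, `top_k`, and `build_prompt`, the hand-written 3-dimensional vectors, and the four-message history limit are all assumptions made for illustration. In the real tool the vectors would presumably come from all-MiniLM-L6-v2 and the lookup would go through ChromaDB.

```python
import math
from collections import deque


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunks, k=2):
    """Return the k (score, text) pairs most similar to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def build_prompt(question, context_chunks, history):
    """Combine retrieved context, recent turns, and the new question."""
    lines = ["Context:"]
    lines += [f"- {text}" for text in context_chunks]
    lines.append("Conversation so far:")
    lines += [f"{role}: {text}" for role, text in history]
    lines.append(f"user: {question}")
    return "\n".join(lines)


# Toy "embeddings" (hypothetical values, not real model output).
chunks = [
    ("Invoices are due in 30 days.", [0.9, 0.1, 0.0]),
    ("The office closes at 6 pm.", [0.0, 0.2, 0.9]),
    ("Late payments incur a 2% fee.", [0.8, 0.3, 0.1]),
]
query_vec = [1.0, 0.2, 0.0]  # stands in for an embedded payment question

results = top_k(query_vec, chunks, k=2)

# Bounded history: only the 4 most recent messages are kept (assumed limit).
history = deque(maxlen=4)
for turn in range(3):
    history.append(("user", f"question {turn}"))
    history.append(("assistant", f"answer {turn}"))

prompt = build_prompt(
    "When are invoices due?",
    [text for _, text in results],
    history,
)
print(prompt)
```

A bounded `deque` is one simple way to keep follow-up questions context-aware without letting the prompt grow indefinitely; the project may handle history differently.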
Part of the 10-Project OSS Portfolio.