ak-abdullah/Grounded

Grounded

Python FastAPI FAISS React License

Upload your documents. Ask questions. Get answers grounded in your own content.

A full RAG pipeline built from scratch. No managed vector database, no third-party retrieval service. Local embeddings, FAISS, and a clean chat interface on top.


⚡ How it works

flowchart TD
    A[User uploads .txt / .md files] --> B[Document Loader]
    B --> C[Chunker\n500 char chunks, 50 char overlap]
    C --> D[Embedder\nall-MiniLM-L6-v2 local model]
    D --> E[FAISS Index\nsaved per session UUID]
    F[User asks a question] --> G[Query Embedder]
    G --> H[FAISS Similarity Search\nTop-K retrieval]
    E --> H
    H --> I[Prompt Builder]
    I --> J[LLM Generation]
    J --> K[Answer returned to chat UI]

    style A fill:#1e293b,color:#f8fafc,stroke:#334155
    style F fill:#1e293b,color:#f8fafc,stroke:#334155
    style E fill:#0f172a,color:#f8fafc,stroke:#6366f1
    style K fill:#0f172a,color:#f8fafc,stroke:#22c55e

Each upload creates an isolated session with its own FAISS index. Sessions are UUID-based with path traversal protection built in.
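
The session isolation described above can be sketched as follows. This is a minimal illustration, not the repository's actual code: the `session_dir` helper and the `UPLOADS_ROOT` constant are hypothetical names, and the directory is assumed from the project tree. The idea is that a session id is only ever used as a path component after it has been parsed as a UUID.

```python
import uuid
from pathlib import Path

# Assumed location, based on the data/uploads/ folder in the project tree.
UPLOADS_ROOT = Path("data/uploads")


def session_dir(session_id: str, root: Path = UPLOADS_ROOT) -> Path:
    """Return the directory for a session, rejecting anything that is not a UUID."""
    try:
        parsed = uuid.UUID(session_id)
    except (ValueError, AttributeError, TypeError):
        raise ValueError("session_id must be a valid UUID") from None

    # Build the path from the canonical form, so only hex digits and
    # hyphens ever reach the filesystem.
    base = root.resolve()
    candidate = (base / str(parsed)).resolve()

    # Defence in depth: the resolved path must still sit under the root.
    if base not in candidate.parents:
        raise ValueError("session path escapes the uploads directory")
    return candidate
```

A value such as `"../../etc/passwd"` fails UUID parsing and never touches the filesystem, while a freshly generated `str(uuid.uuid4())` maps to its own folder under the uploads root.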


🛠️ Stack

| Layer | Technology |
| --- | --- |
| Backend | Python, FastAPI |
| Vector Search | FAISS (flat L2 index) |
| Embeddings | sentence-transformers (all-MiniLM-L6-v2, runs locally) |
| Frontend | React, Vite, Tailwind CSS |
| Validation | Pydantic |
| Session Management | UUID-based, localStorage |

📁 Project structure

grounded/
├── backend/
│   ├── app/
│   │   ├── ingestion/        # loader, chunker, embedder, indexer
│   │   ├── retrieval/        # FAISS retriever
│   │   ├── generation/       # prompt builder, LLM call
│   │   ├── config.py         # all settings in one place
│   │   └── main.py           # FastAPI routes
│   └── data/
│       ├── raw/              # source documents
│       ├── processed/        # global vector store
│       └── uploads/          # per-session indexes
└── frontend/
    └── src/
        ├── components/       # ChatWindow, UploadPage, InputBox
        └── services/         # API calls

🚀 Running locally

Backend

cd backend
pip install -r requirements.txt
uvicorn app.main:app --reload

Frontend

cd frontend
npm install
npm run dev

Backend runs on http://localhost:8000. Frontend on http://localhost:5173.


🔌 API

POST /upload

POST /upload
Content-Type: multipart/form-data

session_id: string (UUID)
files: .txt or .md files (max 10MB each, max 20 files per upload)

POST /chat

POST /chat
Content-Type: application/json

{
  "question": "your question here",
  "session_id": "your-session-uuid"
}

GET /health

GET /health
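
As an illustration of the `/chat` contract above, here is a small standard-library client sketch. The helper names `build_chat_request` and `ask` are hypothetical, the base URL is taken from the "Running locally" section, and the response is assumed to be JSON, since its exact shape is not documented here.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # backend address from "Running locally"


def build_chat_request(question: str, session_id: str) -> urllib.request.Request:
    """Build a POST /chat request with the documented JSON body."""
    body = json.dumps({"question": question, "session_id": session_id})
    return urllib.request.Request(
        f"{BASE_URL}/chat",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str, session_id: str):
    """Send the request and return the parsed response.

    Assumption: the endpoint replies with JSON. Inspect the result
    rather than relying on specific keys.
    """
    request = build_chat_request(question, session_id)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

With the backend running, `ask("What does chapter 2 cover?", session_id)` would send the request, using the same session UUID that was passed to `/upload`.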

⚙️ Configuration

All settings live in backend/app/config.py:

| Setting | Default | What it controls |
| --- | --- | --- |
| CHUNK_SIZE | 500 | Characters per chunk |
| CHUNK_OVERLAP | 50 | Characters of overlap between consecutive chunks |
| EMBEDDING_MODEL_NAME | all-MiniLM-L6-v2 | Local embedding model |
| TOP_K | 5 | Chunks retrieved per query |
| MAX_FILE_SIZE_BYTES | 10MB | Size limit per uploaded file |
| MAX_FILES_PER_UPLOAD | 20 | Maximum files per upload |
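
For orientation, the settings above could be expressed as plain module-level constants. This is a sketch of what such a file might look like, not the contents of the real `config.py`. Two things are assumptions: the `validate_upload` helper is hypothetical, and "10MB" is read as 10 × 1024 × 1024 bytes.

```python
# Hypothetical layout for the settings listed above.
CHUNK_SIZE = 500                          # characters per chunk
CHUNK_OVERLAP = 50                        # characters shared by neighbouring chunks
EMBEDDING_MODEL_NAME = "all-MiniLM-L6-v2"
TOP_K = 5                                 # chunks retrieved per query
MAX_FILE_SIZE_BYTES = 10 * 1024 * 1024    # assumption: "10MB" means 10 MiB
MAX_FILES_PER_UPLOAD = 20


def validate_upload(file_sizes: list[int]) -> None:
    """Raise ValueError if a batch of files breaks the upload limits.

    Illustrative helper only: it takes the byte size of each file.
    """
    if len(file_sizes) > MAX_FILES_PER_UPLOAD:
        raise ValueError(f"at most {MAX_FILES_PER_UPLOAD} files per upload")
    for size in file_sizes:
        if size > MAX_FILE_SIZE_BYTES:
            raise ValueError(f"file exceeds {MAX_FILE_SIZE_BYTES} bytes")
```

Keeping the limits next to the chunking and retrieval settings means one file holds both the upload checks and the pipeline parameters.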

💡 What I learned building this

Chunking strategy matters more than most people expect. Fixed character chunking with overlap is simple and works well for most plain text. The real tradeoff is chunk size: too small and each chunk loses the context around it, too large and retrieval gets noisy because every chunk mixes several topics.
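
Fixed character chunking with overlap fits in a few lines. The sketch below uses the documented defaults (500 characters, 50 overlap). The function name `chunk_text` is illustrative, and the repository's chunker may handle edge cases differently.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the window has reached the end of the text, so the
        # last chunk is never a pure repeat of the previous chunk's tail.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks
```

With the defaults, a 1000-character document yields three chunks starting at offsets 0, 450, and 900, and the last 50 characters of each chunk reappear at the start of the next.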

Local embeddings with MiniLM keep latency low and cost at zero. At this scale that works well. At production scale, with millions of chunks, you would move to approximate nearest neighbour search and a proper vector database like Pinecone or Weaviate.

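FAISS flat L2 search is exact: every query is compared against every stored vector, so no match is missed, and search time grows linearly with the number of chunks. That makes it the right choice here. At scale you would switch to IVF or HNSW indexes, depending on your latency and accuracy tradeoffs.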
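
To make "exact" concrete, here is a pure-Python sketch of what a flat L2 search computes. It deliberately avoids FAISS and only illustrates the technique, and the function names are hypothetical. A FAISS flat L2 index does the same brute-force comparison in optimised native code and, like this sketch, reports squared distances.

```python
import heapq


def squared_l2(a: list[float], b: list[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def flat_l2_search(vectors: list[list[float]], query: list[float], top_k: int = 5):
    """Return the top_k nearest vectors as (squared_distance, index) pairs.

    Every stored vector is scored, so the result is exact and the cost
    is linear in the number of vectors.
    """
    scored = ((squared_l2(v, query), i) for i, v in enumerate(vectors))
    return heapq.nsmallest(top_k, scored)


stored = [[0, 0], [1, 1], [3, 4]]
nearest = flat_l2_search(stored, [0, 0], top_k=2)
# nearest == [(0, 0), (2, 1)]: vector 0 at distance 0, vector 1 at squared distance 2
```

IVF and HNSW indexes avoid scoring every vector, which is where both the speed-up and the possible loss of recall come from.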


📬 Contact

Built by Abdullah Khalid

LinkedIn Email Portfolio

About

RAG pipeline over 1000+ pages using FAISS. Upload documents, ask questions, get answers in under 2 seconds. Built with FastAPI and React.
