Agentic RAG System

A production-style Corrective RAG (CRAG) pipeline built with LangGraph, FAISS, Groq LLaMA-3.3-70b, and Streamlit. Unlike a basic retrieve-then-answer loop, the system uses an LLM to grade every retrieval for relevance, automatically rewrites and retries the query when the first pass falls short, and only then synthesises a grounded answer — all orchestrated as a compiled LangGraph state machine.

Architecture

  ┌──────────────┐
  │  User Query  │
  └──────┬───────┘
         │
         ▼
  ┌──────────────────┐
  │  retrieve_docs   │  FAISS semantic search over indexed documents
  └──────┬───────────┘
         │
         ▼
  ┌──────────────────┐
  │   grade_docs     │  LLM binary relevance check on retrieved passages
  └──────┬───────────┘
         │
    ─────┴──────────────────────────
    │                              │
 relevant                     not relevant
    │                              │
    │                  ┌───────────────────────┐
    │                  │  rewrite_and_retrieve  │  LLM rewrites query,
    │                  │  (max 1 retry)         │  re-fetches from FAISS
    │                  └───────────┬───────────┘
    │                              │
    └──────────────┬───────────────┘
                   │
                   ▼
  ┌──────────────────┐
  │ generate_answer  │  Direct LLM call with graded context
  └──────┬───────────┘
         │
         ▼
  ┌───────────────┐
  │  Final Answer │
  └───────────────┘

State (RAGState) flows through each node carrying question, retrieved_docs, grade, and answer.
The 1-retry ceiling is enforced structurally — there is no edge from rewrite_and_retrieve back to grade_docs.
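
The routing above can be sketched in plain Python, without LangGraph, to show why a second retry is impossible. This is a minimal illustration under assumptions, not the code in graph_builder.py: the node bodies are stand-in stubs, `run_pipeline` and `visited` are hypothetical names, and a dataclass stands in for the Pydantic RAGState. Only the four state fields and the node names come from this README.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class RAGState:
    """Plain-dataclass stand-in for the project's Pydantic RAGState."""
    question: str
    retrieved_docs: List[str] = field(default_factory=list)
    grade: str = ""
    answer: str = ""


Node = Callable[[RAGState], RAGState]


def run_pipeline(state: RAGState, retrieve_docs: Node, grade_docs: Node,
                 rewrite_and_retrieve: Node, generate_answer: Node) -> RAGState:
    """Walk the graph edges in order; the rewrite branch runs at most once."""
    state = retrieve_docs(state)
    state = grade_docs(state)
    if state.grade != "relevant":
        # No edge leads back to grade_docs, so there is no second retry.
        state = rewrite_and_retrieve(state)
    return generate_answer(state)


# Stub nodes (hypothetical behaviour) that record the order they run in.
visited: List[str] = []


def retrieve_docs(state: RAGState) -> RAGState:
    visited.append("retrieve_docs")
    state.retrieved_docs = ["an off-topic passage"]
    return state


def grade_docs(state: RAGState) -> RAGState:
    visited.append("grade_docs")
    state.grade = "not_relevant"  # force the corrective branch for the demo
    return state


def rewrite_and_retrieve(state: RAGState) -> RAGState:
    visited.append("rewrite_and_retrieve")
    state.retrieved_docs = ["a passage found with the rewritten query"]
    return state


def generate_answer(state: RAGState) -> RAGState:
    visited.append("generate_answer")
    state.answer = f"Answer grounded in {len(state.retrieved_docs)} passage(s)"
    return state


final = run_pipeline(RAGState(question="What is CRAG?"), retrieve_docs,
                     grade_docs, rewrite_and_retrieve, generate_answer)
print(visited)
print(final.answer)
```

In the stubbed run, grade_docs executes exactly once even though the first retrieval is graded not_relevant, which mirrors the missing back-edge in the diagram.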

Tech Stack

| Layer | Technology |
| --- | --- |
| LLM | Groq (`llama-3.3-70b-versatile`) |
| Orchestration | LangGraph `StateGraph` with conditional routing |
| RAG pattern | Corrective RAG: LLM-based relevance grading + query rewriting |
| Vector store | FAISS (in-memory) |
| Embeddings | `sentence-transformers/all-MiniLM-L6-v2` (local, no API cost) |
| Document loaders | LangChain (Web, PDF, TXT) |
| UI | Streamlit |
| Evaluation | Cosine similarity via `sentence-transformers` |
| Package management | `pip` / `uv` |

Setup

1. Clone the repository

git clone https://github.com/your-username/agentic-rag.git
cd agentic-rag

2. Create a virtual environment

python -m venv .venv

# Windows
.venv\Scripts\activate

# macOS / Linux
source .venv/bin/activate

3. Install dependencies

pip install -r requirements.txt

4. Add your Groq API key

Get a free key at console.groq.com → API Keys, then:

# Windows
copy .env.example .env

# macOS / Linux
cp .env.example .env

Open .env and replace the placeholder:

GROQ_API_KEY="gsk_your_actual_key_here"
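
Before launching the app, a small check like the following can confirm that the placeholder was actually replaced. It is a sketch under assumptions: `parse_env` and `read_groq_key` are hypothetical helpers written for this example, the parser handles only simple KEY=value lines, and the project itself may load .env differently.

```python
from typing import Dict, Mapping

PLACEHOLDER = "gsk_your_actual_key_here"


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=value lines, ignoring blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        # Drop surrounding single or double quotes from the value.
        values[name.strip()] = value.strip().strip("\"'")
    return values


def read_groq_key(env: Mapping[str, str]) -> str:
    """Return the Groq key, or raise if it is missing or still the placeholder."""
    key = env.get("GROQ_API_KEY", "")
    if not key:
        raise RuntimeError("GROQ_API_KEY is missing; copy .env.example to .env first")
    if key == PLACEHOLDER:
        raise RuntimeError("GROQ_API_KEY still holds the placeholder value")
    return key


# Demo on an in-memory string (made-up key) instead of a real file.
sample = 'GROQ_API_KEY="gsk_example_123"\n'
print(read_groq_key(parse_env(sample)))
```

To check a real file, pass the contents of .env (for example `pathlib.Path(".env").read_text()`) to `parse_env`.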

5. Run the Streamlit app

streamlit run streamlit_app.py

The app fetches and indexes the default source documents on first launch (~30 seconds), then the chat interface is ready.

Example Questions

Questions drawn from the indexed sources (Lilian Weng's blog posts on LLM agents and video diffusion):

What are the three main components of an LLM-powered autonomous agent?
How does chain-of-thought prompting help agents solve complex tasks?
What types of memory does an LLM agent use, and how do they differ?
What are the main challenges of applying diffusion models to video generation?
What role does classifier-free guidance play in diffusion models?

Running the Evaluator

eval.py scores the system against 6 reference Q&A pairs using cosine similarity between expected and actual answers:

python eval.py

Results are printed to the terminal and saved to eval_results.json.
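
The scoring idea can be illustrated in a few lines of standard-library Python. The sketch assumes each answer has already been turned into an embedding vector (the project uses sentence-transformers for that step). The helpers `cosine_similarity` and `mean_score` and the toy vectors are hypothetical and are not taken from eval.py.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as having no similarity
    return dot / (norm_a * norm_b)


def mean_score(expected: List[Sequence[float]],
               actual: List[Sequence[float]]) -> float:
    """Average cosine similarity over paired expected/actual embeddings."""
    if not expected or len(expected) != len(actual):
        raise ValueError("need the same non-zero number of expected and actual vectors")
    scores = [cosine_similarity(e, a) for e, a in zip(expected, actual)]
    return sum(scores) / len(scores)


# Toy embeddings: the first pair points the same way, the second is orthogonal.
expected_vecs = [[1.0, 0.0], [0.0, 2.0]]
actual_vecs = [[2.0, 0.0], [1.0, 0.0]]
print(mean_score(expected_vecs, actual_vecs))
```

Cosine similarity ignores vector length, so the first toy pair scores 1.0 despite the different magnitudes, and the orthogonal second pair scores 0.0.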

Project Structure

agentic-rag/
│
├── src/
│   ├── config/
│   │   └── config.py              # Groq API key, model name, chunk settings
│   ├── document_ingestion/
│   │   └── document_processor.py  # Web / PDF / TXT loaders + text splitter
│   ├── graph_builder/
│   │   └── graph_builder.py       # LangGraph StateGraph definition
│   ├── node/
│   │   └── nodes.py               # retrieve_docs, grade_docs,
│   │                              # rewrite_and_retrieve, generate_answer
│   ├── state/
│   │   └── rag_state.py           # RAGState Pydantic model
│   └── vectorstore/
│       └── vectorstore.py         # FAISS + HuggingFace embeddings
│
├── data/
│   ├── attention.pdf              # Optional local PDF source
│   └── url.txt                    # Default source URLs
│
├── streamlit_app.py               # Streamlit UI entry point
├── main.py                        # CLI entry point with interactive mode
├── eval.py                        # Evaluation harness (cosine similarity)
├── requirements.txt               # Pip dependencies
├── pyproject.toml                 # Project metadata and pinned deps
├── .env.example                   # Environment variable template
└── README.md

How It Works

Ingestion: URLs and local files are loaded, split into 500-token chunks, and embedded with all-MiniLM-L6-v2 into an in-memory FAISS index.
Retrieval: retrieve_docs runs a cosine-similarity search over the index and returns the top-k passages.
Grading: grade_docs sends the passages to the LLM with a binary relevance prompt (relevant / not_relevant). When no documents are returned, it short-circuits without calling the LLM.
Correction: if the grade is not_relevant, rewrite_and_retrieve asks the LLM to reformulate the query and then re-fetches from FAISS. There is at most one retry, enforced by the graph topology.
Generation: generate_answer calls the LLM with the graded context and returns a concise, grounded answer.
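
The grading step and its short-circuit can be sketched as follows. This assumes the LLM is reachable through a plain callable that takes a prompt string and returns text. The function signature, the prompt wording, the yes/no parsing, and the choice to return not_relevant for an empty result are illustrative assumptions rather than the code in nodes.py.

```python
from typing import Callable, List

# Hypothetical prompt; the project's actual wording is not shown in this README.
GRADE_PROMPT = (
    "Question: {question}\n\nPassages:\n{passages}\n\n"
    "Are these passages relevant to the question? Answer 'yes' or 'no'."
)


def grade_docs(question: str, docs: List[str], llm: Callable[[str], str]) -> str:
    """Return 'relevant' or 'not_relevant' for the retrieved passages."""
    if not docs:
        # Short-circuit: nothing to grade, so skip the LLM call entirely.
        # Returning 'not_relevant' here is an assumption of this sketch.
        return "not_relevant"
    prompt = GRADE_PROMPT.format(question=question, passages="\n---\n".join(docs))
    verdict = llm(prompt).strip().lower()
    # Anything that does not clearly start with "yes" counts as not relevant.
    return "relevant" if verdict.startswith("yes") else "not_relevant"


# Fake LLM for the demo: records each prompt it receives and always says yes.
prompts_seen: List[str] = []


def fake_llm(prompt: str) -> str:
    prompts_seen.append(prompt)
    return "Yes"


print(grade_docs("What is CRAG?", [], fake_llm), len(prompts_seen))
print(grade_docs("What is CRAG?", ["CRAG grades retrievals."], fake_llm), len(prompts_seen))
```

Defaulting to not_relevant on any unexpected verdict is a conservative choice for this sketch: an ambiguous grade triggers the single rewrite rather than generating from doubtful context.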
