A robust Retrieval-Augmented Generation (RAG) system built with LangGraph and Google Gemini. This system goes beyond simple RAG by implementing a self-corrective mechanism: it evaluates the relevance of retrieved documents and, if necessary, rewrites the search query to find better results.
- 🔍 Self-Correction: Automatically detects when retrieved documents are irrelevant.
- 🔄 Query Transformation: Rewrites the user query to optimize it for vector search when the initial retrieval fails.
- ⚖️ Relevance Grading: Uses an LLM (Gemini) to grade the relevance of each retrieved document.
- 🛡️ Hallucination Prevention: Filters out irrelevant context before generating answers.
- 📊 State Management: Uses Pydantic models for robust state handling throughout the graph.
- Orchestration: LangGraph
- LLM: Google Gemini (gemini-2.5-flash)
- Vector Store: ChromaDB
- Embeddings: HuggingFace (BAAI/bge-small-en-v1.5)
- Framework: LangChain
- Validation: Pydantic
- Clone the repository:
    git clone https://github.com/MahdiAmrollahi/Self-Corrective-RAG.git
    cd Self-Corrective-RAG
- Create a virtual environment (optional but recommended):
    python -m venv env
    # Windows
    .\env\Scripts\activate
    # Linux/Mac
    source env/bin/activate
- Install dependencies:
    pip install -r requirements.txt
- Set up environment variables: create a .env file in the root directory and add your Google API key:
    GOOGLE_API_KEY=your_google_api_key_here
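Before running the scripts, it can help to confirm that the key is actually readable. The sketch below is a stdlib-only stand-in and not the repository's own loading code: it parses simple KEY=VALUE lines such as the one above and checks for GOOGLE_API_KEY. The helper names `parse_env_text` and `require_key` are hypothetical, and how the project itself loads .env (for example through a dotenv library) is not stated here.

```python
def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def require_key(env: dict[str, str], name: str = "GOOGLE_API_KEY") -> str:
    """Return the value for `name`, or raise a clear error if it is missing or empty."""
    value = env.get(name, "")
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return value
```

A typical use would be `require_key(parse_env_text(open(".env").read()))`, run from the project root.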
First, you need to process your documents and save them into the vector database. Place your documents (PDF, PPTX, TXT) in the data/ folder and run:
python one.py
This script loads the documents, splits them into chunks, embeds them, and stores them in ChromaDB.
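To make the chunking step concrete, here is a minimal character-based sketch of splitting text into overlapping chunks. It is an illustration only: the splitter that one.py actually uses, and its chunk size and overlap, are not stated here, so `split_into_chunks` and its default values are assumptions.

```python
def split_into_chunks(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size chunks where neighbours share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

The overlap keeps a sentence that straddles a chunk boundary visible in both neighbouring chunks, which tends to help retrieval at the cost of some duplicated storage.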
To interact with the system and test the graph flow:
python test_graph.py
You will be prompted to enter a query. The system will then:
- Retrieve documents.
- Grade them.
- If relevant -> Generate an answer.
- If not relevant -> Rewrite query and try again (up to 3 times).
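The steps above can be sketched as a plain Python loop. This is a simplified stand-in for the LangGraph workflow, not the code in main.py: the retriever, grader, rewriter, and generator are passed in as ordinary callables, all names are hypothetical, and "up to 3 times" is read here as three query rewrites.

```python
from typing import Callable

MAX_RETRIES = 3  # assumption: three rewrites after the first attempt


def run_self_corrective_loop(
    question: str,
    retrieve: Callable[[str], list[str]],
    is_relevant: Callable[[str, str], bool],
    rewrite: Callable[[str], str],
    generate: Callable[[str, list[str]], str],
    max_retries: int = MAX_RETRIES,
) -> str:
    """Retrieve, grade, and rewrite the query until relevant documents are found."""
    query = question
    relevant: list[str] = []
    for attempt in range(max_retries + 1):
        docs = retrieve(query)
        # Keep only the documents the grader accepts.
        relevant = [doc for doc in docs if is_relevant(query, doc)]
        if relevant:
            break
        if attempt < max_retries:
            query = rewrite(query)
    # Design choice in this sketch: answer the original question,
    # even when the retry budget ran out and `relevant` is empty.
    return generate(question, relevant)
```

In the real system each callable would wrap an LLM or vector-store call; passing them in as parameters makes the control flow testable without API access.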
To generate an image of the workflow graph:
python save_graph_image.py
This creates a workflow.png file showing the nodes and edges.
The system follows a state machine workflow defined in main.py:
graph TD
Start([Start]) --> Retrieve[🔍 Retrieve]
Retrieve --> Grade[⚖️ Grade Documents]
Grade --> Check{Relevant Docs Found?}
Check -- Yes --> Generate[💡 Generate Answer]
Check -- No --> Decision{Max Retries?}
Decision -- No --> Transform[🔄 Transform Query]
Decision -- Yes --> Generate
Transform --> Retrieve
Generate --> End([End])
- Retrieve: Fetches top-k documents from ChromaDB based on the user query.
- Grade Documents: The LLM evaluates each document. If it's irrelevant, it's discarded.
- Decide to Generate:
  - If relevant documents were found, proceed to Generate.
  - If no relevant documents were found, proceed to Transform Query (unless the maximum number of retries has been reached).
- Transform Query: The LLM rewrites the query so that it better captures the user's intent and is optimized for vector retrieval, then loops back to Retrieve.
- Generate: Produces the final answer using the filtered, relevant context.
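The "Decide to Generate" step amounts to a small routing function over the graph state. The sketch below shows one way it could look; it uses a stdlib dataclass as a stand-in for the project's Pydantic state model, and the field names, the `MAX_RETRIES` constant, and the returned route labels are assumptions rather than the names used in main.py.

```python
from dataclasses import dataclass, field

MAX_RETRIES = 3  # assumption, based on "up to 3 times"


@dataclass
class GraphState:
    """Hypothetical state passed between nodes (a stand-in for the Pydantic model)."""
    question: str
    documents: list[str] = field(default_factory=list)  # documents that passed grading
    retry_count: int = 0  # number of query rewrites so far


def decide_to_generate(state: GraphState) -> str:
    """Return the name of the next node, given the graded documents."""
    if state.documents:
        return "generate"  # relevant context is available
    if state.retry_count >= MAX_RETRIES:
        return "generate"  # retry budget exhausted: answer with what we have
    return "transform_query"  # rewrite the query and retrieve again
```

In LangGraph, a function of this shape is typically registered as the router of a conditional edge leaving the grading node, with the returned strings mapped to node names.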
.
├── data/ # Source documents (PDFs, etc.)
├── chroma_db/ # Vector database storage (generated)
├── main.py # Core logic: Graph definition, Nodes, Pydantic models
├── one.py # Data ingestion script
├── test_graph.py # Script to run and test the system
├── save_graph_image.py # Utility to visualize the graph
├── requirements.txt # Python dependencies
└── README.md # Project documentation
Contributions are welcome! Please feel free to submit a Pull Request.