Agentic-RAG — A reference architecture and example implementation combining Retrieval-Augmented Generation (RAG) with an agentic orchestration layer. Built to demonstrate how autonomous agents can query knowledge stores, chain reasoning steps, and call tools while staying grounded on retrieved context.
Agentic-RAG demonstrates a pattern for combining:
- RAG (Retrieval-Augmented Generation) — retrieve relevant documents from a vector store and provide them as grounding context to a language model.
- Agentic control — an orchestration layer that enables multi-step reasoning, tool calls (search, calculators, external APIs), and iterative retrieval.
This repo includes example code for building a local prototype using open-source vector stores, a lightweight agent controller, and a pluggable LLM backend.
RAG alone helps LLM outputs stay grounded, but complex tasks often need multiple retrieval rounds, conditional tool use (e.g., calculators, web search), and explicit step management. An agentic layer adds this control flow while keeping generation grounded in retrieved evidence.
Use cases:
- Question answering over private corpora (docs, knowledge bases)
- Multi-step research assistants (summarization + citation)
- Automated workflows that must consult knowledge stores and external APIs
- Modular vector storage adapter (e.g., FAISS, Chroma)
- Retriever + contextual prompt builder for the LLM
- Agent controller that supports simple action/observation loops
- Example tool adapters (search, calculator, HTTP fetch)
- Example demo scripts and notebook-style walkthroughs
- Client: user interfaces (CLI, notebook, web UI) that submit tasks.
- Agent Controller: orchestrates steps — retrieve, plan, call tool, generate, loop.
- Retriever / Index: vector DB + embeddings to fetch relevant passages.
- LLM Backend: model interface for completion / chat.
- Tools: external capabilities the agent can call (web, calc, DB query).
A simplified flow:

```
User -> Agent Controller
  Agent: retrieve(context) -> LLM(plan)
  If tool needed: Agent -> tool -> observation
  Agent: retrieve(more) -> LLM(final answer grounded on retrieved docs)
```
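The flow above can be sketched as a minimal controller loop. This is an illustrative sketch only: the helpers `retrieve`, `call_llm`, and `run_tool` are hypothetical stand-ins, not this repo's actual API.

```python
# Minimal sketch of an agentic RAG loop. The callables (retrieve, call_llm,
# run_tool) are illustrative stand-ins, not this repo's actual interfaces.
def agent_loop(question, retrieve, call_llm, run_tool, max_steps=6):
    context = retrieve(question)          # initial retrieval round
    observations = []
    for _ in range(max_steps):
        plan = call_llm(question, context, observations)
        if plan.get("action"):            # the LLM requested a tool call
            observations.append(run_tool(plan["action"]))
            # optionally retrieve again, conditioned on the action
            context += retrieve(plan["action"].get("query", question))
        else:                             # no action -> final grounded answer
            return plan["answer"]
    # step budget exhausted: force a final answer from what we have
    return call_llm(question, context, observations)["answer"]
```

The `max_steps` cap mirrors the `AGENT_MAX_STEPS` toggle described in Configuration: it bounds latency and prevents runaway loops.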
These instructions assume a Python environment. Adjust if you prefer Node.js or another stack.
- Python 3.10+
- pip or poetry
- Optional: Docker (for containerized run)
- Optional: API keys for external LLM providers (if using hosted models)
```bash
# clone repo
git clone https://github.com/akash8190/Agentic-RAG.git
cd Agentic-RAG

# create venv
python -m venv .venv
source .venv/bin/activate  # on Windows: .venv\Scripts\activate

# install deps
pip install -r requirements.txt
```

If you use Poetry:

```bash
poetry install
poetry shell
```

- Create a `.env` file with required keys (see Configuration).
- Build or load the index (if the repo contains `scripts/build_index.py`):

```bash
python scripts/build_index.py --data data/docs --index_path ./index
```

- Run the demo agent:

```bash
python examples/agent_demo.py --index ./index
```

Open `notebooks/demo.ipynb` for a step-by-step walkthrough (if provided).
Set environment variables in `.env` or export them in your shell. Example variables:

```
# LLM provider (optional)
OPENAI_API_KEY=sk-...
LLM_PROVIDER=openai

# Vector DB settings
VECTOR_STORE=faiss
EMBEDDING_MODEL=all-mpnet-base-v2

# Other toggles
AGENT_MAX_STEPS=6
RETRIEVAL_TOP_K=5
```
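One common way to consume these variables in Python is a small settings loader built on `os.getenv` (a sketch; the `load_settings` function is hypothetical, but the variable names and defaults match the example above):

```python
import os

# Read the toggles shown above; defaults mirror the example .env values.
def load_settings():
    return {
        "llm_provider": os.getenv("LLM_PROVIDER", "openai"),
        "vector_store": os.getenv("VECTOR_STORE", "faiss"),
        "embedding_model": os.getenv("EMBEDDING_MODEL", "all-mpnet-base-v2"),
        "agent_max_steps": int(os.getenv("AGENT_MAX_STEPS", "6")),
        "retrieval_top_k": int(os.getenv("RETRIEVAL_TOP_K", "5")),
    }
```

Note that environment variables are always strings, so numeric toggles like `AGENT_MAX_STEPS` need an explicit `int(...)` conversion.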
The project contains an abstraction layer so you can swap in other providers (e.g., local LLMs or cloud vector stores).
```bash
python cli/qa.py --question "How does Agentic-RAG handle multi-step reasoning?"
```

Expected behavior:
- Agent retrieves top-K passages
- LLM formulates a plan; if the plan contains actions, the agent runs them
- Final answer returned with supporting references
```python
from agentic_rag import Agent

agent = Agent(index_path="./index")
answer = agent.ask("Summarize the security considerations for deploying the system.")
print(answer)
```

Tools are small adapters that accept an action input and return an observation. To add a new tool:

- Create a new file `tools/my_tool.py` implementing `run(action: dict) -> dict`.
- Register it in the agent's tool registry (see `agentic_rag/tools/__init__.py`).
- Update the agent policy prompt (if needed) to mention the tool's capabilities.
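The registration step might look like the sketch below. `TOOL_REGISTRY`, `register_tool`, and `MyTool` are assumptions based on the steps above, not the repo's verified internals; check `agentic_rag/tools/__init__.py` for the actual mechanism.

```python
# Hypothetical registry: maps tool names to objects exposing run(action) -> dict.
TOOL_REGISTRY = {}

def register_tool(name, tool):
    TOOL_REGISTRY[name] = tool

class MyTool:
    """Example tool following the run(action: dict) -> dict convention."""
    def run(self, action: dict) -> dict:
        # Echo the input back as the observation.
        return {"observation": f"echo: {action.get('input', '')}"}

register_tool("my_tool", MyTool())
```

The agent controller can then look up `TOOL_REGISTRY[plan["action"]["name"]]` when the LLM's plan names a tool.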
Example tool skeleton:

```python
# tools/calculator.py
class CalculatorTool:
    def run(self, expression: str) -> str:
        # safe eval wrapper or use an external math library
        return str(eval(expression, {"__builtins__": {}}))
```

Run unit tests with pytest (if present):

```bash
pytest tests
```

Add tests for any new retrievers, tools, or orchestration logic you add.
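A test for the calculator skeleton above might look like this. The tool class is redefined inline so the snippet is self-contained; in the repo you would import it from `tools/calculator.py` instead.

```python
# tests/test_calculator.py -- example unit tests for the calculator tool.
# In the repo, replace this inline class with:
#   from tools.calculator import CalculatorTool
class CalculatorTool:
    def run(self, expression: str) -> str:
        return str(eval(expression, {"__builtins__": {}}))

def test_calculator_evaluates_arithmetic():
    assert CalculatorTool().run("2 + 3 * 4") == "14"

def test_calculator_blocks_builtins():
    # With builtins stripped, names like __import__ should not resolve.
    try:
        CalculatorTool().run("__import__('os').getcwd()")
        assert False, "builtins should not be reachable"
    except NameError:
        pass
```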
Contributions — bug fixes, docs improvements, additional adapters — are welcome.
Suggested workflow:
- Fork the repo
- Create a feature branch
- Add tests for new behavior
- Open a Pull Request describing the change
Please follow the code style used across the project and keep functions small and well-documented.
- Grounding: always include retrieved passage identifiers or snippets with answers for traceability.
- Safety: be careful when enabling tools like `eval` or arbitrary HTTP requests; sandbox where possible.
- Latency: multi-step agentic loops can increase latency; consider caching or limiting steps for production.
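To illustrate the safety note on `eval` above: an arithmetic evaluator can instead be built on the standard library's `ast` module, walking the parsed tree and allowing only whitelisted operations. This is a sketch of the general technique, not this repo's implementation.

```python
import ast
import operator

# Only whitelisted arithmetic operators are allowed; anything else raises,
# unlike raw eval(), which will happily run arbitrary expressions.
_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
    ast.USub: operator.neg,
}

def safe_eval(expression: str):
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.operand))
        raise ValueError("disallowed expression")
    return walk(ast.parse(expression, mode="eval"))
```

Function calls, attribute access, and name lookups are all rejected, so expressions like `__import__('os')` fail with `ValueError` rather than executing.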
This project is distributed under the MIT license. See LICENSE for details.
Maintainer: Akash Kumar (GitHub: akash8190)
Possible future additions:
- A `CONTRIBUTING.md` and issue templates
- An example Dockerfile and GitHub Actions workflow
- A short demo notebook with sample queries