suvarnak/advanced-rag


PDF RAG System

Installation

  1. Install uv (requires Python 3.10+):
pipx install uv
  2. Clone the repository:
git clone <repository-url>
cd advanced-rag
  3. Create and activate a virtual environment using uv (the activation command shown is for Windows; on macOS/Linux, run source .venv/bin/activate instead):
uv venv
.venv\Scripts\activate
  4. Install dependencies using uv pip sync (reads from pyproject.toml):
uv pip sync pyproject.toml
  5. Install Ollama, make sure it is running, then pull the llama3.2 model:
ollama pull llama3.2

Note: uv pip sync installs the dependencies listed in pyproject.toml, and uv is generally faster than pip at package installation.
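
Before running the scripts, it can help to confirm that the prerequisites from the steps above are in place. The following is a minimal sketch, not part of this repository: the function name check_prerequisites and the idea of looking up uv and ollama on PATH are assumptions made for illustration.

```python
import shutil
import sys

# Minimum version taken from the "requires Python 3.10+" note in step 1.
MIN_PYTHON = (3, 10)


def check_prerequisites(tools=("uv", "ollama")):
    """Return a dict mapping each prerequisite to True (found) or False.

    Hypothetical helper: it only checks that the executables are on PATH,
    not that Ollama is running or that a model has been pulled.
    """
    status = {"python>=3.10": sys.version_info >= MIN_PYTHON}
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        status[tool] = shutil.which(tool) is not None
    return status


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

A "missing" entry for ollama only means the executable was not found on PATH; the check does not contact the Ollama server.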

Features and Usage

Key Features

Basic RAG

  • Simple PDF document loading and parsing
  • Basic text chunking
  • Vector store using FAISS
  • Question answering with a local Llama model (the installation steps pull llama3.2)
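
To illustrate what basic text chunking means here, the sketch below splits text into fixed-size, overlapping character windows. It is an assumption-laden illustration, not the code in pdf_rag.py: the function name chunk_text and the chunk_size and overlap defaults are hypothetical, and the actual script may rely on a library splitter instead.

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into character windows of chunk_size that overlap by `overlap`.

    Hypothetical helper for illustration; parameter values are not taken
    from the repository.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text,
        # so no trailing chunk is fully contained in the previous one.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for piece in chunk_text("abcdefghij", chunk_size=4, overlap=1):
        print(piece)
```

The overlap keeps a sentence that straddles a boundary at least partly visible in both neighbouring chunks, at the cost of storing some text twice.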

Advanced RAG

  • Enhanced document chunking with semantic boundaries
  • BAAI/bge-large-en-v1.5 embeddings
  • Contextual compression
  • Source attribution and metadata tracking
  • Similarity score filtering
  • Custom prompt templates
  • Multi-document context handling
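
Similarity score filtering and source attribution can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the implementation in advanced_pdf_rag.py: the names cosine_similarity and filter_by_score, the (text, source, vector) tuple layout, and the threshold and top_k defaults are all hypothetical, and the toy two-dimensional vectors stand in for real BAAI/bge-large-en-v1.5 embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def filter_by_score(query_vec, chunks, threshold=0.5, top_k=3):
    """Keep chunks scoring at least `threshold`, best first, at most top_k.

    `chunks` is a list of (text, source, vector) tuples; the source is
    carried through so each answer can cite the document it came from.
    """
    hits = []
    for text, source, vec in chunks:
        score = cosine_similarity(query_vec, vec)
        if score >= threshold:
            hits.append({"text": text, "source": source, "score": score})
    hits.sort(key=lambda hit: hit["score"], reverse=True)
    return hits[:top_k]


if __name__ == "__main__":
    # Toy vectors for illustration only.
    chunks = [
        ("chunk a", "document1.pdf", [1.0, 0.0]),
        ("chunk b", "document2.pdf", [0.0, 1.0]),
        ("chunk c", "document3.pdf", [1.0, 1.0]),
    ]
    for hit in filter_by_score([1.0, 0.0], chunks):
        print(hit["source"], round(hit["score"], 3))
```

Dropping low-scoring chunks before they reach the prompt keeps loosely related passages from diluting the context when several documents are indexed together.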

Example Usage

  1. Place your PDFs in the data folder:
data/
  ├── document1.pdf
  ├── document2.pdf
  └── document3.pdf
  2. Run the basic PDF RAG script. It answers the sample query "What is the main topic of the PDF documents?":
python .\pdf_rag.py
  3. Run the advanced PDF RAG script. It answers the sample query "who is Marcus Aurelius?":
python .\advanced_pdf_rag.py
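
To see which files the scripts would have available, the PDFs in the data folder can be listed with a few lines of standard-library Python. This is a minimal sketch, assuming the scripts read every .pdf file directly inside data/; the function name find_pdfs is hypothetical and the repository's own loading code may differ.

```python
from pathlib import Path


def find_pdfs(data_dir="data"):
    """Return the PDF files directly inside data_dir, sorted by path.

    Hypothetical helper: returns an empty list when the folder is missing
    and does not descend into subfolders.
    """
    folder = Path(data_dir)
    if not folder.is_dir():
        return []
    return sorted(
        path for path in folder.iterdir()
        if path.is_file() and path.suffix.lower() == ".pdf"
    )


if __name__ == "__main__":
    pdfs = find_pdfs()
    if not pdfs:
        print("No PDFs found in data/")
    for pdf in pdfs:
        print(pdf.name)
```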
 
