Offline RAG-Powered Chatbot for PDF Knowledge Extraction

📂 Project Files (in order)

code_3.py – Main Python script for running the RAG chatbot.
lsb.pdf – Sample PDF document used for testing question-answering.
.gitignore – Configuration file to exclude large model and installer files.

🚀 Project Overview

This project implements an offline Retrieval-Augmented Generation (RAG) chatbot that can answer questions based on the contents of PDF documents.
The chatbot runs entirely on the local machine, so documents and questions are never sent to an external API and the data stays private.

Workflow:

Extract text from PDFs using pdfplumber.
Split text into chunks and generate embeddings using SentenceTransformers.
Store embeddings in DuckDB for efficient vector search.
Retrieve the most relevant chunks for a given query.
Use LangChain to pass the retrieved context into a locally run LLM (LLaMA 3).
Generate a context-aware response grounded in the retrieved chunks.
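
The chunking, retrieval, and prompt-building steps of this workflow can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code in code_3.py: the function names (split_into_chunks, toy_embed, retrieve, build_prompt) are hypothetical, the bag-of-words toy_embed is a stand-in for a SentenceTransformers model, and an in-memory list replaces the DuckDB store. The resulting prompt would then be handed to the locally run LLM, which is not shown here.

```python
import math
import re
from collections import Counter


def split_into_chunks(text, chunk_size=40, overlap=10):
    """Split text into overlapping chunks of at most chunk_size words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def toy_embed(text):
    """Stand-in embedding: word counts (an assumption, NOT SentenceTransformers)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, chunks, top_k=2):
    """Return the top_k chunks most similar to the query."""
    query_vec = toy_embed(query)
    scored = [(cosine_similarity(query_vec, toy_embed(c)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:top_k]]


def build_prompt(query, context_chunks):
    """Assemble the prompt that would be passed to the local LLM."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )
```

In a real pipeline, toy_embed would be replaced by the embedding model and the list scan by a similarity query against the stored vectors. The overlap between chunks is a common design choice that reduces the chance of a relevant sentence being cut in half at a chunk boundary.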

🛠️ Technologies & Libraries Used

Python 3.10+
LangChain – for building RAG pipelines
pdfplumber – for extracting text from PDFs
DuckDB – lightweight embedded database
SentenceTransformers – for semantic vector embeddings
PyTorch (torch) – deep learning backend used by SentenceTransformers
Ollama (or other LLaMA runtime) – for running the LLaMA model locally

📦 Requirements

Install the Python dependencies with pip by running the following commands in your terminal:

pip install langchain
pip install pdfplumber
pip install duckdb
pip install sentence-transformers
pip install torch
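
After installation, a short check like the one below can confirm that each package is importable. It is a sketch, not part of the project: the helper name missing_packages and the REQUIRED_MODULES mapping are hypothetical, and the mapping pairs each pip package name from the list above with its import name (note that sentence-transformers is imported as sentence_transformers).

```python
from importlib.util import find_spec

# pip package name -> importable module name
REQUIRED_MODULES = {
    "langchain": "langchain",
    "pdfplumber": "pdfplumber",
    "duckdb": "duckdb",
    "sentence-transformers": "sentence_transformers",
    "torch": "torch",
}


def missing_packages(required=REQUIRED_MODULES):
    """Return the pip names of packages whose module cannot be found."""
    return [
        pip_name
        for pip_name, module_name in required.items()
        if find_spec(module_name) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```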
