Local RAG with Ollama

A Retrieval-Augmented Generation (RAG) application that uses local language models via Ollama. This application allows you to:

- Chat with documents by uploading PDF files
- Get answers derived only from the content of your documents
- Configure retrieval parameters for better results

Requirements

- Ollama (for local LLM usage)
- Python 3.9+
- PyMuPDF (for PDF document loading)
- FAISS (for vector storage)
- Docker and Docker Compose (for containerized setup)

Setup

Install Ollama
- Download and install Ollama for your OS
- Verify installation with: ollama help
Download a model
- Run: ollama pull deepseek-r1:7b (or another compatible model)
- Wait for the model to download

Set up the Python environment

```
# Clone the repository
git clone https://github.com/yourusername/local-rag-ollama.git
cd local-rag-ollama

# Create and activate virtual environment
python -m venv venv

# For Linux/Mac
source venv/bin/activate

# For Windows
.\venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt
```

Docker Setup

This application can be easily deployed using Docker and Docker Compose:

Clone the repository

```
git clone https://github.com/yourusername/local-rag-ollama.git
cd local-rag-ollama
```

Using helper scripts (recommended)
- For Linux/Mac:
```
# Make the script executable
chmod +x docker-start.sh

# Run the helper script to build, start and pull the model
./docker-start.sh
```
- For Windows (PowerShell):
```
# Run the PowerShell helper script
.\docker-start.ps1
```
These scripts will:
- Check if Docker is installed and running
- Build and start the containers
- Pull the necessary model if it doesn't exist
- Provide instructions for accessing the application

Manual setup

Build and start containers:
```
docker-compose up -d
```

Pull the model:

```
docker-compose exec ollama ollama pull deepseek-r1:7b
```

Access the application
- Open your browser and navigate to http://localhost:8000

Windows Users

If you're using Windows, here are some specific tips:

- Ensure Docker Desktop for Windows is installed and running
- You may need to enable WSL2 (Windows Subsystem for Linux) during Docker Desktop installation
- If using the default CMD or PowerShell terminal, commands should work the same as shown above
- For file paths in volumes, you may need to use Windows-style paths with Docker Desktop

Docker Configuration Notes:

The docker-compose.yml includes:
- An Ollama service that runs the language model
- The RAG application service connected to Ollama
- GPU support for Ollama if available
- Health checks for both services
- Persistent volume for Ollama models
For GPU support:
- Ensure NVIDIA Container Toolkit is installed
- For Windows, use NVIDIA Container Runtime with Docker Desktop
- The Docker Compose configuration automatically detects and uses available GPUs
Environment variables:
- OLLAMA_HOST: Set to "ollama" (the service name) for inter-container communication
- PORT: Application port (default is 8000)
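
As a rough illustration of how these two variables could be consumed, here is a minimal sketch. It is not taken from app.py: the function name `load_settings` and the fallback to `localhost` are assumptions made for the example. The only outside fact used is that Ollama listens on port 11434 by default.

```python
import os

# Ollama's default API port.
OLLAMA_DEFAULT_PORT = 11434


def load_settings(env=None):
    """Hypothetical helper: build connection settings from OLLAMA_HOST and PORT."""
    env = os.environ if env is None else env
    # Inside Docker Compose this would be "ollama"; "localhost" is an assumed fallback.
    host = env.get("OLLAMA_HOST", "localhost")
    if "://" in host:
        base_url = host.rstrip("/")          # already a full URL
    elif ":" in host:
        base_url = f"http://{host}"          # host:port without a scheme
    else:
        base_url = f"http://{host}:{OLLAMA_DEFAULT_PORT}"
    return {
        "ollama_base_url": base_url,
        "app_port": int(env.get("PORT", "8000")),  # 8000 is the documented default
    }


if __name__ == "__main__":
    print(load_settings({"OLLAMA_HOST": "ollama"}))
```

Passing a plain dictionary instead of reading `os.environ` directly keeps the helper easy to try out without changing your shell environment.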

Launch

Start Ollama service
```
# In a separate terminal
ollama serve
```

Start the Chainlit application

```
# Basic usage
chainlit run app.py

# With custom port
chainlit run app.py --port 8080
```

Access the application
- Open your browser and navigate to: http://localhost:8000 (or your custom port)

Common Issues and Troubleshooting

PyMuPDF Installation Issues

If you encounter errors related to PDF loading:

```
# Install PyMuPDF separately
pip install pymupdf==1.23.21
```

Embedding Dimension Mismatch

If you see errors about an embedding dimension mismatch:

- This happens when the FAISS vector store was created with a different embedding model than the one currently in use.
- Select "Empty db" when starting the application to create a fresh database.
- Adjust the "How similar should the pieces be?" slider to a lower value (around 0.1) to handle potential negative scores.
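
To make the cause of the mismatch concrete, the sketch below compares the dimension a stored index expects with the length of a freshly computed embedding. It is an illustration only, not code from this repository: `dimensions_match` and `explain_mismatch` are hypothetical names. With a real FAISS index, the expected dimension is available as the index's `d` attribute.

```python
def dimensions_match(index_dim, embedding):
    """Return True when an embedding has the dimension the index was built with."""
    return len(embedding) == index_dim


def explain_mismatch(index_dim, embedding):
    """Hypothetical helper: produce a readable message for a dimension check."""
    if dimensions_match(index_dim, embedding):
        return "ok"
    return (
        f"index expects {index_dim}-dimensional vectors but the current "
        f"embedding model produced {len(embedding)}; rebuild the database"
    )


if __name__ == "__main__":
    # Toy numbers: an index built for 4 dimensions, queried with 3.
    print(explain_mismatch(4, [0.1, 0.2, 0.3]))
```

A check like this, run before querying, turns a low-level vector store error into a message that points directly at the "Empty db" fix.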

Negative Relevance Scores

Ollama embeddings can sometimes produce negative similarity scores. The application has been updated to handle this by:

- Setting a much lower score threshold (0.1 instead of 0.5)
- Using proper error handling to catch and recover from issues
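
For intuition on why scores can go below zero, the following sketch computes cosine similarity in plain Python and applies a score threshold. Cosine similarity ranges from -1 to 1, so vectors pointing in opposite directions score negative. This is a simplified illustration, not the application's scoring code; `filter_by_score` is a hypothetical helper and whether the app uses cosine similarity internally is an assumption.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def filter_by_score(scored_chunks, threshold=0.1):
    """Keep (chunk, score) pairs whose score meets the threshold."""
    return [(chunk, score) for chunk, score in scored_chunks if score >= threshold]


if __name__ == "__main__":
    query = [1.0, 0.0]
    chunks = {"aligned": [1.0, 0.0], "opposite": [-1.0, 0.0]}
    scored = [(name, cosine_similarity(query, vec)) for name, vec in chunks.items()]
    print(filter_by_score(scored))
```

With a threshold of 0.5, borderline matches are dropped along with the negative ones; lowering it to 0.1 keeps weak positive matches while still excluding negative scores.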

Model Loading Issues

If Ollama fails to load the model:

- Ensure Ollama is running (ollama serve or Docker container is up)
- Verify the model is downloaded (ollama list or via Docker: docker-compose exec ollama ollama list)
- Try a different model if needed (adjust in app.py)
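
The same check can be done programmatically. The sketch below asks Ollama's REST API for its list of local models via `GET /api/tags` and looks for a given model name. The helper names and the `localhost:11434` base URL are assumptions for the example; adjust the URL if Ollama runs in a container or on another host.

```python
import json
from urllib.error import URLError
from urllib.request import urlopen


def model_names(tags_payload):
    """Extract model names from the JSON returned by Ollama's /api/tags endpoint."""
    return [model["name"] for model in tags_payload.get("models", [])]


def is_model_available(model, base_url="http://localhost:11434", timeout=5):
    """Return True if `model` appears in the models Ollama reports as downloaded."""
    with urlopen(f"{base_url}/api/tags", timeout=timeout) as response:
        payload = json.load(response)
    return model in model_names(payload)


if __name__ == "__main__":
    try:
        print(is_model_available("deepseek-r1:7b"))
    except URLError as exc:
        # Raised when nothing is listening at base_url, e.g. Ollama is not running.
        print(f"Could not reach Ollama: {exc}")
```

A connection error here corresponds to the first item in the list above (Ollama not running), while a `False` result corresponds to the second (model not downloaded).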

Docker Issues

Cannot connect to Ollama from app container:
- Check if the Ollama container is healthy: docker-compose ps
- Verify the model is downloaded: docker-compose exec ollama ollama list
- Check logs: docker-compose logs ollama
Application container fails to start:
- Check logs: docker-compose logs rag-app
- Ensure Ollama container is running first
- Verify network connectivity between containers
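
One simple way to verify connectivity is a plain TCP connection test, sketched below. This is a generic check, not a script shipped with the repository; `can_connect` is a hypothetical helper. The host name `ollama` matches the Compose service name mentioned above, and 11434 is Ollama's default port.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable host names.
        return False


if __name__ == "__main__":
    # From inside the app container, the Ollama service is reachable by its
    # service name; from the host machine, use "localhost" instead.
    print(can_connect("ollama", 11434))
```

If this returns `False` from inside the application container while the Ollama container is healthy, the problem is likely in the Compose network or in the configured OLLAMA_HOST value.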

Examples

Multilingual Support

This application includes a novel approach to handling non-English queries:

- When a non-English query is received, the LLM generates a potential answer (hallucination)
- This hallucination is then used to search the vector database for relevant document sections
- If relevant sections are found, they are used to generate a proper response

This technique allows the application to work with queries in languages other than English, even when the embedding model only supports English.
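
The flow described above can be sketched as follows. This is an illustration under stated assumptions, not the code in app.py: `generate` and `search` are hypothetical callables standing in for the LLM and the vector database, the prompt wording is invented, and asking for the draft answer in English is an assumption made so that an English-only embedding model can match it.

```python
NO_MATCH_MESSAGE = "No relevant sections were found in the documents."


def answer_with_draft_retrieval(query, generate, search):
    """Draft an answer first, retrieve with the draft, then answer from the results.

    generate: callable taking a prompt string and returning text (the LLM).
    search:   callable taking text and returning a list of document chunks.
    """
    # Step 1: let the model produce a potential (possibly wrong) answer.
    draft = generate(f"Answer briefly in English: {query}")

    # Step 2: search the vector database with the draft instead of the raw query.
    chunks = search(draft)
    if not chunks:
        return NO_MATCH_MESSAGE

    # Step 3: generate the final response grounded in the retrieved sections.
    context = "\n\n".join(chunks)
    prompt = (
        "Use only the context below to answer the question.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )
    return generate(prompt)


if __name__ == "__main__":
    # Stand-in callables so the sketch runs without a model or a database.
    fake_llm = lambda prompt: f"[model output for {len(prompt)} chars of prompt]"
    fake_db = lambda text: ["Section 2: retrieval uses FAISS."]
    print(answer_with_draft_retrieval("Čo je RAG?", fake_llm, fake_db))
```

The draft answer is never shown to the user. It serves only as a search key, so errors in it matter less than whether its vocabulary overlaps with the relevant document sections.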

Example with a Slovak question:
