A Retrieval-Augmented Generation (RAG) application built with a modern tech stack, featuring a Next.js frontend and a FastAPI backend.
The application implements a robust RAG pipeline to provide accurate, context-aware answers from your documents.
When a document is uploaded:
- Loading: Documents are fetched from S3 using `S3FileLoader`.
- Chunking: Text is split into manageable chunks (1000 characters) using `RecursiveCharacterTextSplitter` to ensure optimal context window usage.
- Embedding: Each chunk is converted into a vector embedding using the BAAI/bge-small-en model via `HuggingFaceEmbeddings`. This model is optimized for retrieval tasks.
- Storage: Vectors and metadata are stored in ChromaDB, a high-performance open-source vector database.
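The chunking step can be sketched in plain Python. This is a simplified stand-in for `RecursiveCharacterTextSplitter` (the real splitter also prefers paragraph and sentence boundaries), and the 200-character overlap is an illustrative assumption, not the app's actual setting:

```python
def split_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters so that context
    spanning a chunk boundary is not lost.
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

# A 2500-character document yields four chunks of 1000/1000/900/100 chars.
chunks = split_text("x" * 2500)
print([len(c) for c in chunks])
```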
When a user asks a question:
- Query Embedding: The user's question is converted into a vector using the same embedding model.
- Semantic Search: ChromaDB performs a similarity search to find the most relevant document chunks.
- Context Assembly: Retrieved chunks are combined with the conversation history (to support follow-up questions).
- LLM Generation: The assembled context and user query are sent to Google Gemini 2.5 Flash (`gemini-2.5-flash`).
- Response: The LLM generates a concise, accurate answer based only on the provided context, citing sources where possible.
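The retrieval and context-assembly steps above can be illustrated with a small, dependency-free sketch: rank chunks by cosine similarity, then fold the top results and the conversation history into a prompt. The vectors here are hand-made stand-ins for real embeddings, and the prompt template is hypothetical, not the app's actual one:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, chunk_vecs, chunks, k=2):
    """Return the k chunks whose vectors are most similar to the query."""
    ranked = sorted(zip(chunks, chunk_vecs),
                    key=lambda cv: cosine(query_vec, cv[1]),
                    reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

def build_prompt(question: str, context_chunks: list[str], history: list[str]) -> str:
    """Assemble retrieved chunks and chat history into one prompt string."""
    context = "\n\n".join(context_chunks)
    past = "\n".join(history)
    return (f"Answer using only the context below.\n\n"
            f"Context:\n{context}\n\nHistory:\n{past}\n\nQuestion: {question}")

chunks = ["ChromaDB stores vectors.", "Gemini generates answers.", "Bun runs the frontend."]
vecs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
query = [1.0, 0.05]  # pretend embedding of the user's question

relevant = top_k(query, vecs, chunks)
prompt = build_prompt("What stores the vectors?", relevant, ["User: hi", "AI: hello"])
print(relevant)
```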
- LLM: Google Gemini 2.5 Flash (via `langchain-google-genai`)
- Embeddings: BAAI/bge-small-en (via `langchain-huggingface`)
- Vector Store: ChromaDB
- Orchestration: LangChain
Ensure you have the following installed:
- Node.js (v18+)
- Bun (`npm install -g bun`)
- Python (v3.13+)
- Docker & Docker Compose
- uv (Recommended for Python dependency management)
- Framework: Next.js 16 (App Router)
- Language: TypeScript
- Styling: Tailwind CSS 4
- State Management: Zustand
- Data Fetching: TanStack Query (React Query)
- Icons: Lucide React
- Framework: FastAPI
- Language: Python 3.13+
- AI/ML:
- Database:
- PostgreSQL (Application Data)
- ChromaDB (Vector Database)
- Migrations: Alembic
- Package Manager: uv
- Monorepo: Turborepo
- Runtime: Bun (Frontend), Python (Backend)
- Containerization: Docker & Docker Compose
- Clone the repository:

  ```bash
  git clone <repository-url>
  cd rag-docs
  ```
- Install Frontend Dependencies:

  ```bash
  bun install
  ```
- Install Backend Dependencies:

  ```bash
  cd apps/backend
  uv sync
  ```
- Navigate to `apps/backend`.
- Copy the example environment file:

  ```bash
  cp .env.example .env
  ```

- Update `.env` with your API keys (Google GenAI, etc.) and database credentials.
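For reference, a backend `.env` might look like the fragment below. These names are illustrative guesses except `GOOGLE_API_KEY`, which is the variable `langchain-google-genai` reads by default; always defer to the names in `.env.example`:

```env
# Illustrative only — the authoritative variable names live in .env.example
GOOGLE_API_KEY=your-google-genai-key                              # read by langchain-google-genai
DATABASE_URL=postgresql://user:password@localhost:5432/ragdocs    # hypothetical name/format
```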
- Navigate to `apps/web`.
- Create a `.env` file (if not present) and configure the necessary environment variables (e.g., the API base URL).
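For example (the variable name below is hypothetical — check the frontend code for the exact one; note that Next.js only exposes variables prefixed with `NEXT_PUBLIC_` to the browser):

```env
# Hypothetical — confirm the exact variable name used by the frontend
NEXT_PUBLIC_API_URL=http://localhost:8080
```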
Run the infrastructure (databases) in Docker, and the apps locally for hot reloading.
- Start Databases (Postgres & Chroma):

  ```bash
  docker-compose up -d db chroma
  ```
- Start Backend:

  ```bash
  cd apps/backend

  # Apply migrations
  alembic upgrade head

  # Start server
  uvicorn app.main:app --reload --port 8080

  # OR, if using just:
  just dev
  ```
- Start Frontend: From the root directory:

  ```bash
  bun dev
  # OR
  turbo run dev
  ```

  The web app will be available at http://localhost:3000.
Run the entire backend stack in Docker.
- Start Backend & DBs:

  ```bash
  docker-compose up -d --build
  ```

  This starts Postgres, Chroma, and the FastAPI backend (on port 8080).
- Start Frontend:

  ```bash
  bun dev
  ```
```
rag-docs/
├── apps/
│   ├── web/          # Next.js Frontend Application
│   └── backend/      # FastAPI Backend Application
├── packages/         # Shared packages (if any)
├── docker/           # Docker configurations
├── docker-compose.yml
├── turbo.json        # Turborepo configuration
└── package.json
```
- `bun dev`: Start the development server (Frontend).
- `bun build`: Build the application.
- `bun lint`: Lint the codebase.
- `bun format`: Format code using Prettier.
- `bun check-types`: Run TypeScript type checking.