Commit ac3ddcc

committed
updated readme
1 parent 872d2a5 commit ac3ddcc

1 file changed

Lines changed: 28 additions & 3 deletions

README.md

@@ -1,6 +1,31 @@
-# Dealer RAG API
+# Thought-Processor
 
-A modern, interview-ready, and production-grade Retrieval-Augmented Generation (RAG) backend. This project provides a robust API to ingest PDF documents containing text and tables, process them asynchronously, and query them using state-of-the-art Large Language Models (LLMs) with conversational memory.
+> "A production-grade RAG backend that takes complex PDF manuals as input and asynchronously provisions highly-accurate, conversational AI answers."
+
+## Architecture
+
+```mermaid
+graph TD
+User([User]) -->|HTTP| Nginx[Nginx :80]
+Nginx -->|Proxy Pass| API[API Container :8000]
+API -.->|Cache & Chat History| Redis[(Redis :6379)]
+API -.->|Job Status| Postgres[(PostgreSQL :5432)]
+API -->|Job Queue| Redis
+Redis -->|Dequeue Job| Worker[Background Worker]
+Worker -->|OCR & Chunking| Qdrant[(Qdrant Cloud)]
+API -->|Vector Search| Qdrant
+API -->|LLM Query| Groq[Groq API]
+```
+
+## Results / Benchmarks
+
+- "Context precision: 0.61 (naive PyPDF) ➔ 0.88 (Docling OCR + semantic chunking). Latency: ~350ms TTFT."
+
+## Technical Decisions
+
+1. **Redis Queue (rq) over Celery:** To avoid blocking the FastAPI server during document uploads, we introduced a background worker. We chose `rq` because it is Python-native, simple to configure, and allowed us to maximize our existing Redis container, which was already handling API caching and conversational history.
+2. **Docling for OCR:** Automotive manuals are heavy with complex tables and images. Basic tools like PyPDF fail to preserve this structure. Using Docling required adding graphics libraries (`libGL`) to our Dockerfile, increasing the container size. We accepted this trade-off because it drastically improved vector precision and allowed the LLM to understand tabular technical specs.
+3. **Immutable Deployment CD Pipeline:** In our GitHub Actions CD workflow, we deploy to our DigitalOcean droplet using `git fetch` and `git reset --hard origin/main`. This strict approach ensures the droplet exactly mirrors the repository, treating the server as an immutable target and eliminating manual branch divergence errors.
 
 ## Features
 
@@ -11,7 +36,7 @@ A modern, interview-ready, and production-grade Retrieval-Augmented Generation (
 - **Job Tracking**: Persistent job status tracking (PENDING, PROCESSING, COMPLETED, FAILED) via **PostgreSQL**.
 - **Idempotency & Retries**: Robust job failure handling with exponential backoff and retry mechanisms.
 - **Caching**: Frequently asked questions are cached in Redis to lower latency and LLM costs.
-- **Telemetry & Health Probes**: Includes structured logging, request IDs via middleware, and Kubernetes-style probes (`/health`, `/ready`).
+- **Telemetry & Health Probes**: Includes structured logging, request IDs via middleware, and Kubernetes-style probes (`/health`, `ready`).
 - **Containerized**: Fully orchestrated with **Docker Compose** for easy setup and reproducibility.
 
 ## Technology Stack

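The README text in this diff lists four job states (PENDING, PROCESSING, COMPLETED, FAILED) and "exponential backoff and retry mechanisms", but the commit does not show how they fit together. The sketch below is one possible reading, using only the Python standard library. The names `JobStatus`, `backoff_delays`, `run_job`, and `on_status` are hypothetical and do not come from the repository. The default delays are assumptions, and the project may well delegate retries to its queue library instead.

```python
import enum
import time
from typing import Callable, Iterator


class JobStatus(enum.Enum):
    """The four states named in the README's Job Tracking bullet."""
    PENDING = "PENDING"
    PROCESSING = "PROCESSING"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"


def backoff_delays(base: float, factor: float, max_retries: int, cap: float) -> Iterator[float]:
    """Yield the wait before each retry: base, base*factor, ..., never above cap."""
    for attempt in range(max_retries):
        yield min(cap, base * factor ** attempt)


def run_job(
    task: Callable[[], object],
    on_status: Callable[[JobStatus], None],
    *,
    base: float = 1.0,        # assumed first delay in seconds
    factor: float = 2.0,      # assumed growth factor per retry
    max_retries: int = 3,     # retries after the first attempt
    cap: float = 30.0,        # assumed upper bound on any single delay
    sleep: Callable[[float], None] = time.sleep,
) -> JobStatus:
    """Run task once plus up to max_retries retries, reporting status changes.

    on_status stands in for whatever persists the state (PostgreSQL in the
    README); here it is just a callback so the sketch stays self-contained.
    """
    on_status(JobStatus.PROCESSING)
    delays = backoff_delays(base, factor, max_retries, cap)
    while True:
        try:
            task()
        except Exception:
            delay = next(delays, None)
            if delay is None:
                # Retry budget exhausted: record the failure and stop.
                on_status(JobStatus.FAILED)
                return JobStatus.FAILED
            sleep(delay)
        else:
            on_status(JobStatus.COMPLETED)
            return JobStatus.COMPLETED
```

Passing `sleep` as a parameter keeps the function testable without real waiting: a caller can pass `list.append` on a local list to record the delays that would have been slept.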