LLM Agent

A lightweight, distributed retrieval system using Dense Passage Retrieval (DPR). Send prompts to a router, which forwards them to healthy workers running DPR models for PDF document retrieval.

What You Get

✅ Distributed - Multiple workers share the load
✅ Reliable - Automatic failover and retries
✅ Observable - Built-in logging and error tracking
✅ Simple - ~500 lines of clean code
✅ Fast - Ready in seconds, retrieval in 1-5 seconds
✅ DPR-Powered - Facebook's Dense Passage Retrieval for accurate document retrieval

Start

# 1. Build
make build

# 2. Run
make up

# 3. Test
make test

# Done! Open another terminal to check logs
make logs

Stop with:

make down

How to Use

Send a Message

curl -X POST http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Hello, world!"}'

Response:

{
  "results": [
    {
      "passage": "Hello, world! This is a sample document...",
      "score": 0.92
    }
  ]
}
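The same request can be sent from Python. This is a minimal sketch using only the standard library; it assumes the router is reachable at localhost:8000 and that the response has the shape shown above. The helper names (build_chat_request, best_passage, ask) are illustrative and not part of the repository.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # router address taken from the curl example


def build_chat_request(prompt: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the same POST /chat request that the curl example sends."""
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def best_passage(response_text: str):
    """Return (passage, score) for the highest-scoring result, or None if empty."""
    results = json.loads(response_text).get("results", [])
    if not results:
        return None
    top = max(results, key=lambda r: r["score"])
    return top["passage"], top["score"]


def ask(prompt: str, timeout: float = 30.0):
    """Send the prompt to the router (requires the stack to be running)."""
    with urllib.request.urlopen(build_chat_request(prompt), timeout=timeout) as resp:
        return best_passage(resp.read().decode("utf-8"))
```

With the services up, ask("Hello, world!") returns the top passage and its score, or None when nothing matched.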

Check Status

# Router health
curl http://localhost:8000/health

# Worker status
curl http://localhost:8000/workers

# Error count
curl http://localhost:8000/errors

Configuration

Copy .env.example to .env and edit:

cp .env.example .env

Key settings:

DPR_QUESTION_ENCODER - Question encoder model (default: facebook/dpr-question_encoder-single-nq-base)
DPR_CONTEXT_ENCODER - Context encoder model (default: facebook/dpr-ctx_encoder-single-nq-base)
PDF_DIR - Directory containing PDF corpus (default: pdf_corpus)
DATA_DIR - Directory for persisting embeddings (default: data)
REQUEST_TIMEOUT - Seconds to wait for response (default: 30)
LOG_LEVEL - How much to log (default: INFO)
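As a rough sketch of how these settings could be read in Python: the variable names and defaults below come from the list above, but the Settings class and load_settings function are illustrative assumptions, and the actual config.py may be organized differently.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    question_encoder: str
    context_encoder: str
    pdf_dir: str
    data_dir: str
    request_timeout: float
    log_level: str


def load_settings(env=None) -> Settings:
    """Read settings from a mapping (os.environ by default), applying the documented defaults."""
    env = os.environ if env is None else env
    return Settings(
        question_encoder=env.get(
            "DPR_QUESTION_ENCODER", "facebook/dpr-question_encoder-single-nq-base"
        ),
        context_encoder=env.get(
            "DPR_CONTEXT_ENCODER", "facebook/dpr-ctx_encoder-single-nq-base"
        ),
        pdf_dir=env.get("PDF_DIR", "pdf_corpus"),
        data_dir=env.get("DATA_DIR", "data"),
        # Stored as text in .env, so convert to a number of seconds here.
        request_timeout=float(env.get("REQUEST_TIMEOUT", "30")),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
    )
```

Passing a plain dict instead of os.environ, for example load_settings({"REQUEST_TIMEOUT": "60"}), makes it easy to try out overrides without touching the real environment.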

What's Running?

Router (port 8000): Receives requests, picks a worker
Worker 1 (port 5000): Runs DPR models for document retrieval
Worker 2 (port 5000): Backup worker for redundancy

All three run in separate Docker containers, so both workers can listen on port 5000 without conflict. Workers index the PDFs in the pdf_corpus/ directory.
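The routing idea (skip unhealthy workers, fall back to the next one on failure) can be sketched as follows. This is not the code in router/app.py: the function name, the send and is_healthy callables, and the worker addresses in the usage note are all hypothetical, chosen only to illustrate failover.

```python
class NoWorkersAvailable(RuntimeError):
    """Raised when no healthy worker could answer the request."""


def forward_with_failover(prompt, workers, send, is_healthy):
    """Try each healthy worker in order and return the first successful response.

    send(worker, prompt) performs the actual call and may raise on failure;
    is_healthy(worker) reports whether the worker should be tried at all.
    """
    last_error = None
    for worker in workers:
        if not is_healthy(worker):
            continue  # skip workers that failed their health check
        try:
            return send(worker, prompt)
        except Exception as exc:  # failed mid-request: fall through to the next worker
            last_error = exc
    raise NoWorkersAvailable("No workers available") from last_error
```

For example, with workers = ["http://worker1:5000", "http://worker2:5000"] (assumed hostnames), a connection error from the first worker causes the same prompt to be retried on the second before any error reaches the client.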

What Files Do What?

File	Purpose
`router/app.py`	Request routing (85 lines)
`worker/app.py`	DPR retrieval over PDFs (270 lines)
`utils/logging_config.py`	Logging setup (52 lines)
`config.py`	Configuration (35 lines)
`docker-compose.yml`	Local deployment
`swarm-stack.yml`	Distributed deployment

Common Commands

make build          # Build Docker images
make up             # Start services
make down           # Stop services
make logs           # Watch all logs
make logs-router    # Router logs only
make logs-worker    # Worker logs only
make health         # Check status
make test           # Send test message
make errors         # Show errors
make clean          # Clean up

Troubleshooting

"No workers available"

Workers are starting. Wait 30 seconds, then try again.

make logs-worker    # Check worker logs
make health         # Check status
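Instead of waiting a fixed 30 seconds, a script can poll until the service responds. The sketch below assumes that an HTTP 200 from /health means the system is ready; if worker readiness is only visible through /workers, swap in a check against that endpoint. The function names are illustrative.

```python
import time
import urllib.request


def router_is_up(url: str = "http://localhost:8000/health", timeout: float = 2.0) -> bool:
    """Return True if the health endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # covers URLError, connection refused, and timeouts
        return False


def wait_until(check, attempts: int = 10, delay: float = 3.0, sleep=time.sleep) -> bool:
    """Call check() up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

Calling wait_until(router_is_up) with the defaults gives up after roughly 30 seconds, which matches the wait suggested above.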

"Timeout error"

The response took too long. Increase REQUEST_TIMEOUT (default: 30 seconds) in .env.
