This repository demonstrates a production-ready LLM application with model fallback, monitoring, and testing.
## Architecture

- FastAPI Application: REST API for LLM interactions with cascade fallback
- LiteLLM Proxy: unified interface for multiple LLM providers (OpenAI, Gemini, OpenRouter)
- MLflow: experiment tracking and prompt tracing
## Prerequisites

- Docker and Docker Compose
- API keys for:
  - OpenAI (GPT-4o)
  - Gemini 2.0 Flash
  - OpenRouter (Mistral 7B fallback)
## Quick Start

1. Set up the environment:

   ```bash
   cp .env.example .env  # Edit .env with your API keys
   ```

2. Start the services:

   ```bash
   docker-compose up -d --build
   ```

3. Access the services:

   - API: http://localhost:8000
   - LiteLLM: http://localhost:8001
   - MLflow UI: http://localhost:5000
## API Endpoints

### `POST /generate`

```
POST /generate
Content-Type: application/json

{
  "prompt": "Your prompt here",
  "model": "smart-router",
  "temperature": 0.7
}
```

### `GET /models`

Lists the available models.

### `GET /health`

Health check endpoint.

## Model Cascade

- Primary: `gpt-4o-primary` (OpenAI GPT-4o)
- Secondary: `gemini-secondary` (Gemini 2.0 Flash)
- Fallback: `openrouter-fallback` (Mistral 7B via OpenRouter)

Use the `smart-router` model name to enable automatic fallback.
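A request to the `/generate` endpoint can be issued with curl, assuming the stack is running locally on port 8000 (the prompt and temperature below are just examples):

```shell
# POST a prompt to the smart-router; the API falls back through the
# cascade (GPT-4o -> Gemini -> Mistral 7B) if a provider fails.
curl -s http://localhost:8000/generate \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Your prompt here", "model": "smart-router", "temperature": 0.7}'
```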
## MLflow Tracking

All LLM calls are tracked with:

- Input/output parameters
- Token usage and latency
- Success/failure status
- Full prompt/response history

Access the MLflow UI at http://localhost:5000.
## Project Structure

```
.
├── docker-compose.yml     # Service definitions
├── litellm-config.yaml    # LiteLLM model configuration
├── .env.example           # Template for environment variables
├── test-requirements.txt  # Testing dependencies
├── tests/                 # Integration tests
├── mlflow-data/           # MLflow experiment data
└── src/
    └── api/               # FastAPI application
        ├── main.py        # API endpoints
        └── Dockerfile     # API container setup
```
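As an illustration, the cascade could be wired up in `litellm-config.yaml` roughly like this. The upstream model identifiers, environment variable names, and fallback wiring below are assumptions based on LiteLLM's configuration conventions; check the actual file in this repo:

```yaml
# Sketch only -- see litellm-config.yaml for the real configuration.
model_list:
  - model_name: gpt-4o-primary
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
  - model_name: gemini-secondary
    litellm_params:
      model: gemini/gemini-2.0-flash
      api_key: os.environ/GEMINI_API_KEY
  - model_name: openrouter-fallback
    litellm_params:
      model: openrouter/mistralai/mistral-7b-instruct
      api_key: os.environ/OPENROUTER_API_KEY

router_settings:
  fallbacks:
    - gpt-4o-primary: [gemini-secondary, openrouter-fallback]
```

The `os.environ/...` syntax tells LiteLLM to read each key from the environment, which is why the `.env` file from the Quick Start is required.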
## Testing

Tests run inside the container:

```bash
docker-compose exec api pytest /app/tests/
```

## Useful Commands

```bash
docker-compose down     # Stop the services
docker-compose logs -f  # Follow the logs
```

## Persistent Data

- MLflow data: `./mlflow-data`
- Test coverage reports: `./htmlcov`
## Makefile Commands

The Makefile provides a set of commands to manage the environment and run tests:

```bash
# Check API health
make api-test

# List available models
make api-models

# Generate text with the fallback model
make api-generate PROMPT="What is the capital of France?"

# Generate text specifically with Gemini
make api-generate-gemini PROMPT="Explain quantum computing in simple terms"
```

Note: `jq` is required to parse the API responses.
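Since the targets parse API responses with `jq`, they presumably wrap `curl` calls against the endpoints above. A minimal sketch of what such a Makefile might contain (target bodies are illustrative, not the repo's actual Makefile):

```makefile
# Sketch only -- the repository's real Makefile may differ.
# Recipe lines must be indented with tabs.
API_URL ?= http://localhost:8000

api-test:
	curl -s $(API_URL)/health | jq .

api-models:
	curl -s $(API_URL)/models | jq .

api-generate:
	curl -s $(API_URL)/generate \
		-H "Content-Type: application/json" \
		-d '{"prompt": "$(PROMPT)", "model": "smart-router", "temperature": 0.7}' | jq .
```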