# Introduction to LLMOps

A basic LLMOps application using FastAPI, LangChain, ChromaDB, and Ollama for a simple insurance chatbot.

## Features
- FastAPI web API with chat endpoints
- Ollama integration for local LLM inference
- ChromaDB for vector storage and retrieval
- Document loading and indexing
- RAG (Retrieval-Augmented Generation) capabilities
- Health checks and monitoring
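To make the RAG flow concrete, here is a minimal sketch of how these pieces fit together at query time, using the packages listed under Dependencies below (the actual `app/main.py` may differ):

```python
from langchain_chroma import Chroma
from langchain_ollama import ChatOllama, OllamaEmbeddings

# Vector store over previously indexed documents (see load_documents.py)
store = Chroma(
    persist_directory="./data/vector_store",
    embedding_function=OllamaEmbeddings(model="gemma2:2b"),
)
llm = ChatOllama(model="gemma2:2b", base_url="http://localhost:11434")

question = "How do I file an auto insurance claim?"

# Retrieve the most relevant chunks and stuff them into the prompt
context = "\n\n".join(d.page_content for d in store.similarity_search(question, k=3))
answer = llm.invoke(f"Answer using this context:\n{context}\n\nQuestion: {question}")
print(answer.content)
```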
## Prerequisites

- Python 3.12
- Ollama installed and running
- macOS (tested) or Linux
## Quick Start

1. Clone and set up:

   ```bash
   git clone <your-repo>
   cd Introduction-to-LLMOps
   ./setup.sh
   ```
2. Add documents:

   ```bash
   # Add your documents to data/documents/
   # Supports .txt and .md files
   ```
3. Index the documents:

   ```bash
   python load_documents.py
   ```
4. Start the API:

   ```bash
   uvicorn app.main:app --reload
   ```
5. Test the API:
   - Visit: http://localhost:8000/docs
   - Health check: http://localhost:8000/health
   - Chat: POST to http://localhost:8000/chat (see the example below)
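For example, with the server running, you can exercise the chat endpoint from the command line (the request body matches the schema shown under API Endpoints below):

```bash
curl -X POST http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "How do I file an auto insurance claim?", "use_context": true}'
```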
## API Endpoints

- `GET /` - Root endpoint
- `GET /health` - Health check
- `GET /info` - System information
- `POST /chat` - Chat with the bot

Example request:

```json
{
  "message": "How do I file an auto insurance claim?",
  "use_context": true
}
```

Example response:

```json
{
"response": "To file an auto insurance claim, you should...",
"sources": ["data/documents/insurance_faq.md"]
}Edit .env file to configure:
# Ollama Configuration
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=gemma2:2b
# Vector Store
CHROMA_PERSIST_DIRECTORY=./data/vector_store
# App Settings
API_TITLE=Simple Insurance Chatbot
API_VERSION=1.0.0
```

## Project Structure

```
├── app/
│   └── main.py            # FastAPI application
├── data/
│   ├── documents/         # Place your documents here
│   └── vector_store/      # ChromaDB storage
├── load_documents.py      # Document indexing script
├── setup.sh               # Setup script
├── run.sh                 # Test script
├── requirements.txt       # Python dependencies
└── .env                   # Configuration
```
## Dependencies

Core packages:
- `fastapi` - Web framework
- `langchain` - LLM framework
- `langchain_ollama` - Ollama integration
- `langchain_chroma` - ChromaDB integration
- `chromadb` - Vector database
- `python-dotenv` - Environment management
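As an illustration of how the configuration could be read with `python-dotenv` (the real `app/main.py` isn't shown in this README, so treat this as a sketch):

```python
import os

from dotenv import load_dotenv  # python-dotenv

# Pull variables from .env into the process environment
load_dotenv()

# The settings documented in the Configuration section, with the same defaults
OLLAMA_BASE_URL = os.getenv("OLLAMA_BASE_URL", "http://localhost:11434")
OLLAMA_MODEL = os.getenv("OLLAMA_MODEL", "gemma2:2b")
CHROMA_PERSIST_DIRECTORY = os.getenv("CHROMA_PERSIST_DIRECTORY", "./data/vector_store")
API_TITLE = os.getenv("API_TITLE", "Simple Insurance Chatbot")
API_VERSION = os.getenv("API_VERSION", "1.0.0")
```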
## Development

- Test the environment: `./run.sh`
- Start the development server: `uvicorn app.main:app --reload`
- Add new documents: add files to `data/documents/` and run `python load_documents.py` (a sketch of such a script follows)
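The repository's `load_documents.py` is not reproduced in this README; a minimal indexing script, assuming the packages listed above plus `langchain-community` and `langchain-text-splitters` for loading and chunking, might look like:

```python
from langchain_chroma import Chroma
from langchain_community.document_loaders import DirectoryLoader, TextLoader
from langchain_ollama import OllamaEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Load every .txt and .md file under data/documents/
docs = []
for pattern in ("**/*.txt", "**/*.md"):
    loader = DirectoryLoader("data/documents", glob=pattern, loader_cls=TextLoader)
    docs.extend(loader.load())

# Split documents into overlapping chunks for retrieval
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_documents(docs)

# Embed the chunks with Ollama and persist them in ChromaDB
Chroma.from_documents(
    chunks,
    embedding=OllamaEmbeddings(model="gemma2:2b"),
    persist_directory="./data/vector_store",
)
```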
## Troubleshooting

- Ollama not running:

  ```bash
  ollama serve
  ```

- Model not available:

  ```bash
  ollama pull gemma2:2b
  ```

- Dependency issues:

  ```bash
  pip install -r requirements.txt
  ```

- Check health:

  ```bash
  curl http://localhost:8000/health
  ```
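Since `app/main.py` isn't reproduced here, the following is only a sketch of what a health endpoint that also pings Ollama might look like; it uses Ollama's `GET /api/tags` model-listing route, and `httpx` is an assumption, as it's not in the dependency list above:

```python
import httpx
from fastapi import FastAPI

app = FastAPI(title="Simple Insurance Chatbot", version="1.0.0")

OLLAMA_BASE_URL = "http://localhost:11434"

@app.get("/health")
def health() -> dict:
    """Report API liveness and whether Ollama is reachable."""
    try:
        # /api/tags lists installed models; a 200 means Ollama is up
        resp = httpx.get(f"{OLLAMA_BASE_URL}/api/tags", timeout=2.0)
        ollama_ok = resp.status_code == 200
    except httpx.HTTPError:
        ollama_ok = False
    return {"status": "ok" if ollama_ok else "degraded", "ollama": ollama_ok}
```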
## Extending

This is a basic setup. You can extend it by:

- Adding more document types
- Implementing user authentication
- Adding conversation memory
- Implementing evaluation frameworks
- Adding a web UI
- Deploying to production
## License

MIT License