Model Microservice

Machine learning service for training and inference using scikit-learn models.

🚀 Quick Start (Zero Configuration)

# From the project root directory
docker compose -f 'compose/compose.dev.yaml' up -d --build

# Access Swagger UI
open http://localhost:8001/docs  # Development mode

No setup needed! The service starts with pre-trained models ready for inference.

📋 Features

5 ML Algorithms: Random Forest, SVM, Decision Tree, KNN, Logistic Regression
Model Training: Train models with configurable feature selection
Model Persistence: Automatic saving and loading of trained models
RESTful API: Full Swagger/OpenAPI documentation

🏗️ API Documentation

Interactive API Explorer

Access the Swagger UI at: http://localhost:8001/docs (development mode)

Main Endpoints

GET /health - Service health check
GET /models - List all trained models with metadata
POST /models/train - Train a new model
POST /models/{id}/predict - Get prediction from specific model
DELETE /models/{id} - Delete a trained model
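
The endpoints above can also be called from a script instead of Swagger UI. The following is a minimal sketch using only the Python standard library; the helper names `build_request` and `call` are hypothetical, and it assumes the service exchanges JSON bodies at the development address shown in this README.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8001"  # development address from this README


def build_request(method, path, payload=None):
    """Build (but do not send) a request for one of the endpoints listed above."""
    data = None
    headers = {"Accept": "application/json"}
    if payload is not None:
        # Assumption: request bodies are JSON, as in the Swagger examples.
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


def call(method, path, payload=None, timeout=10):
    """Send the request and decode the body, assuming the service replies with JSON."""
    request = build_request(method, path, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = response.read()
    return json.loads(body) if body else None
```

With the service running, `call("GET", "/health")` and `call("GET", "/models")` exercise the first two endpoints; the exact response shapes are best checked in Swagger UI.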

🤖 Available Algorithms

| Algorithm | ID | Configurable Parameters |
| --- | --- | --- |
| Random Forest | `rf` | `n_estimators` |
| Support Vector Machine | `svm` | - |
| Decision Tree | `dt` | - |
| K-Nearest Neighbors | `knn` | `n_neighbors` |
| Logistic Regression | `lr` | - |
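
The table above can be turned into a small client-side check before sending a training request. This is only a sketch: `ALGORITHMS` and `validate_algo` are hypothetical names, and the service's own validation (presumably in schemas.py) may accept or reject different inputs.

```python
# Algorithm IDs and their configurable parameters, copied from the table above.
ALGORITHMS = {
    "rf": {"n_estimators"},
    "svm": set(),
    "dt": set(),
    "knn": {"n_neighbors"},
    "lr": set(),
}


def validate_algo(algo):
    """Return a list of problems in an `algo` object; an empty list means none found."""
    name = algo.get("name")
    if name not in ALGORITHMS:
        return [f"unknown algorithm ID: {name!r}"]
    # Any key other than "name" must be a parameter the table lists for this ID.
    extra = set(algo) - {"name"} - ALGORITHMS[name]
    return [f"{param!r} is not configurable for {name!r}" for param in sorted(extra)]
```

For example, `validate_algo({"name": "rf", "n_estimators": 150})` returns an empty list, while passing `n_neighbors` with `svm` is reported as a problem.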

📊 Available Features

Features from the Titanic dataset (select which ones to use during training):

pclass - Passenger class (1, 2, 3)
sex - Gender (male/female)
age - Age in years
fare - Ticket fare
embarked - Port of embarkation
title - Extracted from name (Mr, Mrs, etc.)
is_alone - Traveling alone flag
age_class - Age × Class interaction
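
The last three features are derived rather than read directly from the dataset. The sketch below shows one plausible way to compute them from a raw Titanic record. It is an assumption, not the service's actual preprocessing: it takes "alone" to mean no siblings, spouses, parents, or children aboard (the dataset's SibSp and Parch columns), and the function names are hypothetical.

```python
import re


def extract_title(name):
    """Pull the honorific out of a Titanic-style name such as 'Braund, Mr. Owen Harris'."""
    # The title sits between the comma after the surname and the following period.
    match = re.search(r",\s*([^.,]+)\.", name)
    return match.group(1).strip().lower() if match else None


def derive_features(age, pclass, sibsp, parch, name):
    """Compute title, is_alone, and age_class under the assumptions stated above."""
    return {
        "title": extract_title(name),
        "is_alone": (sibsp + parch) == 0,  # assumption: no family members aboard
        "age_class": age * pclass,         # Age × Class interaction
    }
```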

🛠️ Development Workflow

Testing the API with Swagger

Go to http://localhost:8001/docs
Call GET /models to list the pre-loaded models

Test a prediction with POST /models/{model_id}/predict:

{
  "pclass": 1,
  "sex": "female",
  "age": 30,
  "fare": 100,
  "travelled_alone": false,
  "embarked": "cherbourg",
  "title": "mrs"
}

Train a custom model with /models/train

🧪 Testing

cd model

# Install dependencies (if not already done)
uv sync --extra dev

# Run tests
uv run pytest

# Linting and formatting check
uv run ruff check
uv run ruff format --check

# Auto-fix formatting
uv run ruff format

📁 Project Structure

model/
├── main.py              # FastAPI application
├── models_router.py     # Model management endpoints
├── schemas.py           # Pydantic data models
├── train.py            # Training script
├── utils/
│   ├── data.py         # Data preprocessing
│   ├── models.py       # Model loading/saving
│   └── model_factory.py # Algorithm factory
├── data/               # Included dataset
│   ├── train.csv
│   ├── test.csv
│   └── gender_submission.csv
└── tests/              # Test suite

🔧 Model Training

Using Swagger UI

Go to http://localhost:8001/docs
Expand POST /models/train
Click "Try it out"

Use this example request:

{
  "algo": {
    "name": "rf",
    "n_estimators": 150
  },
  "features": ["pclass", "sex", "age", "fare"],
  "random_state": 42
}

Click "Execute"
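
The same training request can be assembled in code. This is a minimal sketch: `build_train_payload` is a hypothetical helper, the feature names come from the Available Features list, and rejecting unknown features on the client side is an illustrative choice rather than documented service behavior.

```python
# Feature names from the "Available Features" section of this README.
AVAILABLE_FEATURES = {
    "pclass", "sex", "age", "fare", "embarked", "title", "is_alone", "age_class",
}


def build_train_payload(algo_name, features, random_state=None, **algo_params):
    """Build a body for POST /models/train in the shape of the example request."""
    unknown = [name for name in features if name not in AVAILABLE_FEATURES]
    if unknown:
        raise ValueError(f"unknown features: {unknown}")
    payload = {
        "algo": {"name": algo_name, **algo_params},
        "features": list(features),
    }
    if random_state is not None:
        payload["random_state"] = random_state
    return payload
```

Calling `build_train_payload("rf", ["pclass", "sex", "age", "fare"], random_state=42, n_estimators=150)` reproduces the example request above.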

Training Response

{
  "id": "trained-abc123",
  "params": { ... },
  "info": {
    "accuracy": 0.85
  }
}

📈 Making Predictions

Using Swagger UI

Get a model ID from the GET /models endpoint
Call POST /models/{model_id}/predict
Provide the passenger data in the request body
Receive a survival prediction with its probability

Prediction Request

{
  "pclass": 3,
  "sex": "male",
  "age": 25,
  "fare": 15.5,
  "travelled_alone": true,
  "embarked": "southampton",
  "title": "mr"
}

Prediction Response

{
  "survived": false,
  "probability": 0.78
}
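
A client can check a passenger record before sending it and turn the response into a readable line. The sketch below is based only on the example request and response in this section. Treating every field as required is an assumption, the helper names are hypothetical, and the summary does not assume which class the probability refers to.

```python
# Field names taken from the example prediction request above.
# Assumption: all of them are required by the service.
REQUIRED_FIELDS = (
    "pclass", "sex", "age", "fare", "travelled_alone", "embarked", "title",
)


def check_passenger(passenger):
    """Return a list of problems in a prediction request; empty means none found."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in passenger]
    if "pclass" in passenger and passenger["pclass"] not in (1, 2, 3):
        problems.append("pclass must be 1, 2, or 3")
    if "sex" in passenger and passenger["sex"] not in ("male", "female"):
        problems.append("sex must be 'male' or 'female'")
    return problems


def summarize(response):
    """Format a prediction response as a single human-readable line."""
    outcome = "survived" if response["survived"] else "did not survive"
    return f"Predicted: {outcome} (reported probability {response['probability']:.2f})"
```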

💾 Data Management

Model Storage

Models are saved automatically to /data/models/ and persist across container restarts.
Each model includes:
- model.pkl - Serialized scikit-learn model
- params.json - Training parameters
- info.json - Model metadata and accuracy
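
For inspecting a saved model outside the service, the three files can be read back as follows. This is a sketch under assumptions: it supposes each model has its own directory under /data/models/ and that model.pkl was written with Python's pickle module (the service may use a different serializer, such as joblib). `load_model_artifacts` is a hypothetical name.

```python
import json
import pickle
from pathlib import Path


def load_model_artifacts(model_dir):
    """Load the model, its training parameters, and its metadata from one directory."""
    model_dir = Path(model_dir)
    # Only unpickle files you trust: pickle can execute arbitrary code on load.
    with open(model_dir / "model.pkl", "rb") as handle:
        model = pickle.load(handle)
    params = json.loads((model_dir / "params.json").read_text(encoding="utf-8"))
    info = json.loads((model_dir / "info.json").read_text(encoding="utf-8"))
    return model, params, info
```

Unpickling a scikit-learn model also requires scikit-learn to be installed, ideally in the same version that was used for training.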

Pre-loaded Models

On startup, the service loads:

rf - Random Forest
svm - Support Vector Machine
knn - K-Nearest Neighbors
lr - Logistic Regression

🐳 Production Deployment

The service is production-ready when deployed via:

docker compose -f compose/compose.prod-local.yaml up
