
metorresponce/ml-mini-pipeline

Available languages: English / Español


License: MIT

Machine Learning Mini Pipeline

A minimal machine learning pipeline built with scikit-learn, intended as a portfolio project.

It generates synthetic data, trains a LogisticRegression model, writes artifacts (model + metrics), and validates the pipeline and the metrics schema in GitHub Actions.

What it does

  • Generates a synthetic dataset
  • Trains a model (LogisticRegression)
  • Writes run artifacts to ./artifacts/
    • model.pkl
    • metrics.json
  • Runs pytest in CI to validate the pipeline and the metrics schema
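
The steps above can be sketched in a few lines of scikit-learn. This is an illustration only, not the code in src/pipeline.py: the function name run_pipeline_sketch, the dataset size, the train/test split, and the use of pickle for model.pkl are all assumptions made for the example.

```python
import json
import pickle
from datetime import datetime, timezone
from pathlib import Path

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split


def run_pipeline_sketch(out_dir="artifacts", seed=42):
    """Hypothetical stand-in for the repo's train_and_evaluate()."""
    # Synthetic binary classification data (sizes are illustrative).
    X, y = make_classification(n_samples=500, n_features=10, random_state=seed)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )

    # Fit the model and predict on the held-out split.
    model = LogisticRegression(max_iter=1000)
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)

    # Same keys as the example metrics.json in this README.
    metrics = {
        "accuracy": float(accuracy_score(y_test, y_pred)),
        "precision": float(precision_score(y_test, y_pred)),
        "recall": float(recall_score(y_test, y_pred)),
        "f1": float(f1_score(y_test, y_pred)),
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "model_type": type(model).__name__,
    }

    # Write both artifacts into the output directory.
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    with open(out / "model.pkl", "wb") as fh:
        pickle.dump(model, fh)
    (out / "metrics.json").write_text(json.dumps(metrics, indent=2))
    return metrics
```

Calling run_pipeline_sketch() with no arguments writes both files to ./artifacts/ and returns the metrics dictionary. Fixing the seed keeps the synthetic data and the split reproducible between runs.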

Validate 100% online (GitHub Actions)

No local setup is required to evaluate this repo.

  1. Push this repo to GitHub
  2. Go to Actions -> tests (ci.yml) -> Run workflow
  3. Go to Actions -> model-ci (model-ci.yml) -> Run workflow (optional)
  4. Both workflows should turn green

Evidence: the workflow logs and, if artifact upload is enabled, the downloadable artifacts.

Generated artifacts

artifacts/
├── model.pkl
└── metrics.json

Example metrics.json:

{
  "accuracy": 0.89,
  "precision": 0.88,
  "recall": 0.91,
  "f1": 0.89,
  "timestamp": "2025-08-24T18:00:00Z",
  "model_type": "LogisticRegression"
}
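
A schema check for this file could look roughly like the sketch below. It is not the contents of tests/test_metrics_schema.py: the function name validate_metrics and the exact rules (scores within [0, 1], a UTC timestamp ending in Z, a string model_type) are assumptions inferred from the example above.

```python
from datetime import datetime

# Score keys taken from the example metrics.json.
REQUIRED_SCORES = ("accuracy", "precision", "recall", "f1")


def validate_metrics(metrics):
    """Return a list of problems; an empty list means the dict looks valid."""
    problems = []
    for key in REQUIRED_SCORES:
        value = metrics.get(key)
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"{key}: expected a number, got {value!r}")
        elif not 0.0 <= value <= 1.0:
            problems.append(f"{key}: {value} is outside [0, 1]")
    if not isinstance(metrics.get("model_type"), str):
        problems.append("model_type: expected a string")
    try:
        # Matches timestamps shaped like 2025-08-24T18:00:00Z.
        datetime.strptime(str(metrics.get("timestamp")), "%Y-%m-%dT%H:%M:%SZ")
    except ValueError:
        problems.append("timestamp: expected UTC ISO 8601 like 2025-08-24T18:00:00Z")
    return problems


example = {
    "accuracy": 0.89,
    "precision": 0.88,
    "recall": 0.91,
    "f1": 0.89,
    "timestamp": "2025-08-24T18:00:00Z",
    "model_type": "LogisticRegression",
}
assert validate_metrics(example) == []
```

Returning a list of problems rather than raising on the first one makes a failing CI log show every schema violation at once. The example values are also internally consistent: F1 is the harmonic mean of precision and recall, and 2 × 0.88 × 0.91 / (0.88 + 0.91) ≈ 0.89.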

CI (GitHub Actions)

  • ci.yml (tests)

    • Installs dependencies
    • Runs pytest (unit tests)
    • Validates code quality gates for the pipeline
  • model-ci.yml (model training)

    • Trains the model in CI
    • Generates artifacts (model.pkl + metrics.json)
    • Optionally uploads artifacts from the workflow run

If you want to publish artifacts from CI, add this step at the end of the job in model-ci.yml:

- name: Upload artifacts (model + metrics)
  uses: actions/upload-artifact@v4
  with:
    name: artifacts
    path: artifacts/**

Local run (optional)

Local execution is optional and mainly useful for development.

python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt -r requirements-dev.txt

python -c "from src.pipeline import train_and_evaluate; train_and_evaluate()"
pytest -q
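
After a local run, the generated files can be inspected from Python. The sketch below assumes model.pkl was written with the standard pickle module (the repo may use joblib instead); load_artifacts is a hypothetical helper, not part of src/pipeline.py.

```python
import json
import pickle
from pathlib import Path


def load_artifacts(artifacts_dir="artifacts"):
    """Load the pickled model and the metrics dict from an artifacts directory."""
    root = Path(artifacts_dir)
    # Only unpickle files you produced yourself: pickle can execute code on load.
    with open(root / "model.pkl", "rb") as fh:
        model = pickle.load(fh)
    metrics = json.loads((root / "metrics.json").read_text())
    return model, metrics
```

For example, `model, metrics = load_artifacts()` followed by `print(metrics["accuracy"])` shows the score from the last run. Unpickling a scikit-learn model requires scikit-learn to be installed in the same environment, ideally at the version used for training.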

Structure

.
├── artifacts/               # generated by the pipeline (gitignored)
├── src/
│   ├── __init__.py
│   └── pipeline.py          # orchestration (train_and_evaluate)
├── tests/
│   ├── test_metrics_schema.py
│   └── test_pipeline.py
├── requirements.txt
├── requirements-dev.txt
└── .github/workflows/
    ├── ci.yml
    └── model-ci.yml

Releases

Create a release from Releases -> Draft new release with a version tag (e.g. v0.1.0). Mark it as the latest release so it stays visible on the repository's main page.

Credits

Portfolio repository by @metorresponce. Inspired by minimalist MLOps practices (synthetic data + artifacts + CI).

See also: Code of Conduct · Contributing · Security

