This project demonstrates an end-to-end Machine Learning Operations (MLOps) workflow using MLflow for experiment tracking, model management, artifact generation, and model registration.
The repository provides a modular environment for training multiple machine learning models, logging experiments, comparing their performance, and managing registered models through the MLflow web interface.
The repository includes scripts for:
- Exploratory Data Analysis (EDA).
- Feature visualization.
- Training multiple machine learning algorithms.
- Logging metrics, parameters, and artifacts.
- Registering trained models.
- Comparing model performance through the MLflow Tracking UI.
The experiments are conducted using the Breast Cancer Wisconsin Dataset, where several supervised classification algorithms are trained and evaluated.
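As a rough illustration of how one such training run could be tracked, the sketch below trains a logistic regression on scikit-learn's copy of the Breast Cancer Wisconsin data and logs its parameters, metrics, and fitted model to MLflow. It is not taken from the repository's scripts: the experiment name, run name, hyperparameters, train/test split, and the `summarize_metrics` helper are all assumptions made for the example.

```python
def summarize_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision and recall for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    return {
        "accuracy": correct / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


def train_and_log(experiment_name="breast-cancer-demo"):
    """Train one classifier and record the run in MLflow (hypothetical names)."""
    # Imported here so the helper above stays usable without these packages.
    import mlflow
    import mlflow.sklearn
    from sklearn.datasets import load_breast_cancer
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    # In scikit-learn's encoding, label 0 is malignant and label 1 is benign.
    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42, stratify=y
    )

    params = {"C": 1.0, "max_iter": 1000}  # assumed hyperparameters
    mlflow.set_experiment(experiment_name)
    with mlflow.start_run(run_name="logistic-regression"):
        model = make_pipeline(StandardScaler(), LogisticRegression(**params))
        model.fit(X_train, y_train)
        metrics = summarize_metrics(y_test.tolist(), model.predict(X_test).tolist())

        mlflow.log_params(params)                  # hyperparameters
        mlflow.log_metrics(metrics)                # evaluation results
        mlflow.sklearn.log_model(model, "model")   # fitted model as an artifact
    return metrics
```

Calling `train_and_log()` with MLflow and scikit-learn installed should create one run under the chosen experiment. Repeating the pattern with other estimators would make the runs comparable side by side in the Tracking UI.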
The entire application is containerized using Docker, providing a reproducible environment for experimentation.
Key features:
- 📊 Experiment tracking with MLflow.
- 🧠 Training multiple machine learning models.
- 📦 Automatic artifact generation.
- 🗂️ Model registry support.
- 🐳 Docker-based deployment.
- 🖥️ Compatible with Windows.
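To make the model registry feature more concrete, here is a minimal sketch of registering a model that an earlier run has already logged. The registered model name, the `model` artifact path, and both helper functions are hypothetical and may differ from what the repository's own registration script does.

```python
def model_uri_for_run(run_id, artifact_path="model"):
    """Build the 'runs:/<run_id>/<artifact_path>' URI for a logged model."""
    if not run_id:
        raise ValueError("run_id must be a non-empty string")
    return f"runs:/{run_id}/{artifact_path.strip('/')}"


def register_run_model(run_id, registered_name="breast-cancer-classifier"):
    """Register the model logged by `run_id` under an assumed registry name."""
    import mlflow  # imported lazily; requires MLflow to be installed

    version = mlflow.register_model(model_uri_for_run(run_id), registered_name)
    # Each call creates a new version under the same registered name.
    return version.name, version.version
```

The run ID would normally be copied from the MLflow UI or taken from the active run object right after training.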
This project uses:
- Python 3.11+
- Pandas
- NumPy
- Matplotlib
- MLflow
- Docker
You will need:
- Python 3.11+
- Docker Desktop
- Internet connection
git clone https://github.com/fabriciolopretto/Administracion-de-Modelos.git
cd Administracion-de-Modelos

Ensure Docker Desktop is running before building the container.
From the directory containing the Dockerfile, execute:
docker build -t image_name .
docker run -it \
--name container_name \
-p 5000:5000 \
-v "$(pwd)/TP_Final/mlflow/experiments/models/mlruns:/app/mlruns" \
image_name

Run the exploratory data analysis and visualization scripts:
python Distributions.py
python HeatMap.py

Train the models:
python RegLog.py
python KNN.py
python SVC.py
python TreeClasf.py

Register the logistic regression model and run predictions with it:
python registro_reg_log.py
python predictions_reglog_model.py

Once the container is running, open your browser and navigate to:
http://localhost:5000
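If the page does not load, a quick check from Python can tell whether anything is answering on that port. The sketch below uses only the standard library. The function name and the timeout are assumptions, and the URL matches the `-p 5000:5000` port mapping used in the `docker run` command.

```python
import urllib.request


def mlflow_ui_is_up(url="http://localhost:5000", timeout=3.0):
    """Return True if an HTTP 200 response comes back from `url`."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers refused connections, timeouts, and HTTP error statuses.
        return False
```

A `False` result usually means the container is not running or the port mapping is missing. It does not distinguish between those cases.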
The repository includes:
- Machine learning training scripts.
- Exploratory data analysis notebooks and scripts.
- MLflow experiment logs.
- Registered model artifacts.
- Dockerfile for reproducible deployment.
- requirements.txt.
- config.ini for PostgreSQL configuration.
- MLflow run history and artifacts.
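The layout of `config.ini` is not described here, so the following is only a sketch of how PostgreSQL settings from such a file could be turned into a database URI for MLflow. The `[postgresql]` section, its key names, and the sample values are assumptions, not the repository's actual configuration.

```python
import configparser
from urllib.parse import quote_plus

# Hypothetical file contents; the real config.ini may use other names.
EXAMPLE_CONFIG = """
[postgresql]
host = localhost
port = 5432
database = mlflow_db
user = mlflow_user
password = p@ss word
"""


def backend_store_uri(config_text, section="postgresql"):
    """Build a postgresql:// URI from INI-formatted text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    cfg = parser[section]

    # Escape characters such as '@' that would otherwise break the URI.
    user = quote_plus(cfg["user"])
    password = quote_plus(cfg["password"])
    port = cfg.get("port", fallback="5432")
    return f"postgresql://{user}:{password}@{cfg['host']}:{port}/{cfg['database']}"


print(backend_store_uri(EXAMPLE_CONFIG))
```

A URI of this form can be passed to `mlflow server --backend-store-uri` so that runs and registered models are stored in the database rather than in local files.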
The experiments use the Breast Cancer Wisconsin Dataset, which contains numerical features extracted from digitized images of breast cell nuclei. The objective is to classify tumors as benign or malignant.
For questions, suggestions, or collaborations, please feel free to get in touch.
