This project focuses on the analysis and development of complex models for image recognition using the FoodX_251 dataset. Throughout the project, various machine learning models were explored, performance on weak datasets was analyzed, and several aspects related to image quality were addressed.
The first part of the project is dedicated to building a complex model that combines an ensemble of pre-trained models with a custom Convolutional Neural Network (CNN) for food recognition and classification from images.
The second part focuses on improving a degraded dataset. We assume a scenario where the available images are compromised by noise. The goal is to optimize classification performance both by enhancing the quality of the available data and by training classifiers directly on degraded images.
- `utils/`: Contains the `requirements.txt` file for setting up the virtual environment.
- `dataset/`: Contains the necessary data for the project. Download and extract the following datasets into this folder: `train_set`, `test_set`, `val_set_degraded`.
- `FoodX_251/`: Includes data analysis, exploration of simple models, and the creation of the complex model.
- `ground_truth/`: Contains the ground truth files. Some of them have been modified to handle missing images and to simplify data splitting.
- `models/`: Contains the model classes used and discarded throughout the project. The `trained_models/` subfolder includes the pre-trained models. Files ending with `-1` refer to noisy models.
- `scripts/`: Contains essential classes and scripts fundamental to the project.
- `graphs/`: Contains training graphs for the various models.
1. Create a virtual environment:

   ```
   python -m venv environment_name
   ```

2. Activate the virtual environment:

   - On Windows:

     ```
     environment_name\Scripts\activate
     ```

   - On Linux/macOS:

     ```
     source environment_name/bin/activate
     ```

3. Install dependencies: the `requirements.txt` file is located in the `utils` folder. To install all dependencies, run:

   ```
   pip install -r utils/requirements.txt
   ```

4. Prepare the data: download and extract the following datasets into the `dataset/` folder: `train_set`, `test_set`, `val_set_degraded`.

5. Run data analysis: the data analysis is included in the `FoodX_251` folder, which also contains the exploration of simple models and the creation of the complex model. Analysis related to degraded images can be found in the `degraded_images.ipynb` notebook.

6. Model training: models are located in the `models/` folder. Pre-trained models are available in the `trained_models/` subfolder. You can also train models from scratch using the source code in the `scripts/` folder.

7. Training graphs: the training graphs for the various models are located in the `graphs/` folder.
Updating Training Images and Cyclic Training of the Ensemble
The script `ensamble_image_increment.py` manages the update of training images and the cyclic training of the ensemble. This process progressively adds new images to the training dataset, improving the model's accuracy over time.
To update the training images and train the ensemble cyclically, simply run the script. The script will:
- Add new images to the training dataset.
- Retrain the ensemble models on the updated dataset.
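The two steps above can be sketched as a simple loop. This is a hedged illustration only: `train_ensamble()` is the ensemble method documented below, but the `add_images()` helper and the exact internals of `ensamble_image_increment.py` are assumptions, not the script's actual API.

```python
# Hedged sketch of the cyclic loop performed by ensamble_image_increment.py.
# add_images() is an assumed dataset helper; train_ensamble() follows the
# EnsambleModel API described later in this README.

def cyclic_training(ensemble, train_dataset, new_image_batches, lr=0.001, num_epochs=10):
    """For each batch of new images: grow the training set, then retrain."""
    for batch in new_image_batches:
        train_dataset.add_images(batch)      # step 1: add new images to the dataset
        ensemble.train_ensamble(             # step 2: retrain on the updated dataset
            train_dataset=train_dataset, lr=lr, num_epochs=num_epochs)
```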
Example of execution:
```
python ensamble_image_increment.py
```

The `EnsambleModel` class allows you to create an ensemble model, train the individual models, and make predictions using the combined weight of each model.
To create an `EnsambleModel` object, provide the following parameters:

- `models_name`: a list with the names of the models to be used in the ensemble.
- `pre_trained`: if set to `True`, loads the pre-trained weights for the models. If `False`, no pre-trained weights are loaded.
- `models_weights`: a list of weights, one per model; each weight determines that model's importance during prediction. This parameter is only used if `pre_trained=True`.
- `num_classes`: the number of classes in the dataset (default: 251).
Example of initialization:
```python
ensemble = EnsambleModel(
    models_name=['resnet', 'efficientnet', 'vgg'],
    pre_trained=True,
    models_weights=[0.3, 0.4, 0.3],
    num_classes=251
)
```

To train the models in the ensemble, use the `train_ensamble()` method. The required parameters are:
- `train_dataset`: the training dataset (an object of the `ImageDataset` class).
- `lr`: the learning rate.
- `num_epochs`: the number of training epochs (default: 10).
- `lc`: additional parameters for training management, if necessary.
Example of training:
```python
train_losses, val_losses, train_accuracies, val_accuracies = ensemble.train_ensamble(
    train_dataset=train_data,
    lr=0.001,
    num_epochs=10
)
```

This method trains the models specified in `models_name` and returns the losses and accuracies for both training and validation.
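The returned curves can be used, for example, to pick the epoch with the lowest validation loss when comparing training runs. This small helper is illustrative only and not part of the repository's API:

```python
def best_epoch(val_losses):
    """Return the 0-based index of the epoch with the lowest validation loss."""
    return min(range(len(val_losses)), key=lambda i: val_losses[i])
```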
To make predictions on the data, use the `predict()` method. The required parameters are:

- `image_dataset`: an object of the `ImageDataset` class containing the images to predict.
- `lc`: additional parameters for prediction management, if necessary.
Example of prediction:
```python
images_idx, images_label, predictions_confidences = ensemble.predict(image_dataset=test_data)
```

The method returns three lists:

- `images_idx`: the IDs of the images.
- `images_label`: the predicted labels.
- `predictions_confidences`: the probabilities associated with the predicted labels.
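As a hedged example, the first two returned lists can be compared against ground-truth labels to measure top-1 accuracy. The `ground_truth` mapping below (image ID to true class) is an assumption about how labels might be stored, not part of the `EnsambleModel` API:

```python
def top1_accuracy(images_idx, images_label, ground_truth):
    """Fraction of images whose predicted label matches the ground truth.

    ground_truth is an assumed dict mapping image ID -> true class label.
    """
    correct = sum(1 for idx, label in zip(images_idx, images_label)
                  if ground_truth.get(idx) == label)
    return correct / len(images_idx)
```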
If `pre_trained=True`, the models are loaded from the pre-trained weights stored in the `./models/trained_models/` folder. The model name must match the name of the weight file (e.g., `resnet_-1.pth` for the ResNet model).

Files ending with `-1` refer to models trained on noisy data. If you do not want to use the noisy models, simply omit these files from the pre-trained weights.
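The naming convention can be sketched as a small path helper. This is a hedged illustration: the noisy-model suffix follows the `resnet_-1.pth` example above, but the non-noisy filename and the loading logic inside `EnsambleModel` are assumptions (loading a `.pth` file would typically use the standard `torch.load`):

```python
import os

def weight_file(model_name, noisy=False, root="./models/trained_models"):
    """Build the expected weight-file path for a model.

    noisy=True appends the `_-1` suffix used for models trained on noisy
    data; the clean-model filename is an assumed convention.
    """
    suffix = "_-1" if noisy else ""
    return os.path.join(root, f"{model_name}{suffix}.pth")

# e.g. state_dict = torch.load(weight_file("resnet", noisy=True))
```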