Machine Learning Methods - Experimenting with MLPs & CNNs

Project Summary

In this project, I experimented with Multi-Layer Perceptrons (MLPs) and Convolutional Neural Networks (CNNs). The study begins with MLPs applied to a dataset of European countries and later transitions to image classification using CNNs to detect deepfake-generated images.

Multi-Layer Perceptrons (MLPs)

Dataset

The dataset consists of tables with three columns:

Longitude
Latitude
Country (encoded as an integer)

The goal is to classify a city into its corresponding country based on geographical coordinates.
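
As a rough illustration of the three-column layout described above, the sketch below parses a small inline CSV sample into feature/label pairs. The header names (long, lat, country) and the sample values are assumptions made for the example, not taken from the project files.

```python
import csv
import io

# Hypothetical sample in the three-column layout described above.
# Header names and values are assumptions made for illustration.
SAMPLE_CSV = """long,lat,country
2.35,48.85,0
13.40,52.52,1
-3.70,40.42,2
"""


def load_rows(text):
    """Parse CSV text into a list of ((longitude, latitude), label) pairs."""
    reader = csv.DictReader(io.StringIO(text))
    rows = []
    for record in reader:
        features = (float(record["long"]), float(record["lat"]))
        label = int(record["country"])
        rows.append((features, label))
    return rows


rows = load_rows(SAMPLE_CSV)
```

Each entry in rows pairs a coordinate tuple with its integer country label, which is the shape a classifier's training loop would typically consume.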

Optimizing MLP Training

Code implementation: NN_tutorial.py

Learning Rate

A high learning rate can cause the loss to oscillate, making training unstable. Conversely, a lower learning rate gives more stable training, at the cost of slower convergence. The figure below demonstrates these effects:
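
The effect of the learning rate can be reproduced on a toy problem. The sketch below runs plain gradient descent on f(w) = w**2, which is a stand-in chosen for illustration and not the project's model; the three learning rates are likewise arbitrary.

```python
def gradient_descent(lr, steps=10, w0=1.0):
    """Minimize f(w) = w**2 with fixed step size `lr`; return the visited weights."""
    w = w0
    path = [w]
    for _ in range(steps):
        grad = 2.0 * w      # derivative of w**2
        w = w - lr * grad   # each step multiplies w by (1 - 2 * lr)
        path.append(w)
    return path


small = gradient_descent(lr=0.1)      # shrinks smoothly toward 0
large = gradient_descent(lr=0.9)      # flips sign every step (oscillation)
too_large = gradient_descent(lr=1.1)  # grows in magnitude (divergence)
```

With lr=0.9 the iterate overshoots the minimum on every step and the sign alternates, which is the toy analogue of an oscillating loss curve.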

Epochs

Too few epochs prevent the model from learning patterns effectively, leaving the losses high. Too many epochs cause overfitting, which I observed starting around the 80-epoch mark.
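
One common way to pick the number of epochs is early stopping on the validation loss. The sketch below is a minimal version of that idea; the function name, the patience value, and the loss values are all made up for illustration, and this is not necessarily how the project chose its epoch count.

```python
def best_epoch(val_losses, patience=5):
    """Return the 0-based epoch with the lowest validation loss seen before
    `patience` consecutive epochs pass without improvement."""
    best_idx, best_loss = 0, float("inf")
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_idx, best_loss = epoch, loss
        elif epoch - best_idx >= patience:
            break  # validation loss has stopped improving
    return best_idx


# Made-up validation losses: improvement until epoch 3, then a steady rise.
val_losses = [1.0, 0.7, 0.5, 0.4, 0.42, 0.45, 0.5, 0.6, 0.3]
stop_at = best_epoch(val_losses, patience=3)
```

Note that the scan stops before reaching the final value of 0.3, which shows the trade-off of a small patience: later improvements can be missed.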

Batch Normalization

Batch normalization stabilizes training, leading to faster convergence compared to models without it.
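
For reference, the core computation of batch normalization on a single feature can be sketched in a few lines. This is a plain-Python illustration of the training-time forward pass only, with gamma and beta fixed rather than learned; it is not the layer used in the project code.

```python
import math


def batch_norm(batch, eps=1e-5, gamma=1.0, beta=0.0):
    """Normalize one feature across a batch to zero mean and unit variance,
    then apply the scale `gamma` and shift `beta`."""
    mean = sum(batch) / len(batch)
    # Biased (population) variance over the batch.
    var = sum((x - mean) ** 2 for x in batch) / len(batch)
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]


activations = [10.0, 12.0, 14.0, 16.0]
normalized = batch_norm(activations)
```

Whatever the scale of the incoming activations, the next layer sees values centered at zero with roughly unit variance, which is one reason training tends to be less sensitive to initialization and learning rate.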

Batch Size

Larger batch sizes achieved higher test accuracy.
Smaller batch sizes made each epoch slower, since they require more parameter updates per epoch.
Larger batch sizes produced smoother loss curves than smaller ones.
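
The link between batch size and time per epoch comes from the number of parameter updates each epoch requires. The sketch below counts them for a placeholder dataset; the dataset size of 1000 and the batch sizes are assumptions chosen for illustration.

```python
def iterate_minibatches(samples, batch_size):
    """Yield consecutive slices of `samples` with at most `batch_size` items."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]


samples = list(range(1000))  # placeholder dataset of 1000 items (assumed size)

# Number of optimizer steps one epoch would take for each batch size.
updates_per_epoch = {
    bs: sum(1 for _ in iterate_minibatches(samples, bs))
    for bs in (1, 16, 128, 1024)
}
```

A batch size of 1 needs 1000 updates per epoch while a batch size of 128 needs only 8, and each larger batch averages the gradient over more samples, which is consistent with the smoother loss curves noted above.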

Evaluating MLP Performance

Code implementation: main.py

In this part of the project, I trained classifiers for the following (depth, width) configurations: {(1, 16), (2, 16), (6, 16), (10, 16), (6, 8), (6, 32), (6, 64)}, and optimized each model by tuning its hyperparameters.
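
To get a feel for how model size varies across these configurations, the sketch below counts weights and biases for each (depth, width) pair. It assumes that depth means the number of hidden layers, that the input has 2 features (longitude and latitude), and that there are 5 output classes; the class count is a placeholder, since the number of countries is not stated here.

```python
def mlp_param_count(depth, width, in_dim=2, n_classes=5):
    """Count weights and biases of a fully connected network with `depth`
    hidden layers of `width` units each.

    n_classes=5 is a placeholder, not the real number of countries.
    """
    sizes = [in_dim] + [width] * depth + [n_classes]
    # Each layer has fan_in * fan_out weights plus fan_out biases.
    return sum(a * b + b for a, b in zip(sizes, sizes[1:]))


configs = [(1, 16), (2, 16), (6, 16), (10, 16), (6, 8), (6, 32), (6, 64)]
counts = {cfg: mlp_param_count(*cfg) for cfg in configs}
```

Under these assumptions, widening a 6-layer network from 16 to 64 units increases the parameter count far more than deepening a 16-unit network from 6 to 10 layers, because the hidden-to-hidden weights grow quadratically with width.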

Best model: Achieved validation accuracy ≈ 0.75, with training, validation, and test losses stabilizing around 0.25.

Worst model: Reached validation accuracy ≈ 0.50, with losses converging to a higher value (≈ 1.0).

Depth vs. Accuracy

Optimal depth: 2 hidden layers
Adding more layers gave diminishing returns, which I attribute to the vanishing gradient problem.

Width vs. Accuracy

Optimal width: about 30 neurons per hidden layer (the closest width among those tested is 32)
Too few or too many neurons reduce accuracy.

Monitoring Gradients

Without batch normalization: Vanishing gradients in the early layers.

With batch normalization: Gradients explode, requiring careful tuning.
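
The vanishing-gradient behaviour can be illustrated with a toy chain of scalar units. The sketch below backpropagates by hand through a stack of sigmoid units and records the gradient magnitude at each layer. The sigmoid activation, the weight of 1.0, and the input value are assumptions chosen to make the effect visible; the project's actual activation function is not stated here.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer_gradients(depth, weight=1.0, x=0.5):
    """Return |d output / d pre-activation| for each layer of a chain of
    scalar sigmoid units, ordered from the first layer to the last."""
    # Forward pass: store each layer's activation.
    activations = []
    a = x
    for _ in range(depth):
        a = sigmoid(weight * a)
        activations.append(a)
    # Backward pass: multiply local derivatives from the output back to the input.
    grads = [0.0] * depth
    upstream = 1.0  # d output / d (last activation)
    for i in reversed(range(depth)):
        local = activations[i] * (1.0 - activations[i])  # sigmoid'(z) = a * (1 - a)
        grads[i] = abs(upstream * local)
        upstream = upstream * local * weight
    return grads


grads = layer_gradients(depth=10)
```

Because each sigmoid derivative is at most 0.25, the gradient reaching the first layer shrinks geometrically with depth. Logging per-layer gradient norms during training is the practical counterpart of this calculation.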

Implicit Representation

When applying deep learning to low-dimensional data with a sequential or spatial structure (e.g. time or location), it is common to first create an implicit representation of the data. This simply means passing the data through a series of sines and cosines. These functions are very useful, but a neural network may need several layers to approximate them on its own. By applying them ourselves, we leave the network free to focus on the more complex patterns that we cannot specify by hand.

I implemented implicit-representation preprocessing that passes the input through 10 sine functions, and trained a network of depth 6 and width 16 on top of these representations. Transforming the input coordinates this way improved the decision boundaries, allowing the model to learn more complex patterns.
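
A minimal sketch of this kind of preprocessing is shown below. It maps each coordinate through 10 sine functions of different frequencies. The frequency grid (0.1, 0.2, ..., 1.0) and the function name are assumptions for illustration; the frequencies used in the project are not stated here.

```python
import math


def sine_features(point, n_frequencies=10, step=0.1):
    """Map (longitude, latitude) to sin(alpha * coordinate) for several alphas.

    The frequency grid alpha = step, 2 * step, ... is an assumption.
    """
    alphas = [step * (k + 1) for k in range(n_frequencies)]
    return [math.sin(alpha * coord) for coord in point for alpha in alphas]


features = sine_features((2.35, 48.85))
```

The two raw coordinates become a 20-dimensional vector bounded in [-1, 1], which would then be fed to the MLP in place of the original input.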

Convolutional Neural Networks (CNNs)

Code implementation: cnn.py

Task: Deepfake Image Classification

Using CNNs, I tackled binary classification to distinguish real human faces from deepfake-generated images. The dataset is available here.

Model Comparisons

XGBoost (Baseline)

Code implementation: xg.py

Accuracy: 73.5%

Training from Scratch (ResNet18)

Epoch 1 Loss: 0.7758
Validation Accuracy: 52.5%
Test Accuracy: 52.25%
Learning Rate: 0.01

Linear Probing (Pretrained ResNet18)

Epoch 1 Loss: 0.6795
Validation Accuracy: 69.5%
Test Accuracy: 72.5%
Learning Rate: 0.01

Fine-Tuning (Pretrained ResNet18)

Epoch 1 Loss: 0.6244
Validation Accuracy: 74.0%
Test Accuracy: 77.5%
Learning Rate: 0.001

Best vs. Worst Model Comparison

Best model: Fine-tuned ResNet18 (Test accuracy = 77.5%)
Worst model: Training from scratch (Test accuracy = 52.25%)

Sample Analysis

Code implementation: five_samples.py

Five images correctly classified by the fine-tuned model but misclassified by the model trained from scratch:
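
Selecting such samples amounts to comparing two prediction lists against the ground truth. The sketch below shows one way to do it; the function name, the label encoding, and the prediction values are made up for illustration and are not taken from five_samples.py.

```python
def disagreement_indices(labels, preds_good, preds_bad, limit=5):
    """Return up to `limit` indices where the first model is right and the
    second model is wrong."""
    picked = []
    for i, (y, good, bad) in enumerate(zip(labels, preds_good, preds_bad)):
        if good == y and bad != y:
            picked.append(i)
            if len(picked) == limit:
                break
    return picked


# Made-up labels and predictions (1 = fake, 0 = real; this encoding is assumed).
labels = [0, 1, 1, 0, 1, 0, 1, 1]
fine_tuned = [0, 1, 1, 0, 0, 0, 1, 1]
scratch = [1, 1, 0, 1, 1, 1, 0, 0]
picked = disagreement_indices(labels, fine_tuned, scratch)
```

The returned indices can then be used to look up and display the corresponding images.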

Results & Insights

Key Findings

MLPs work well for structured tabular data, but require careful tuning of depth, width, and batch normalization.
CNNs are highly effective for image classification, especially with transfer learning (e.g., fine-tuning ResNet18).
Fine-tuning a pretrained model significantly outperforms training from scratch.
Implicit representation techniques enhance MLPs' ability to capture complex patterns.

The full report can be found in ML_Methods_Ex4_Report.pdf.
