yfrist96/ML_Methods_Project

Machine Learning Methods - Experimenting with MLPs & CNNs

Project Summary

In this project, I experimented with Multi-Layer Perceptrons (MLPs) and Convolutional Neural Networks (CNNs). The study begins with MLPs applied to a dataset of European countries and later transitions to image classification using CNNs to detect deepfake-generated images.


Multi-Layer Perceptrons (MLPs)

Dataset

The dataset consists of tables with three columns:

  • Longitude
  • Latitude
  • Country (encoded as an integer)

The goal is to classify a city into its corresponding country based on geographical coordinates.

Optimizing MLP Training

Code implementation: NN_tutorial.py

Learning Rate

A high learning rate can make the loss oscillate, which destabilizes training. A lower learning rate gives smoother, more stable training, at the cost of slower convergence. The figure below demonstrates these effects:

Learning Rate Effects
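
The oscillation can be reproduced on a toy problem. The sketch below is not taken from NN_tutorial.py; it runs plain gradient descent on the one-dimensional loss f(w) = w², and the function name `descend` and the learning rates are chosen only for illustration.

```python
def descend(lr, steps=5, w0=1.0):
    """Run gradient descent on f(w) = w**2 and return every visited w."""
    ws = [w0]
    for _ in range(steps):
        grad = 2.0 * ws[-1]            # derivative of w**2
        ws.append(ws[-1] - lr * grad)  # one gradient step
    return ws

# lr = 0.9: each step multiplies w by (1 - 1.8) = -0.8, so w jumps
# from one side of the minimum to the other (oscillation).
print(descend(lr=0.9))

# lr = 0.1: each step multiplies w by 0.8, so w shrinks smoothly.
print(descend(lr=0.1))

# lr = 1.1: each step multiplies w by -1.2, so the iterates diverge.
print(descend(lr=1.1))
```

On this toy loss, any learning rate above 0.5 makes the sign of w flip at every step, and any rate above 1.0 makes the iterates grow instead of shrink.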

Epochs

With too few epochs the model underfits and the losses stay high. With too many epochs it overfits; in these experiments overfitting was observed around the 80-epoch mark.

Epochs Effect
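
A common way to pick the number of epochs automatically is early stopping on the validation loss. The sketch below is an illustration under that assumption and is not claimed to be part of NN_tutorial.py; the function name `best_epoch`, the `patience` parameter, and the loss values are hypothetical.

```python
def best_epoch(val_losses, patience=3):
    """Return the index of the epoch with the lowest validation loss,
    stopping once `patience` epochs pass without an improvement."""
    best_idx = 0
    best_loss = float("inf")
    for i, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_idx = loss, i
        elif i - best_idx >= patience:
            break  # validation loss has stopped improving
    return best_idx

# Hypothetical validation losses: improving, then rising (overfitting).
losses = [1.0, 0.8, 0.6, 0.65, 0.7, 0.75, 0.5]
print(best_epoch(losses, patience=3))  # 2
```

With a patience of 3 the loop stops at epoch 5 and never sees the late improvement at epoch 6, which is the usual trade-off when choosing the patience value.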

Batch Normalization

Batch normalization stabilizes training, leading to faster convergence compared to models without it.

Batch Normalization
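
For reference, the core of batch normalization is to rescale each feature to zero mean and unit variance over the batch, then apply a learnable scale and shift. The sketch below shows that computation for a single feature in plain Python; it is a simplified illustration (training-mode statistics only, no running averages), and the parameter names `gamma`, `beta`, and `eps` follow common convention rather than the project's code.

```python
import math

def batch_norm(values, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature across a batch, then scale and shift."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n  # biased batch variance
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in values]

# Raw activations on very different scales end up centered around zero.
print(batch_norm([10.0, 20.0, 30.0]))
```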

Batch Size

  • Larger batch sizes achieve higher test accuracy.
  • Smaller batch sizes result in slower training speeds per epoch.
  • Stability: Larger batch sizes show smoother loss curves compared to smaller ones.

Batch Size Effect
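
One reason smaller batches are slower per epoch is that each epoch needs more parameter updates. A minimal sketch of that arithmetic follows; the dataset size of 1000 is a made-up number for illustration, not the size of the project's dataset.

```python
import math

def updates_per_epoch(n_samples, batch_size):
    """Number of optimizer steps needed to see every sample once."""
    return math.ceil(n_samples / batch_size)

# Hypothetical dataset of 1000 samples.
for batch_size in (1, 32, 256, 1024):
    print(batch_size, updates_per_epoch(1000, batch_size))
```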

Evaluating MLP Performance

Code implementation: main.py

In this part of the project, I trained classifiers with the following (depth, width) configurations: (1, 16), (2, 16), (6, 16), (10, 16), (6, 8), (6, 32), (6, 64). Each model was optimized by tuning its hyperparameters.
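
To get a feel for how model size varies across these configurations, the sketch below counts the trainable parameters of a fully connected network. It assumes that depth means the number of hidden layers, that the input is the two coordinates, and that every layer has a bias; the number of output classes (`n_classes=5` here) is a placeholder, since the number of countries in the dataset is not stated.

```python
def mlp_layer_sizes(depth, width, n_classes, n_inputs=2):
    """Layer sizes for an MLP with `depth` hidden layers of `width` units."""
    return [n_inputs] + [width] * depth + [n_classes]

def count_parameters(sizes):
    """Weights plus biases of every fully connected layer."""
    return sum((fan_in + 1) * fan_out for fan_in, fan_out in zip(sizes, sizes[1:]))

configs = [(1, 16), (2, 16), (6, 16), (10, 16), (6, 8), (6, 32), (6, 64)]
for depth, width in configs:
    sizes = mlp_layer_sizes(depth, width, n_classes=5)  # 5 classes is hypothetical
    print((depth, width), count_parameters(sizes))
```

Under these assumptions, the parameter count grows linearly with depth but roughly quadratically with width.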

  • Best model: Achieved validation accuracy ≈ 0.75, with training, validation, and test losses stabilizing around 0.25.

  • Worst model: Showed validation accuracy ≈ 0.50, with losses converging to a higher value (≈ 1.0).

Depth vs. Accuracy

  • Optimal depth: 2 hidden layers
  • More layers lead to diminishing returns due to the vanishing gradient problem.

Depth vs Accuracy
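
The vanishing gradient effect can be illustrated with a toy chain of scalar units. The sketch below assumes sigmoid activations and a single weight of 1.0 per layer purely for illustration; the activation function used in main.py is not stated here, and other activations (such as ReLU) behave differently.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def chain_gradients(depth, weight=1.0, x=0.5):
    """Gradient of the final output with respect to each layer's input
    in a chain of `depth` scalar sigmoid units."""
    # Forward pass: store the activation entering each layer.
    acts = [x]
    for _ in range(depth):
        acts.append(sigmoid(weight * acts[-1]))
    # Backward pass: each layer multiplies the gradient by
    # sigmoid'(z) * weight = a * (1 - a) * weight, which is at most 0.25 * weight.
    grads = [1.0]
    for a in reversed(acts[1:]):
        grads.append(grads[-1] * a * (1.0 - a) * weight)
    grads.reverse()
    return grads  # grads[0] is the gradient at the network input

for k, g in enumerate(chain_gradients(depth=6)):
    print(k, g)
```

Because every factor is at most 0.25 in this setting, the gradient reaching the first layer of a 6-layer chain is bounded by 0.25⁶, which is why early layers learn slowly in deep networks of this kind.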

Width vs. Accuracy

  • Optimal width: 30 neurons per hidden layer
  • Too few or too many neurons reduce accuracy.

Width vs Accuracy

Monitoring Gradients

  • Without batch normalization: Vanishing gradients in the early layers.

Gradient vs Epoch

  • With batch normalization: Gradients explode, requiring careful tuning.

Gradient vs Epoch with BatchNorm

Implicit Representation

When applying deep learning to low-dimensional data with a sequential pattern (e.g. time or location), it is common to first create an implicit representation of the data. This simply means passing the data through a series of sines and cosines. These functions are very useful, but a network may need several layers to implement them by itself. By doing this step ourselves, we leave the network free to focus on the more complex patterns that we cannot recognize and want it to learn by itself.

I added implicit-representation pre-processing to the data, passing the input through 10 sine functions, and trained a network with depth 6 and width 16 on top of these representations. Transforming the input coordinates in this way improves the decision boundaries, allowing the model to learn more complex patterns.
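
A minimal sketch of such a pre-processing step is shown below. The text specifies 10 sine functions but not their frequencies, so the integer frequencies 1 to 10 and the function name `sine_features` are assumptions made for illustration; the actual encoding in the project may differ.

```python
import math

def sine_features(lon, lat, n_freqs=10):
    """Map a (longitude, latitude) pair to sines at n_freqs frequencies.

    Assumed frequencies: k = 1, 2, ..., n_freqs (not taken from the project).
    """
    feats = []
    for k in range(1, n_freqs + 1):
        feats.append(math.sin(k * lon))
        feats.append(math.sin(k * lat))
    return feats

# Each 2-D coordinate becomes a 20-dimensional input for the MLP.
encoded = sine_features(0.35, 0.82)
print(len(encoded))  # 20
```

The network's first layer then takes 2 × n_freqs inputs instead of 2, which lets even a shallow model represent boundaries that vary quickly with position.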

Decision Boundaries


Convolutional Neural Networks (CNNs)

Code implementation: cnn.py

Task: Deepfake Image Classification

Using CNNs, I tackled binary classification to distinguish real human faces from deepfake-generated images. The dataset is available here.

Model Comparisons

XGBoost (Baseline)

Code implementation: xg.py

  • Accuracy: 73.5%

Training from Scratch (ResNet18)

  • Epoch 1 Loss: 0.7758
  • Validation Accuracy: 52.5%
  • Test Accuracy: 52.25%
  • Learning Rate: 0.01

Linear Probing (Pretrained ResNet18)

  • Epoch 1 Loss: 0.6795
  • Validation Accuracy: 69.5%
  • Test Accuracy: 72.5%
  • Learning Rate: 0.01

Fine-Tuning (Pretrained ResNet18)

  • Epoch 1 Loss: 0.6244
  • Validation Accuracy: 74.0%
  • Test Accuracy: 77.5%
  • Learning Rate: 0.001

Best vs. Worst Model Comparison

  • Best model: Fine-tuned ResNet18 (Test accuracy = 77.5%)
  • Worst model: Training from scratch (Test accuracy = 52.25%)

Sample Analysis

Code implementation: five_samples.py

Five images correctly classified by the fine-tuned model but misclassified by the model trained from scratch:

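
Selecting such samples comes down to comparing the two models' predictions against the true labels. The sketch below shows one way to do it; it is not the code from five_samples.py, and the function name `disagreement_indices` and the example labels and predictions are hypothetical.

```python
def disagreement_indices(labels, preds_good, preds_bad, k=5):
    """Indices of the first k samples that the first model classifies
    correctly and the second model misclassifies."""
    picked = []
    for i, (y, good, bad) in enumerate(zip(labels, preds_good, preds_bad)):
        if good == y and bad != y:
            picked.append(i)
            if len(picked) == k:
                break
    return picked

# Hypothetical labels (0 = real, 1 = fake) and model predictions.
labels     = [0, 1, 1, 0, 1, 0, 1, 1]
fine_tuned = [0, 1, 1, 0, 1, 0, 1, 0]
scratch    = [1, 1, 0, 1, 0, 1, 0, 0]
print(disagreement_indices(labels, fine_tuned, scratch))  # [0, 2, 3, 4, 5]
```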


Results & Insights

Key Findings

  • MLPs work well for structured tabular data, but require careful tuning of depth, width, and batch normalization.
  • CNNs are highly effective for image classification, especially with transfer learning (e.g., fine-tuning ResNet18).
  • Fine-tuning a pretrained model significantly outperforms training from scratch.
  • Implicit representation techniques enhance MLPs' ability to capture complex patterns.

The full report can be found in ML_Methods_Ex4_Report.pdf.

