In this project, I experimented with Multi-Layer Perceptrons (MLPs) and Convolutional Neural Networks (CNNs). The study begins with MLPs applied to a dataset of European countries and later transitions to image classification using CNNs to detect deepfake-generated images.
The dataset consists of tables with three columns:
- Longitude
- Latitude
- Country (encoded as an integer)
The goal is to classify a city into its corresponding country based on geographical coordinates.
Code implementation: NN_tutorial.py
A high learning rate can lead to oscillations in loss, making training unstable. Conversely, a lower learning rate ensures more stable learning. The figure below demonstrates these effects:
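The overshooting behind this instability can be reproduced on a toy problem. The sketch below is not taken from NN_tutorial.py: it assumes plain gradient descent on the one-dimensional function f(w) = w², and the function name and the learning rates are chosen purely for illustration.

```python
def gradient_descent_path(lr, steps=10, w0=1.0):
    """Run gradient descent on f(w) = w**2 and return w after each step."""
    w = w0
    path = []
    for _ in range(steps):
        grad = 2.0 * w      # derivative of w**2
        w -= lr * grad      # standard gradient descent update
        path.append(w)
    return path


small = gradient_descent_path(lr=0.1)      # approaches 0 from one side
large = gradient_descent_path(lr=0.9)      # jumps across the minimum every step
diverging = gradient_descent_path(lr=1.1)  # each jump lands farther from 0
```

With the small rate the iterate shrinks smoothly toward the minimum. With the larger rates every update overshoots and lands on the opposite side, which on a real, noisy loss surface can show up as an oscillating loss curve.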
With too few epochs, the model cannot learn the patterns in the data, which leaves the loss high. With too many, it overfits; in these experiments overfitting set in around the 80-epoch mark.
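A common way to choose the number of epochs automatically is early stopping on the validation loss. This technique is not described in the report; the sketch below is a hypothetical helper with made-up loss values, shown only to illustrate the trade-off.

```python
def early_stopping_epoch(val_losses, patience=5):
    """Return (best_epoch, best_loss), stopping once the validation loss
    has not improved for `patience` consecutive epochs."""
    best_loss = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` epochs: stop training
    return best_epoch, best_loss


# Illustrative values only: the loss improves, then starts to climb.
losses = [1.0, 0.8, 0.6, 0.5, 0.55, 0.6, 0.7, 0.4]
best_epoch, best_loss = early_stopping_epoch(losses, patience=3)
```

Training halts after three epochs without improvement, so the model from epoch 3 would be kept and the later epochs are never run.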
Batch normalization stabilizes training, leading to faster convergence compared to models without it.
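As a reminder of what the layer computes, here is a minimal pure-Python sketch of the training-time batch normalization transform. It is a simplification and not the project's code: it uses a single scalar `gamma` and `beta` instead of learned per-feature parameters, and it keeps no running statistics for inference.

```python
import math


def batch_norm(batch, eps=1e-5, gamma=1.0, beta=0.0):
    """Normalize each feature column of `batch` (a list of equal-length rows)
    to zero mean and unit variance, then scale by gamma and shift by beta."""
    n = len(batch)
    n_features = len(batch[0])
    means = [sum(row[j] for row in batch) / n for j in range(n_features)]
    variances = [
        sum((row[j] - means[j]) ** 2 for row in batch) / n
        for j in range(n_features)
    ]
    return [
        [
            gamma * (row[j] - means[j]) / math.sqrt(variances[j] + eps) + beta
            for j in range(n_features)
        ]
        for row in batch
    ]


# Made-up (longitude, latitude) pairs on different scales.
normalized = batch_norm([[10.0, 50.0], [12.0, 40.0], [14.0, 60.0]])
```

After the transform both columns have mean 0 and variance close to 1, so the following layers see inputs on a comparable scale.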
- Accuracy: Larger batch sizes achieve higher test accuracy.
- Speed: Smaller batch sizes take longer to train per epoch.
- Stability: Larger batch sizes produce smoother loss curves than smaller ones.
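One likely contributor to the slower epochs is simply the number of parameter updates: a smaller batch size means more batches, and therefore more forward and backward passes, per epoch. The sketch below illustrates this with hypothetical helper names and a hypothetical dataset size of 1000 samples.

```python
import math


def updates_per_epoch(n_samples, batch_size):
    """Number of mini-batches (and parameter updates) in one epoch."""
    return math.ceil(n_samples / batch_size)


def iterate_minibatches(samples, batch_size):
    """Yield consecutive mini-batches; the last one may be smaller."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]


# Hypothetical dataset size, used only for the comparison.
small_batch_updates = updates_per_epoch(1000, 32)
large_batch_updates = updates_per_epoch(1000, 256)
```

With these numbers, a batch size of 32 needs eight times as many updates per epoch as a batch size of 256.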
Code implementation: main.py
In this part of the project, I trained classifiers for the following (depth, width) pairs: {(1, 16), (2, 16), (6, 16), (10, 16), (6, 8), (6, 32), (6, 64)}, and optimized each model by tuning its hyperparameters.
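To get a feel for how much capacity each configuration has, the sketch below counts the trainable parameters of a fully connected network. It makes several assumptions: depth means the number of hidden layers, the input is the two coordinates, and `n_classes=5` is a placeholder because the number of countries is not stated here. Batch normalization parameters are not counted.

```python
def mlp_param_count(depth, width, in_dim=2, n_classes=5):
    """Weights plus biases of an MLP with `depth` hidden layers of size `width`.
    n_classes=5 is a placeholder, not the real number of countries."""
    first = (in_dim + 1) * width                # input -> first hidden layer
    hidden = (depth - 1) * (width + 1) * width  # hidden -> hidden layers
    last = (width + 1) * n_classes              # last hidden -> output layer
    return first + hidden + last


configs = [(1, 16), (2, 16), (6, 16), (10, 16), (6, 8), (6, 32), (6, 64)]
counts = {cfg: mlp_param_count(*cfg) for cfg in configs}
```

Under these assumptions the count grows linearly with depth but roughly quadratically with width, so the (6, 64) network is far larger than any of the width-16 ones.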
- Best model: Achieved validation accuracy ≈ 0.75, with training, validation, and test losses stabilizing around 0.25.
- Worst model: Showed validation accuracy ≈ 0.50, with higher loss convergence (≈ 1.0).
- Optimal depth: 2 hidden layers
- More layers lead to diminishing returns due to the vanishing gradient problem.
- Optimal width: 30 neurons per hidden layer
- Too few or too many neurons reduce accuracy.
- Without batch normalization: Vanishing gradients in the early layers.
- With batch normalization: Gradients explode, requiring careful tuning.
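The vanishing-gradient effect can be seen in a toy chain of scalar layers. The sketch below assumes sigmoid activations, which the text above does not state; they are used only because the sigmoid derivative never exceeds 0.25, which makes the shrinkage easy to follow.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def input_gradient(depth, weight=1.0, x=0.0):
    """Derivative of the output with respect to the input for a chain of
    `depth` scalar layers a <- sigmoid(weight * a), via the chain rule."""
    grad = 1.0
    a = x
    for _ in range(depth):
        a = sigmoid(weight * a)
        grad *= weight * a * (1.0 - a)  # local derivative of this layer
    return grad


gradients = {depth: input_gradient(depth) for depth in (1, 2, 6, 10)}
```

Each layer multiplies the gradient by a factor of at most 0.25 here, so the signal reaching the first layer shrinks exponentially with depth. Factors larger than 1 at every layer would instead make the same product blow up.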
When applying deep learning to low-dimensional data with a sequential pattern (e.g. time or location), it is common to first create an implicit representation of the data. This simply means passing the data through a series of sines and cosines. A network may need several layers to approximate these functions on its own, yet they are very useful features. By computing them ourselves, we leave the network free to focus on the more complex patterns that we cannot recognize and want it to learn by itself. I implemented this as a pre-processing step, passing the input through 10 sine functions, and trained an NN with depth 6 and width 16 on top of these representations. Transforming the input coordinates with sine and cosine functions improves the decision boundaries, allowing the model to learn more complex patterns.
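A minimal sketch of such a pre-processing step is shown below. The frequencies (0.1·k for k = 1..10) and the choice to apply all 10 sines to each coordinate separately are assumptions made for illustration; the values actually used in the project are not given here.

```python
import math


def sine_features(lon, lat, n_freqs=10, base_freq=0.1):
    """Map (lon, lat) to 2 * n_freqs features sin(base_freq * k * coord).
    base_freq and the per-coordinate layout are illustrative assumptions."""
    features = []
    for coord in (lon, lat):
        for k in range(1, n_freqs + 1):
            features.append(math.sin(base_freq * k * coord))
    return features


# Example coordinates; the network would take these 20 values as input
# in place of the raw (lon, lat) pair.
encoded = sine_features(2.35, 48.85)
```

All features lie in [-1, 1], and the higher frequencies vary quickly with position, which gives the first layer ready-made oscillating inputs to combine.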
Code implementation: cnn.py
Using CNNs, I tackled binary classification to distinguish real human faces from deepfake-generated images. The dataset is available here.
Code implementation: xg.py
- Accuracy: 73.5%
- Epoch 1 Loss: 0.7758
- Validation Accuracy: 52.5%
- Test Accuracy: 52.25%
- Learning Rate: 0.01
- Epoch 1 Loss: 0.6795
- Validation Accuracy: 69.5%
- Test Accuracy: 72.5%
- Learning Rate: 0.01
- Epoch 1 Loss: 0.6244
- Validation Accuracy: 74.0%
- Test Accuracy: 77.5%
- Learning Rate: 0.001
- Best model: Fine-tuned ResNet18 (Test accuracy = 77.5%)
- Worst model: Training from scratch (Test accuracy = 52.25%)
Code implementation: five_samples.py
Five images correctly classified by the Fine-tuned model but misclassified by the Training from scratch model:
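Selecting such images amounts to comparing both models' predictions with the ground-truth labels. The helper below is a hypothetical sketch, not the contents of five_samples.py, and the labels and predictions are made-up values.

```python
def disagreement_indices(labels, preds_good, preds_bad, k=5):
    """Indices of the first k samples that the first model classifies
    correctly and the second model misclassifies."""
    picked = []
    for i, (y, good, bad) in enumerate(zip(labels, preds_good, preds_bad)):
        if good == y and bad != y:
            picked.append(i)
            if len(picked) == k:
                break
    return picked


# Made-up labels and predictions (1 = fake, 0 = real), for illustration only.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
fine_tuned = [1, 0, 1, 0, 0, 0, 1, 0]
from_scratch = [0, 0, 0, 0, 1, 1, 1, 1]
indices = disagreement_indices(labels, fine_tuned, from_scratch)
```

The returned indices can then be used to load and plot the corresponding images from the test set.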
- MLPs work well for structured tabular data, but require careful tuning of depth, width, and batch normalization.
- CNNs are highly effective for image classification, especially with transfer learning (e.g., fine-tuning ResNet18).
- Fine-tuning a pretrained model significantly outperforms training from scratch.
- Implicit representation techniques enhance MLPs' ability to capture complex patterns.
The full report can be found in ML_Methods_Ex4_Report.pdf.



















