This repository documents my work and key learnings from the NVIDIA Deep Learning Institute's "Fundamentals of Deep Learning" course. The main project involved building, training, and fine-tuning a Convolutional Neural Network (CNN) to classify images of fruit as either fresh or rotten.
The model was successfully trained to identify 6 distinct categories (fresh apple, rotten apple, fresh banana, etc.) using a dataset from Kaggle.
This project provided hands-on experience with several core deep learning concepts:
- Transfer Learning: I used a pre-trained model and initially froze its base layers. This technique leverages powerful, generalized features (like edges, shapes, and colors) learned from the massive ImageNet dataset. By training only the final classification layer (the "head"), high accuracy can be achieved with less data and lower computational cost.
- Data Augmentation: To prevent overfitting and create a more robust model, the training data was augmented with random transformations. This process artificially expands the dataset by creating modified versions of the images, teaching the model to generalize better.
- Model Fine-Tuning: After the initial training, the entire model was "unfrozen," and training continued with a lower learning rate. This step fine-tunes the pre-trained layers, allowing them to adapt more specifically to the features of our fruit dataset, which ultimately boosted the model's accuracy.
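The freeze-then-fine-tune workflow described above can be sketched as follows. Note this uses a small stand-in network rather than the actual pre-trained model from the course, and the learning rates are illustrative, not the ones used in the project:

```python
import torch
import torch.nn as nn

# Stand-in for a pre-trained backbone (in practice, e.g. a torchvision model
# with ImageNet weights).
base_model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)

# Phase 1: freeze the base so only the new classification head is trained.
for param in base_model.parameters():
    param.requires_grad = False

head = nn.Linear(8, 6)  # 6 output classes: fresh/rotten for each fruit
model = nn.Sequential(base_model, head)

# Only pass the trainable (head) parameters to the optimizer.
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3
)

# ... train the head until validation accuracy plateaus ...

# Phase 2: unfreeze everything and continue with a much lower learning rate,
# so the pre-trained features adapt gently to the fruit dataset.
for param in model.parameters():
    param.requires_grad = True
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)
```

Using a lower learning rate in the second phase is what keeps fine-tuning from destroying the useful ImageNet features while still letting them specialize.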
The following transformations were applied to the training images to diversify the dataset:
```python
random_trans = transforms.Compose([
    # Resize all images to a consistent input size
    transforms.Resize((IMG_WIDTH, IMG_HEIGHT)),
    # Apply random transformations for augmentation
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomRotation(15),
    transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2),
    # Convert image to a tensor and normalize
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]
    )  # Standard ImageNet normalization
])
```

The model was then trained and validated for a fixed number of epochs:

```python
epochs = 10
for epoch in range(epochs):
    print('Epoch: {}'.format(epoch))
    utils.train(my_model, train_loader, train_N, random_trans, optimizer, loss_function)
    utils.validate(my_model, valid_loader, valid_N, loss_function)
```

Training hyperparameters:

- Epochs: 10
- Batch Size: 64
The final model performed exceptionally well, surpassing the assessment's requirements.
- Accuracy Required: > 0.92
- Final Model Accuracy: 0.9514