A simple PyTorch project that trains a neural network to classify Iris flower species.
- Data normalization (0–1)
- Label encoding
- Neural network from scratch
- Training loop
- Model saving and loading
- Simple prediction usage
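The normalization and label-encoding steps above can be sketched as follows; this is a minimal illustration assuming features arrive as a float tensor of shape `(n_samples, 4)`, and the function names are placeholders, not the project's actual code:

```python
import torch

def normalize_01(x: torch.Tensor) -> torch.Tensor:
    # Scale each feature column independently into the 0-1 range.
    x_min = x.min(dim=0).values
    x_max = x.max(dim=0).values
    return (x - x_min) / (x_max - x_min)

# Label encoding following the mapping listed in this README.
LABELS = {"setosa": 1, "versicolor": 2, "virginica": 0}

def encode(names):
    # Turn species names into integer class labels for CrossEntropyLoss.
    return torch.tensor([LABELS[n] for n in names])
```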
- Input features: 4
- Hidden layers: 2 (8 neurons each, ReLU)
- Output classes: 3
- Loss: CrossEntropyLoss
- Optimizer: Adam
Iris flower dataset (150 samples, 3 classes)
Label mapping:
- setosa → 1
- versicolor → 2
- virginica → 0
Input (4)
↓
Linear(4 → 8) + ReLU
↓
Linear(8 → 8) + ReLU
↓
Linear(8 → 3)
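The diagram above maps directly onto a small module. A minimal sketch (layer sizes, loss, and optimizer follow the spec above; the class name `IrisNet` is an assumption):

```python
import torch
import torch.nn as nn

class IrisNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(4, 8), nn.ReLU(),
            nn.Linear(8, 8), nn.ReLU(),
            nn.Linear(8, 3),  # raw logits; CrossEntropyLoss applies log-softmax internally
        )

    def forward(self, x):
        return self.net(x)

model = IrisNet()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters())
```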
NEXT PROJECTS: PROJECT ROADMAP (Learning-Oriented)
---
Wine Quality Prediction Dataset: https://archive.ics.uci.edu/ml/datasets/wine+quality
Type of ML: Supervised Learning → Regression & Classification
What the data is: Chemical measurements of wine (acidity, sugar, pH, alcohol, etc.) with a quality score (0–10) given by human tasters.
What I will learn:
- Difference between regression (predict exact quality) and classification (good vs bad wine)
- How numeric features affect predictions
- Loss functions like MSE and MAE
- Why scaling matters for real-world data
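The regression-vs-classification split above can be seen by treating the same quality scores two ways; this is a sketch with made-up numbers, not values from the dataset:

```python
import torch
import torch.nn as nn

preds = torch.tensor([5.2, 6.8, 4.9])    # model's predicted quality (invented)
targets = torch.tensor([5.0, 7.0, 5.0])  # human-rated quality (invented)

# Regression view: predict the exact score.
mse = nn.MSELoss()(preds, targets)  # penalizes large errors quadratically
mae = nn.L1Loss()(preds, targets)   # penalizes all errors linearly

# Classification view: threshold the score into "good" (>= 6) vs "bad".
good = (targets >= 6).long()
```

The quadratic penalty makes MSE much more sensitive to outliers than MAE, which is one practical reason to know both.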
Why this project matters: It teaches how ML models learn patterns from raw measurements rather than from labels that are obvious to humans.
Complexity: ★★☆☆☆
---
Titanic Survival Prediction Dataset: https://www.kaggle.com/competitions/titanic/data
Type of ML: Supervised Learning → Binary Classification
What the data is: Passenger information (age, sex, class, fare, family size) with a survival label (survived / not survived).
What I will learn:
- Cleaning messy, real-world data
- Handling missing values
- Encoding text data into numbers
- Evaluating models using accuracy & confusion matrix
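The cleaning steps listed above can be sketched with pandas on a tiny invented frame; the column names mirror the real Titanic CSV, but the rows are made up:

```python
import pandas as pd

df = pd.DataFrame({
    "Age": [22.0, None, 35.0],               # one missing value, as in the real data
    "Sex": ["male", "female", "female"],
    "Fare": [7.25, 71.28, 8.05],
})

# Handle missing values: fill missing ages with the median age.
df["Age"] = df["Age"].fillna(df["Age"].median())

# Encode text data into numbers: map sex to 0/1.
df["Sex"] = df["Sex"].map({"male": 0, "female": 1})
```

Accuracy and a confusion matrix can then be computed on the model's predictions, e.g. with `sklearn.metrics.confusion_matrix`.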
Why this project matters: This is where ML stops being clean and becomes realistic. Most real datasets look like this.
Complexity: ★★★☆☆
---
MNIST Handwritten Digits Dataset: http://yann.lecun.com/exdb/mnist/
Type of ML: Supervised Learning → Multi-class Classification
What the data is: Images of handwritten digits (0–9), each image is a 28×28 grid of pixel values.
What I will learn:
- How images become tensors
- Batch training and DataLoaders
- When and why deeper networks can perform better
- Softmax, CrossEntropyLoss, and logits
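The "images become tensors" and DataLoader points above can be sketched without downloading anything; random tensors stand in for the real MNIST images here:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

images = torch.rand(64, 1, 28, 28)    # (N, channels, height, width), pixel values in 0-1
labels = torch.randint(0, 10, (64,))  # digit classes 0-9

# Batch training: the DataLoader yields shuffled mini-batches.
loader = DataLoader(TensorDataset(images, labels), batch_size=16, shuffle=True)

for x, y in loader:
    # Flatten each 28x28 image into a 784-dim vector for a plain linear layer.
    x = x.view(x.size(0), -1)
    break
```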
Why this project matters: This is the gateway to computer vision and deep learning.
Complexity: ★★★★☆
BONUS (When I’m confident)
---
Customer Segmentation Dataset: https://www.kaggle.com/datasets/vjchoudhary7/customer-segmentation-tutorial
Type of ML: Unsupervised Learning → Clustering
What the data is: Customer behavior data (age, income, spending score), with NO labels.
What I will learn:
- How models find patterns without answers
- Clustering (KMeans)
- Feature scaling importance
- Interpreting results without accuracy scores
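The KMeans-plus-scaling idea above can be sketched on synthetic data; the real dataset's columns are age, income, and spending score, but the numbers below are invented:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
customers = np.vstack([
    rng.normal([25, 30_000, 80], [3, 3_000, 5], size=(20, 3)),  # young high spenders
    rng.normal([55, 90_000, 20], [3, 5_000, 5], size=(20, 3)),  # older low spenders
])

# Scaling matters: otherwise income's large magnitude dominates the distances.
X = StandardScaler().fit_transform(customers)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
```

With no ground-truth labels, the result is judged by inspecting the clusters (or metrics like silhouette score), not by accuracy.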
Why this project matters: This is how ML discovers structure instead of being told.
Complexity: ★★★☆☆