Unlock the power of deep learning for image classification with this **Handwritten Digit Recognition** project! This Jupyter Notebook demonstrates how to build, train, and evaluate a **Convolutional Neural Network (CNN)** using TensorFlow/Keras to accurately classify handwritten digits from the famous MNIST dataset. It covers data preparation, CNN architecture design, model training, and making predictions on new images. Perfect for understanding the fundamentals of CNNs for computer vision tasks! 🚀
This project focuses on building and training a **Convolutional Neural Network (CNN)** for the task of handwritten digit recognition. CNNs are particularly well-suited for image-based tasks due to their ability to automatically learn hierarchical features from raw pixel data. This notebook provides a practical guide to:
- Data Loading & Preprocessing: Loading the MNIST dataset, reshaping images for CNN input, and normalizing pixel values.
- CNN Model Definition: Constructing a sequential CNN model with `Conv2D`, `MaxPool2D`, `Flatten`, and `Dense` layers using Keras.
- Model Compilation & Training: Configuring the optimizer, loss function, and metrics, then training the model on the dataset.
- Prediction on New Images: Demonstrating how to load an external image, preprocess it, and use the trained model to predict the digit.
This project exclusively uses the MNIST (Modified National Institute of Standards and Technology) dataset. It is a classic dataset in machine learning and computer vision, consisting of:
- Training Set: 60,000 examples of handwritten digits.
- Test Set: 10,000 examples for evaluating model performance.
Each example is a 28x28 pixel grayscale image, associated with a label from 0 to 9, representing the digit depicted. The dataset is directly available through `tensorflow.keras.datasets.mnist.load_data()`, making it easy to access and use.
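As a rough illustration of the array shapes described above, the sketch below uses NumPy only. A small random array stands in for the real MNIST images, an assumption made so the example runs without downloading the dataset; the names `fake_images` and `fake_labels` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for MNIST: 5 grayscale images of 28x28 pixels, uint8 in [0, 255].
fake_images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)
# One integer label per image, each in the range 0..9.
fake_labels = rng.integers(0, 10, size=5)

# Add a trailing channel dimension (1 channel for grayscale) and scale to [0, 1].
prepared = fake_images.reshape((-1, 28, 28, 1)) / 255.0

print(prepared.shape)  # (5, 28, 28, 1)
print(prepared.dtype)  # float64
```

With the real dataset, the leading dimension would be 60,000 for the training set and 10,000 for the test set.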
The Convolutional Neural Network (CNN) designed for this task is a sequential model. Its architecture is structured to effectively extract features from image data:
- Convolutional Layer (`Conv2D`): Applies filters to extract features (e.g., edges, textures). Uses `relu` activation.
- Pooling Layer (`MaxPool2D`): Reduces the spatial dimensions of the feature maps, helping to make the model more robust to small shifts in input.
- Flatten Layer: Converts the 2D feature maps into a 1D vector to be fed into dense layers.
- Dense Hidden Layer: A fully connected layer for learning complex patterns from the flattened features. Uses `relu` activation.
- Output Layer (`Dense`): The final fully connected layer with `softmax` activation, outputting probabilities for each of the 10 digit classes (0-9).
The model is compiled with the following settings:
- Optimizer: Adam (Adaptive Moment Estimation), an efficient stochastic optimization algorithm.
- Loss Function: Sparse Categorical Crossentropy - suitable for integer-encoded labels in multi-class classification.
- Metrics: Accuracy - measures the proportion of correctly classified images.
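To make the loss and metric concrete, here is a minimal NumPy sketch of what sparse categorical crossentropy and accuracy compute. It is not the Keras implementation; the function names, the `eps` clipping value, and the example scores are assumptions for illustration.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax; subtracting the row max keeps exp() numerically stable."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def sparse_categorical_crossentropy(labels, probs, eps=1e-7):
    """Mean of -log(probability assigned to the true class); labels are integers."""
    true_class_probs = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(np.clip(true_class_probs, eps, 1.0))))

def accuracy(labels, probs):
    """Fraction of rows whose highest-probability class equals the label."""
    return float(np.mean(np.argmax(probs, axis=1) == labels))

# Hypothetical raw scores for 3 images over the 10 digit classes.
logits = np.zeros((3, 10))
logits[0, 3] = 5.0  # confident and correct (label 3)
logits[1, 7] = 5.0  # confident and correct (label 7)
logits[2, 1] = 5.0  # confident but wrong (label 4)
labels = np.array([3, 7, 4])

probs = softmax(logits)
loss = sparse_categorical_crossentropy(labels, probs)
acc = accuracy(labels, probs)
print(f"loss={loss:.4f} accuracy={acc:.4f}")
```

The single confident mistake dominates the loss, while accuracy only records that two of three predictions were right. The labels stay as plain integers, which is why the "sparse" variant fits MNIST without one-hot encoding.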
Key features:
- 🚀 End-to-End Recognition: Covers data loading, CNN model building, training, and prediction.
- 🔍 Image Preprocessing: Demonstrates necessary steps like reshaping and normalization for image data.
- 📈 Epoch Experimentation: Includes a section to demonstrate the impact of increasing the number of training epochs on model performance.
- 🖼️ External Image Prediction: Shows how to load and predict on a custom image file, making the model practical.
To run this project, ensure you have the following installed:
- Python 3.x
- Jupyter Notebook (or JupyterLab, Google Colab)
- Required Libraries: install with `pip install tensorflow numpy matplotlib`. (Note: `keras` is part of `tensorflow` in newer versions.)
- Download the Notebook: Download `Handwritten_Digit_Recognition.ipynb` from this repository. Alternatively, open it directly in Google Colab for a zero-setup experience.
- Prepare External Image (Optional): If you want to test with your own handwritten digit image, save it (e.g., as `download.png` or `Sample_Image.jpg`) in the same directory as the notebook, or update the `image_path` variable in the notebook accordingly.
- Install Dependencies: `pip install tensorflow numpy matplotlib`
- Run the Notebook: Open `Handwritten_Digit_Recognition.ipynb` in Jupyter or Colab. Execute each cell sequentially to train the CNN and make predictions!
The notebook will display training progress and prediction results.

    Epoch 1/10
    1875/1875 ━━━━━━━━━━━━━━━━━━━━ 36s 18ms/step - accuracy: 0.9094 - loss: 0.3103
    Epoch 2/10
    1875/1875 ━━━━━━━━━━━━━━━━━━━━ 38s 17ms/step - accuracy: 0.9832 - loss: 0.0548
    ...
    Epoch 10/10
    1875/1875 ━━━━━━━━━━━━━━━━━━━━ 33s 18ms/step - accuracy: 0.9981 - loss: 0.0053

    1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 97ms/step
    Predicted class: 3

This shows the predicted digit for an input image.
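The `1875/1875` counter in the training log is the number of batches processed per epoch. Assuming `model.fit` is called without a `batch_size` argument, Keras falls back to its default batch size of 32, and the arithmetic works out as sketched here:

```python
import math

num_training_examples = 60_000  # size of the MNIST training set
batch_size = 32                 # Keras default when fit() is called without batch_size

# Each epoch walks through the whole training set once, one batch at a time.
steps_per_epoch = math.ceil(num_training_examples / batch_size)
print(steps_per_epoch)  # 1875
```

Passing a larger `batch_size` to `fit` would lower this count proportionally.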
Key parts of the notebook's code structure:

    from tensorflow.keras.datasets import mnist
    import numpy as np

    # Loading data
    (X_train, y_train), (X_test, y_test) = mnist.load_data()

    # Reshaping data for CNN input (add channel dimension)
    X_train = X_train.reshape((X_train.shape[0], X_train.shape[1], X_train.shape[2], 1))
    X_test = X_test.reshape((X_test.shape[0], X_test.shape[1], X_test.shape[2], 1))

    # Normalizing the pixel values to [0, 1]
    X_train = X_train / 255.0
    X_test = X_test / 255.0


    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Dense, Flatten, Conv2D, MaxPool2D

    model = Sequential()

    # Adding convolution layer
    model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))

    # Adding pooling layer
    model.add(MaxPool2D(2, 2))

    # Adding fully connected layers (after flattening)
    model.add(Flatten())
    model.add(Dense(100, activation='relu'))

    # Adding output layer
    model.add(Dense(10, activation='softmax'))

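The layer stack above can be checked by hand. The sketch below is plain Python arithmetic and needs no TensorFlow; it assumes the Keras defaults of no padding and stride 1 for `Conv2D`, and non-overlapping 2x2 windows for `MaxPool2D(2, 2)`. The helper names are hypothetical.

```python
def conv2d_output(height, width, kernel, filters):
    """Output shape of an unpadded ('valid'), stride-1 convolution."""
    return (height - kernel + 1, width - kernel + 1, filters)

def maxpool_output(height, width, channels, pool):
    """Output shape of non-overlapping pooling (stride equal to the pool size)."""
    return (height // pool, width // pool, channels)

conv_shape = conv2d_output(28, 28, kernel=3, filters=32)   # (26, 26, 32)
pool_shape = maxpool_output(*conv_shape, pool=2)           # (13, 13, 32)
flat_size = pool_shape[0] * pool_shape[1] * pool_shape[2]  # 5408

# Trainable parameters: weights plus one bias per output unit or filter.
conv_params = (3 * 3 * 1 + 1) * 32     # 320
hidden_params = (flat_size + 1) * 100  # 540900
output_params = (100 + 1) * 10         # 1010
total_params = conv_params + hidden_params + output_params

print(conv_shape, pool_shape, flat_size, total_params)
```

Almost all of the parameters sit in the first dense layer, which is one reason adding more convolution and pooling stages before `Flatten` can shrink the model. Calling `model.summary()` in the notebook should report the same shapes and counts.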

    # Compiling the model
    model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

    # Fitting the model
    model.fit(X_train, y_train, epochs=10)  # Or 20, as shown in the notebook for increased epochs

    from tensorflow.keras.utils import load_img, img_to_array

    # Load and preprocess an external image for prediction
    image_path = '/content/download.png'  # Update with your image path
    image = load_img(image_path, target_size=(28, 28), color_mode='grayscale')
    image_array = img_to_array(image)
    image_array = image_array / 255.0
    image_array = image_array.reshape((1, 28, 28, 1))

    # Make prediction
    prediction = model.predict(image_array)
    predicted_class = np.argmax(prediction)
    print('Predicted class:', predicted_class)

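One practical caveat, offered as an assumption about typical inputs rather than something the notebook does: MNIST digits are light strokes on a dark background, whereas a photo or scan of handwriting is usually dark ink on a light background. Below is a minimal sketch of an optional inversion step, using a hypothetical helper `invert_if_light_background` and a synthetic array in place of a real image.

```python
import numpy as np

def invert_if_light_background(image_array, threshold=0.5):
    """Invert a [0, 1]-scaled image when its mean brightness suggests a light background.

    The mean-brightness heuristic and the 0.5 threshold are illustrative assumptions.
    """
    if image_array.mean() > threshold:
        return 1.0 - image_array
    return image_array

# Synthetic example: a white page (1.0) with a dark vertical stroke (0.0).
page = np.ones((1, 28, 28, 1))
page[0, 4:24, 13:15, 0] = 0.0

mnist_like = invert_if_light_background(page)
print(mnist_like.mean())
```

In the notebook, such a step would be applied to `image_array` after scaling to [0, 1] and before calling `model.predict`.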
Here are some ways to extend and experiment with this project:
- 🎨 Different CNN Architectures: Try adding more `Conv2D` and `MaxPool2D` layers, or experiment with different filter sizes and numbers.
- 🧪 Different Optimizers/Activations: Experiment with other optimizers (e.g., SGD, RMSprop) or activation functions (e.g., Leaky ReLU, ELU) in the hidden layers.
- 📊 Performance Metrics: Calculate and display additional evaluation metrics like precision, recall, F1-score, or a classification report.
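- 🖼️ Data Augmentation: Implement simple data augmentation techniques (e.g., rotation, shifting, zooming) using `ImageDataGenerator` or, in newer Keras versions, preprocessing layers such as `RandomRotation`, `RandomTranslation`, and `RandomZoom`, to improve model generalization.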
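As a starting point for the performance-metrics idea above, here is a NumPy-only sketch of per-class precision, recall, and F1 computed from a confusion matrix. The function names and the tiny label arrays are hypothetical and are not results from the notebook.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes=10):
    """Rows are true classes, columns are predicted classes."""
    matrix = np.zeros((num_classes, num_classes), dtype=int)
    np.add.at(matrix, (y_true, y_pred), 1)
    return matrix

def per_class_metrics(matrix):
    """Precision, recall and F1 per class; undefined ratios are reported as 0."""
    true_pos = np.diag(matrix).astype(float)
    predicted = matrix.sum(axis=0)  # how often each class was predicted
    actual = matrix.sum(axis=1)     # how often each class really occurred
    precision = np.divide(true_pos, predicted, out=np.zeros_like(true_pos), where=predicted > 0)
    recall = np.divide(true_pos, actual, out=np.zeros_like(true_pos), where=actual > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom, out=np.zeros_like(true_pos), where=denom > 0)
    return precision, recall, f1

# Hypothetical labels and predictions for a 3-class toy problem.
y_true = np.array([0, 1, 1, 2, 2, 2])
y_pred = np.array([0, 1, 2, 2, 2, 1])

cm = confusion_matrix(y_true, y_pred, num_classes=3)
precision, recall, f1 = per_class_metrics(cm)
print(cm)
print(precision, recall, f1)
```

For the trained model, `y_pred` could be obtained with `np.argmax(model.predict(X_test), axis=1)` and compared against `y_test` with `num_classes=10`.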
Contributions are always welcome! If you have ideas for improvements, new features, or just want to fix a bug, please feel free to open an issue or submit a pull request. Let’s make handwritten digit recognition even better! 🌟
Star this repo if you find it helpful! ⭐
Created with 💖 by Chirag