MNIST Digit Classification with Convolutional Neural Network

MNIST digit classification is a classic computer vision task. I tried several methods to achieve the best accuracy on the dataset while keeping resource consumption to a minimum.

Sequential CNN Architecture

The CNN architecture follows a progressive feature extraction approach:

First Convolutional Block uses 32 filters with a 3×3 kernel and same padding to extract low-level features such as edges and corners. The MaxPooling layer downsamples the feature maps, building a spatial hierarchy, while Dropout (0.3) provides initial regularization against overfitting.

Second Convolutional Block deepens the network with 64 filters for medium-level feature extraction. BatchNormalization stabilizes training and accelerates convergence, followed by ReLU activation for non-linear transformation. Dropout rate increases to 0.4 to handle the growing complexity.

Third Convolutional Block employs 128 filters for high-level feature detection. MaxPooling with stride 2 reduces dimensionality efficiently, while Dropout (0.5) provides strong regularization before the classification stage.

The architecture concludes with a Flatten layer converting 3D feature maps to 1D vectors, feeding into a Dense layer with softmax activation for 10-class classification.
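As a sanity check on the description above, the feature-map shapes and parameter counts can be traced by hand. The sketch below does this in plain Python. It assumes details the text does not state: a 28×28×1 input, 3×3 kernels and same padding in all three blocks, 2×2 pooling windows with stride 2, and no pooling in the second block. The helper names `conv_params` and `trace_cnn` are made up for illustration.

```python
# Trace output shapes and parameter counts for the described layer stack.
# Assumed (not stated in the text): 28x28x1 input, 3x3 kernels and "same"
# padding everywhere, 2x2 pooling with stride 2, no pooling in block 2.
# Dropout and ReLU change neither shapes nor parameter counts, so they are omitted.

def conv_params(kernel, in_ch, out_ch):
    """Convolution weights plus one bias per output filter."""
    return kernel * kernel * in_ch * out_ch + out_ch


def trace_cnn(height=28, width=28, channels=1):
    rows = []  # (layer name, output shape, parameter count)

    # Block 1: 32 filters; "same" padding keeps 28x28, pooling halves it.
    rows.append(("conv1", (height, width, 32), conv_params(3, channels, 32)))
    height, width = height // 2, width // 2
    rows.append(("pool1", (height, width, 32), 0))

    # Block 2: 64 filters, then BatchNormalization, which stores 4 values
    # per channel (gamma, beta, moving mean, moving variance).
    rows.append(("conv2", (height, width, 64), conv_params(3, 32, 64)))
    rows.append(("batchnorm", (height, width, 64), 4 * 64))

    # Block 3: 128 filters, then pooling with stride 2.
    rows.append(("conv3", (height, width, 128), conv_params(3, 64, 128)))
    height, width = height // 2, width // 2
    rows.append(("pool3", (height, width, 128), 0))

    # Classifier: flatten the 3D feature maps, then a 10-way Dense layer.
    flat = height * width * 128
    rows.append(("flatten", (flat,), 0))
    rows.append(("dense", (10,), flat * 10 + 10))
    return rows


layers = trace_cnn()
total_params = sum(count for _, _, count in layers)

for name, shape, count in layers:
    print(f"{name:10s} {str(shape):16s} {count:>8d}")
print("total parameters:", total_params)
```

Under these assumptions the flattened vector has 6,272 entries and the model has 155,658 parameters, about 40% of them in the final Dense layer. The actual script may use different settings, so the numbers are indicative only.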

With this progressive increase in filters for hierarchical feature learning, I obtained the following results:

Depthwise Separable Convolution

Unlike the standard convolution used in the previous model, which performs spatial filtering and channel combination in a single step, separable convolution splits the process into two parts: a depthwise convolution and a pointwise convolution. This method provides better accuracy while maintaining the same resource consumption. The depthwise convolution applies a single filter to each input channel, capturing spatial details independently per channel, while the pointwise (1×1) convolution combines these features across channels. This decomposition reduces the parameter count, making the model more efficient and less prone to overfitting.
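To make the parameter saving concrete, here is a small worked example in plain Python. It compares the weight count of a standard convolution with its depthwise separable counterpart for one layer. The layer size (3×3 kernel, 32 input channels, 64 output channels) is an assumption chosen for illustration, biases are ignored, and a depth multiplier of 1 is assumed. The function names are hypothetical.

```python
# Weight counts for one convolutional layer, biases ignored.
# The layer size below is an illustrative assumption, not taken from the repo.

def standard_conv_weights(kernel, in_ch, out_ch):
    # Every output filter spans the full kernel window across all input channels.
    return kernel * kernel * in_ch * out_ch


def separable_conv_weights(kernel, in_ch, out_ch):
    depthwise = kernel * kernel * in_ch  # one k x k filter per input channel
    pointwise = in_ch * out_ch           # 1x1 convolution that mixes channels
    return depthwise + pointwise


standard = standard_conv_weights(3, 32, 64)    # 18432
separable = separable_conv_weights(3, 32, 64)  # 288 + 2048 = 2336
ratio = round(standard / separable, 1)         # 7.9

print(standard, separable, ratio)
```

For this layer the separable version needs roughly 8 times fewer weights. The saving grows with the number of output channels and the kernel size.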

MobileNetV2

Using MobileNetV2 for the MNIST dataset is excessive and leads to suboptimal accuracy for several reasons. MobileNetV2 has millions of parameters, making it overly complex for classifying simple 28×28 pixel handwritten digits. Such a deep architecture is unnecessary given the simplicity of MNIST and often results in overfitting. This is reflected in the accuracy score obtained.
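One way to see the mismatch between the network and the data is to follow the spatial resolution through the model. MobileNetV2 was designed around 224×224 ImageNet images and reduces resolution by a factor of 32 through five stride-2 stages. The sketch below traces a 28×28 input through those stages, assuming each stage rounds the size up as same padding does. It does not model how the script in this repository prepares its inputs (for example, whether it upscales the digits). The function name `downsample_trace` is hypothetical.

```python
import math

# Spatial size after each of MobileNetV2's five stride-2 stages.
# Assumption: each stage maps size n to ceil(n / 2), as "same" padding would.

def downsample_trace(size, num_stride2_stages=5):
    sizes = [size]
    for _ in range(num_stride2_stages):
        size = math.ceil(size / 2)
        sizes.append(size)
    return sizes


mnist_trace = downsample_trace(28)      # [28, 14, 7, 4, 2, 1]
imagenet_trace = downsample_trace(224)  # [224, 112, 56, 28, 14, 7]

print("MNIST   :", mnist_trace)
print("ImageNet:", imagenet_trace)
```

A raw MNIST digit collapses to a single spatial position before the last layers, whereas an ImageNet-sized input still has a 7×7 grid at that point. Most of the network's depth therefore has very little spatial information to work with on this dataset.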
