A machine learning project that implements a supervised machine learning technique.
The aim of this project is to classify handwritten digits taken from the MNIST dataset using the KNN algorithm.
Python, NumPy, Pandas, Matplotlib
- KNN stands for K-Nearest Neighbours. It is one of the simplest machine learning algorithms and is based on the supervised learning technique.
- The KNN algorithm stores all the available training data and classifies a new data point by its similarity to the stored points. It is a non-parametric algorithm because it makes no assumption about the underlying data distribution.
- When new data appears, it can be assigned to the best-matching category using the KNN algorithm.
- K is the hyperparameter that defines how many nearest neighbours are considered when classifying a data point.
- The Python libraries used are NumPy, Pandas and Matplotlib.
- Data is imported from the MNIST dataset. Because this is a supervised learning technique, the model is given a collection of training data (features) together with the output expected for each training example (labels).
- The data is divided into training and testing sets. The training set is used to build the model with the KNN algorithm; the testing set is used to check how well the built model works and how accurate it is.
- Various functions are defined for cleaning the data and training the model, using a few built-in functionalities.
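
The steps above (store the training data, measure similarity, vote among the K nearest neighbours, then check accuracy on held-out data) can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the function names `train_test_split`, `knn_predict` and `accuracy` are hypothetical, Euclidean distance is an assumed similarity measure, and a small synthetic dataset stands in for MNIST so the example runs without any download.

```python
import numpy as np

def train_test_split(features, labels, test_fraction=0.2, seed=0):
    """Shuffle the samples and split them into training and testing parts."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(features))
    n_test = int(len(features) * test_fraction)
    test_idx, train_idx = order[:n_test], order[n_test:]
    return features[train_idx], labels[train_idx], features[test_idx], labels[test_idx]

def knn_predict(train_x, train_y, query, k=5):
    """Classify one query point by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every stored training sample (assumed metric).
    distances = np.sqrt(((train_x - query) ** 2).sum(axis=1))
    # Indices of the k smallest distances.
    nearest = np.argsort(distances)[:k]
    # Majority vote over the neighbours' labels.
    values, counts = np.unique(train_y[nearest], return_counts=True)
    return values[np.argmax(counts)]

def accuracy(train_x, train_y, test_x, test_y, k=5):
    """Fraction of test samples whose predicted label matches the true label."""
    predictions = np.array([knn_predict(train_x, train_y, q, k) for q in test_x])
    return float((predictions == test_y).mean())

# Synthetic stand-in for MNIST (assumption): two well-separated clusters of
# 784-dimensional points, matching the size of a flattened 28x28 image.
rng = np.random.default_rng(42)
class_0 = rng.normal(loc=0.0, scale=1.0, size=(100, 784))
class_1 = rng.normal(loc=5.0, scale=1.0, size=(100, 784))
features = np.vstack([class_0, class_1])
labels = np.array([0] * 100 + [1] * 100)

train_x, train_y, test_x, test_y = train_test_split(features, labels)
score = accuracy(train_x, train_y, test_x, test_y, k=5)
print(f"accuracy: {score:.2f}")
```

With real MNIST data, `features` would hold the flattened pixel values and `labels` the digits 0 to 9; the value of `k` is a choice to tune, not a fixed setting.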
The KNN algorithm used for building this model can be used for:
- Building recommendation systems: KNN is an excellent baseline approach for such systems. Many companies, such as Netflix, Amazon and YouTube, make personalized recommendations for their consumers.
- Searching for semantically similar documents: each document is represented as a vector. If two documents are close to each other, they cover similar topics.
- Detecting outliers: for example, credit card fraud detection on the internet.
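
As a rough illustration of the document-similarity use case, the sketch below turns each document into a word-count vector and returns the nearest neighbours of a query. The helper names (`to_vector`, `cosine_similarity`, `most_similar`) and the sample documents are hypothetical, and cosine similarity over bag-of-words counts is only one assumed way to measure closeness; real systems usually use richer vector representations.

```python
import math
from collections import Counter

def to_vector(text):
    """Turn a document into a bag-of-words count vector (word -> count)."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def most_similar(query, documents, k=1):
    """Return the k documents closest to the query, most similar first."""
    query_vec = to_vector(query)
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(query_vec, to_vector(doc)),
        reverse=True,
    )
    return ranked[:k]

# Toy corpus (made up for illustration).
documents = [
    "the cat sat on the mat",
    "stock prices fell sharply today",
    "a cat and a dog played on the mat",
]
result = most_similar("cat on a mat", documents, k=2)
print(result)
```

The two cat-related documents rank ahead of the finance one because they share words with the query, which is the sense in which "close" vectors indicate similar topics.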
This project is a basic implementation of supervised machine learning using the KNN algorithm; the model is built to classify handwritten digits from the MNIST dataset. The KNN algorithm can be extended further to build other models based on supervised machine learning.