Handwritten character recognition is a typical task of image classification, which is to assigning an input image one label from a fixed set of categories. It enables the ability of a computer to interpret handwritings from images, photographs, paper texts, or pdfs. The computer finds the most plausible words for a given text with unclear words. This project developed a toolbox to recognize an English character from an image.
In this repository, I am using Keras to train a model in handritting recognition from a training sample of around 40,000 characters of the english alphabet, located in train.csv within seperate folder data. After which, the model is saved and loaded to test its accuracy against 10,000 characters from test.csv, located within the same folder data. Although I tried different approaches to increase the training and validation accuracy in the training stage, final accuracy of the testing data plateus at around 6 to 7 percentage points, seen in the table below. The main functions can be found in folder main.
- Python3
- numpy
- matplotlib
- scikit-image
- keras
- tensorflow (for keras machine learning)
- skLearn (For datasets)
- Run
python3 predict.py [image_path], to get the prediction result from a given image path. - Run
python3 training.py, to train a new model. Training data specification is mentioned API reference. - Run
python3 testing.py, to test the current model. Test images should be in csv format in seperate folderdata.
The provided model is trained and tested using the data from a Kaggle competition. The accuracy of the model according to different parameter settings is shown in the table below:
| # Model | Model fit type | Epoch | Batch size | Testing accuray |
|---|---|---|---|---|
| larger_CNN (convolution2D:30x3x3, 50x3x3) | Normal | 100 | 60 | 93.33% |
| larger_CNN (convolution2D:30x3x3, 50x3x3) | Generator | 50 | 60 | 92.84% |
| larger_CNN (convolution2D:30x3x3, 50x3x3) | Generator | 50 | 60 | 92.66% |
| larger CNN (convolution2D:30x5x5, 30X3X3) | Normal | 100 | 64 | 91.98% |
| simple CNN (convolution2D:32X5X5) | Normal | 100 | 64 | 90.74% |
| Baseline (2 layers) | Normal | 100 | 20 | 86.46% |
| Baseline (2 layers) | Normal | 200 | 64 | 85.03% |
Some real prediction examples is shown below:
: "s" (conf. = 99.83%),
: "f" (conf. = 52.98%),
: "i" (conf. = 100.00%).
datasets.py: Data should be in seperate folder 'data', in csv format. (look in sample.csv)models.py: Different models to train with, along with method to save, load, and call models.training.py: Trains the model (which you select), and saves it in seperate folder 'data'. There is a choice to remove validation images from training images. There is choice to use normal fit or fit generator.testing.py: Trains the model from file 'test.csv' in folder entitled 'data'.predict.py: Predicts the alphabet in the image the user passes to it.utils.py: Hosts miscellaneous programs, for keeping time, converting answers, or getting image samples etc.
A machine undergoing supervised learning has 'correct' outputs for each input, and is trained to generate the specific output to a minimal degree of error. On the other hand, a machine undergoing unsupervised learning does not have correct outputs for the inputs. The machine is thus trained to generate plausible outputs for each individual output.