EMNIST Character Recognition

This repository demonstrates the use of the UbiquitousNeuralNetworks Java library to train and deploy Multilayer Perceptron (MLP) models for recognizing handwritten digits and letters from the EMNIST dataset.

🧠 Overview

This project includes:

Programs to train MLP models using EMNIST (digits 🔢 or letters 🔠);
Interactive programs that let users draw a character ✏️ for the trained MLP to recognize 🤔💭.

📦 Requirements and dependencies

Java 11+
UbiquitousNeuralNetworks library, via Maven dependency.
OpenCSV library, via Maven dependency.
EMNIST dataset (optional)
- The dataset should be downloaded if you wish to train your own models.
- You should extract the digits and letters datasets (CSV files) into the proper folder structure, presented below.

🧩 Project Structure

├── dataset/                            # EMNIST dataset files folder
│   ├── digits/                         
│   │   ├─ emnist-digits-train.csv      # EMNIST digits train dataset (optional)
│   │   └─ emnist-digits-test.csv       # EMNIST digits test dataset (optional)
│   └── letters/                        
│       ├─ emnist-letters-train.csv     # EMNIST letters train dataset (optional)
│       └─ emnist-letters-test.csv      # EMNIST letters test dataset (optional)
├── src/                                # Source code (Java)
│   └── ...                             # Packages and programs
├── models/                             # Pre-trained models (JSON)
└── ...                                 # Other project files

✏️ Running the Recognition Programs

The repository already provides pre-trained models, so you can run the recognition programs straight away! ⚡

Run the DigitsRecognizer or LettersRecognizer program;
Draw a character and check the model response.
- For each digit/letter you draw, the recognition result is displayed in the console, as depicted below.
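
For a softmax output layer, the recognized character is usually taken to be the class with the highest score. The sketch below illustrates that step in plain Java; it is an assumption about how a recognition result can be derived, not the code used by DigitsRecognizer or LettersRecognizer. The class name `ArgmaxSketch` and the sample scores are hypothetical.

```java
final class ArgmaxSketch {

    /** Returns the index of the largest score (the first one on ties). */
    static int argmax(double[] scores) {
        if (scores == null || scores.length == 0) {
            throw new IllegalArgumentException("scores must not be empty");
        }
        int best = 0;
        for (int i = 1; i < scores.length; i++) {
            if (scores[i] > scores[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical network output for a drawn digit: one score per class 0..9.
        double[] output = {0.01, 0.02, 0.05, 0.80, 0.02, 0.02, 0.02, 0.02, 0.02, 0.02};
        int predicted = argmax(output);
        System.out.println("Predicted class index: " + predicted);
    }
}
```

How a class index maps to a printed digit or letter depends on the label encoding of the dataset and is not shown here.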

📈 Performance of pre-trained models

The performance of the provided pre-trained models (against the EMNIST test datasets) is:

Dataset	Accuracy
EMNIST Digits	98.37%
EMNIST Letters	85.31%

These models were obtained with the provided training programs.
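
Accuracy here is understood as the fraction of test samples whose predicted class matches the true label. A minimal sketch of that calculation, using made-up labels rather than real EMNIST results (the class name `AccuracySketch` is hypothetical and is not part of the repository):

```java
import java.util.Locale;

final class AccuracySketch {

    /** Fraction of positions where the predicted label equals the expected label. */
    static double accuracy(int[] predicted, int[] expected) {
        if (predicted.length != expected.length || expected.length == 0) {
            throw new IllegalArgumentException("arrays must be non-empty and of equal length");
        }
        int correct = 0;
        for (int i = 0; i < expected.length; i++) {
            if (predicted[i] == expected[i]) {
                correct++;
            }
        }
        return (double) correct / expected.length;
    }

    public static void main(String[] args) {
        // Toy labels for illustration only.
        int[] predicted = {3, 1, 4, 1, 5};
        int[] expected  = {3, 1, 4, 1, 6};
        double acc = accuracy(predicted, expected);
        System.out.println(String.format(Locale.ROOT, "Accuracy: %.2f%%", 100.0 * acc));
    }
}
```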

🏋️‍♂️ Training a Model

📃 Obtaining and converting datasets

You'll need to download the EMNIST dataset from Kaggle.
- ⚠️ This is a ~1.2 GB zip archive.
Extract the relevant files into the project structure, as shown in the Project Structure section.
Convert the CSV dataset files into the format used by the UbiquitousNeuralNetworks library - more information about this format can be found in the wiki.
- Just run the EMNISTConverter program in the dataset package.
- This creates the corresponding .data files; you can delete the .csv files afterwards, if you wish.
You can inspect the datasets with the DatasetInspector program in the dataset package.
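
To give an idea of what the conversion step has to read, the sketch below parses a single CSV row. It assumes the common EMNIST CSV layout of one sample per row, with the label in the first column followed by 784 pixel values (28x28, each 0 to 255) and no header row; check your downloaded files before relying on this. It uses plain `String.split` instead of OpenCSV to stay self-contained, and it does not reproduce EMNISTConverter or the library's .data format. The names `EmnistRowSketch` and `Sample` are hypothetical.

```java
final class EmnistRowSketch {

    static final int PIXELS = 28 * 28;

    /** One parsed sample: class label plus raw pixel intensities. */
    static final class Sample {
        final int label;
        final int[] pixels;

        Sample(int label, int[] pixels) {
            this.label = label;
            this.pixels = pixels;
        }
    }

    /** Parses "label,p0,p1,...,p783" (assumed layout) into a Sample. */
    static Sample parseRow(String line) {
        String[] fields = line.trim().split(",");
        if (fields.length != PIXELS + 1) {
            throw new IllegalArgumentException(
                "expected " + (PIXELS + 1) + " columns but found " + fields.length);
        }
        int label = Integer.parseInt(fields[0].trim());
        int[] pixels = new int[PIXELS];
        for (int i = 0; i < PIXELS; i++) {
            int value = Integer.parseInt(fields[i + 1].trim());
            if (value < 0 || value > 255) {
                throw new IllegalArgumentException("pixel out of range: " + value);
            }
            pixels[i] = value;
        }
        return new Sample(label, pixels);
    }

    public static void main(String[] args) {
        // Synthetic row: label 7, first pixel 255, all remaining pixels 0.
        StringBuilder row = new StringBuilder("7,255");
        for (int i = 1; i < PIXELS; i++) {
            row.append(",0");
        }
        Sample sample = parseRow(row.toString());
        System.out.println("label=" + sample.label + ", pixels=" + sample.pixels.length);
    }
}
```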

🚀 Define the network structure and train a model

Once you have the datasets, you can train and test your own models 😊.

See the DigitsModelCreate and LettersModelCreate programs for complete examples, including testing.

A minimal working example:

// Load the converted training set and apply min-max normalization to it
Dataset trainSet = new Dataset("dataset/digits/emnist-digits-train.data");
DatasetNormalization normalization = new MinMaxNormalization(trainSet);
normalization.normalize(trainSet);

// Define the network: two ReLU hidden layers and a softmax output layer
MLPNetwork network = new MLPNetwork.Builder()
  .addInputLayer(trainSet.inputDimensionality())
  .addHiddenLayer(48, ReLUActivation.class, 0.1)
  .addHiddenLayer(16, ReLUActivation.class, 0.1)
  .addOutputLayer(trainSet.outputDimensionality(), SoftmaxActivation.class, 0)
  .withWeightInitializer(new HeInitializer())
  .build();

// Configure backpropagation: 20 epochs with cross-entropy loss
Backpropagation backpropagation = new Backpropagation.Builder(trainSet, network)
  .withLearningRate(0.001)
  .withBiasUpdate(true)
  .forNumberEpochs(20)
  .withLossFunction(CrossEntropyLoss.class)
  .build();

// Run the training loop
backpropagation.trainNetwork();
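
The example above pairs a softmax output layer with a cross-entropy loss. As background, the following standalone sketch shows the usual definitions of both for a single sample. It is written from the textbook formulas and is not the implementation inside UbiquitousNeuralNetworks; the class name `SoftmaxCrossEntropySketch` and the sample values are hypothetical.

```java
final class SoftmaxCrossEntropySketch {

    /** Numerically stable softmax: subtracts the maximum before exponentiating. */
    static double[] softmax(double[] logits) {
        double max = Double.NEGATIVE_INFINITY;
        for (double v : logits) {
            max = Math.max(max, v);
        }
        double[] out = new double[logits.length];
        double sum = 0.0;
        for (int i = 0; i < logits.length; i++) {
            out[i] = Math.exp(logits[i] - max);
            sum += out[i];
        }
        for (int i = 0; i < out.length; i++) {
            out[i] /= sum;
        }
        return out;
    }

    /** Cross-entropy for one sample with a single true class: -log(p[target]). */
    static double crossEntropy(double[] probabilities, int targetClass) {
        // Clamp to avoid log(0) when the predicted probability underflows.
        return -Math.log(Math.max(probabilities[targetClass], 1e-12));
    }

    public static void main(String[] args) {
        // Hypothetical raw output-layer activations for a 3-class problem.
        double[] logits = {2.0, 1.0, 0.1};
        double[] probabilities = softmax(logits);
        System.out.println("p(class 0) = " + probabilities[0]);
        System.out.println("loss if class 0 is correct = " + crossEntropy(probabilities, 0));
    }
}
```

The loss shrinks toward zero as the probability assigned to the correct class approaches 1, which is what training with this loss pushes the network toward.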

💾 Model Persistence

Models can be saved to and loaded from JSON files:

MLPNetwork model = ...;

// After training
MLPNetwork.saveJSON(model, "models/my_model.json");

// Later
MLPNetwork loadedModel = MLPNetwork.loadJSON("models/my_model.json");

This enables quick reuse of previously trained models.

📜 License

This project is released under the MIT License. See the LICENSE file for details.

🤵 Authors

Original author: Bruno Silva - (GitHub page) | (Personal page) | (🇵🇹 CIÊNCIA VITAE)
