An educational deep learning framework implemented from scratch using NumPy.

starswaterbrook/toynet

ToyNet

Python 3.12+ MIT License Type Checked Ruff codecov

An educational but robust deep learning framework implemented from scratch using NumPy.
Data loading is handled with Pandas.

ToyNet provides a clean, extensible architecture for building and training neural networks without external ML dependencies. It is well suited to learning the mathematical fundamentals of deep learning and to experimenting with custom architectures on small to medium datasets.

Features

Core Capabilities

  • Pure NumPy implementation - No external ML libraries, transparent mathematical operations
  • Educational focus - Clear, readable code designed for learning and understanding
  • Modular architecture - Easily extensible with custom components
  • Type safety - Full mypy compatibility with comprehensive type hints
  • Comprehensive tests - Unit, integration, and end-to-end tests with high coverage

Data Handling

  • Data loaders - Support for CSV files and in-memory arrays
  • Preprocessing - Programmable feature scaling and train/validation splits built into the data loaders
  • Batch processing - Efficient mini-batch training with configurable sizes

Neural Network Components

  • Layers - Configurable activation functions and weight initialization methods
  • Activation and loss functions - Classic activations and losses for common tasks, plus protocols for defining custom ones
  • Optimizers - Classic gradient descent (GD) variants and Adam with adaptive learning rates

Training Features

  • Training policies - Early stopping, learning rate scheduling, model checkpointing, and a protocol for custom policies
  • Model persistence - Save/load trained models in NumPy format
  • Logging - Comprehensive training progress tracking

Installation

From PyPI (Recommended)

pip install toynet-ml

Quick Start

Simple XOR Problem

import numpy as np
from toynet import MultiLayerPerceptron, Dense, BasicDataLoader
from toynet.functions import ReLU, Sigmoid, BinaryCrossEntropy
from toynet.optimizers import Adam

# Create XOR dataset
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]]).reshape(4, 1, 2)
y = np.array([[0], [1], [1], [0]]).reshape(4, 1, 1)

# Build network
network = MultiLayerPerceptron(
    layers=[
        Dense(2, 8, ReLU),
        Dense(8, 1, Sigmoid)
    ],
    loss_function=BinaryCrossEntropy,
    optimizer=Adam(learning_rate=0.01)
)

# Train
data_loader = BasicDataLoader(X, y, batch_size=2)
network.train(data_loader, epochs=1000)

# Make batch predictions
predictions = network(X)
predictions_rounded = np.round(predictions, 2)
print(f"Input: \n{X.reshape(4, 2)}")
print(f"Predictions: {predictions_rounded.flatten()}")

Simple Regression Problem: Predict y = 2x + 1

import numpy as np
from toynet import MultiLayerPerceptron, Dense, BasicDataLoader
from toynet.functions import ReLU, Identity, MeanSquaredError
from toynet.optimizers import Adam

# Create regression dataset
X = np.array([[-4], [-3], [-2], [-1], [0], [1], [2], [3], [4], [5], [6]]).reshape(11, 1, 1)
y = np.array([[-7], [-5], [-3], [-1], [1], [3], [5], [7], [9], [11], [13]]).reshape(11, 1, 1)

# Build network
network = MultiLayerPerceptron(
    layers=[
        Dense(1, 8, ReLU),
        Dense(8, 8, ReLU),
        Dense(8, 1, Identity)
    ],
    loss_function=MeanSquaredError,
    optimizer=Adam(learning_rate=0.01)
)

# Train
data_loader = BasicDataLoader(X, y, batch_size=4)
network.train(data_loader, epochs=1000)

# Make unseen data prediction
prediction = network(np.array([9]).reshape(1, 1)) # 2x + 1 = 19
print(f"Predictions: {prediction.flatten()}") 

Architecture Overview

ToyNet follows a modular, object-oriented design:

MultiLayerPerceptron
├── Layers (Dense)
│   ├── Input/Output dimensions
│   ├── Activation Functions (ReLU, Sigmoid, etc.)
│   └── Weight Initializers (He (default), Xavier)
├── Loss Function (CrossEntropy, MSE)
└── Optimizer (GD variants, Adam)

Training
├── Data Loader (Basic, CSV)
├── Fixed epochs
└── Training Policies (EarlyStopping, LR Scheduling etc.)
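The two weight initializers named in the overview can be sketched in plain NumPy. This is an illustration of the general techniques, not ToyNet's own code: the function names `he_normal` and `xavier_uniform` are hypothetical, and the README does not say whether ToyNet uses the normal or uniform variants.

```python
import numpy as np


def he_normal(fan_in: int, fan_out: int, rng: np.random.Generator) -> np.ndarray:
    """He initialization: zero-mean normal with variance 2 / fan_in."""
    return rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))


def xavier_uniform(fan_in: int, fan_out: int, rng: np.random.Generator) -> np.ndarray:
    """Xavier (Glorot) initialization: uniform in [-limit, limit]."""
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))


rng = np.random.default_rng(0)
W_he = he_normal(784, 256, rng)
W_xavier = xavier_uniform(784, 256, rng)

# The empirical spread should be close to the theoretical values.
print(W_he.shape, float(W_he.std()), float(np.sqrt(2.0 / 784)))
print(W_xavier.shape, float(np.abs(W_xavier).max()), float(np.sqrt(6.0 / (784 + 256))))
```

He initialization is commonly paired with ReLU layers and Xavier with sigmoid or tanh layers, which fits He being the default here.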

Comprehensive Example

End-to-end training on the Kaggle Digit Recognizer CSV dataset

import numpy as np

from toynet import Adam, CSVDataloader, Dense, MultiLayerPerceptron
from toynet.functions import (
    CategoricalCrossEntropy,
    ReLU,
    Softmax,
)
from toynet.policies import ReduceLROnPlateau, SaveBestModel, ValidationLossEarlyStop

if __name__ == "__main__":
    data_loader = CSVDataloader(
        "train.csv",
        batch_size=128,
        label_cols=["label"],
        validation_split=0.2,
        transform=lambda X, y: (X / 255.0, np.eye(10)[y.astype(int)]),
    )

    nnet = MultiLayerPerceptron(
        [
            Dense(784, 256, ReLU),
            Dense(256, 128, ReLU),
            Dense(128, 64, ReLU),
            Dense(64, 10, Softmax),
        ],
        loss_function=CategoricalCrossEntropy,
        optimizer=Adam(
            learning_rate=0.01,
        ),
    )

    nnet.train(
        data_loader,
        epochs=250,
        policies=[
            ValidationLossEarlyStop("mnist.npz", patience=6),
            ReduceLROnPlateau(factor=0.1, patience=4, min_lr=1e-6),
            SaveBestModel("mnist_best_checkpoint.npz", save_grace_period=20),
        ],
    )

Benchmarks:

  • Training time: ~40 minutes on a modern CPU
  • Best Kaggle test accuracy achieved: 96.6%

For production workloads, use PyTorch or TensorFlow, which offer GPU acceleration and distributed training.
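The `transform` argument in the example above does two things: it scales pixel values into [0, 1] and one-hot encodes the labels by indexing into a 10×10 identity matrix. The sketch below applies the same expression to a tiny made-up batch. The input shapes are assumptions for illustration; the README does not specify the exact shapes the data loader passes in.

```python
import numpy as np


def to_model_inputs(X: np.ndarray, y: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Same expression as the transform lambda in the example above."""
    return X / 255.0, np.eye(10)[y.astype(int)]


# Made-up mini-batch: 3 "images" of 4 pixels each, with labels 2, 0 and 9.
X = np.array([[0, 51, 102, 255],
              [255, 0, 0, 0],
              [128, 64, 32, 16]], dtype=float)
y = np.array([2.0, 0.0, 9.0])

X_scaled, y_onehot = to_model_inputs(X, y)

print(X_scaled.min(), X_scaled.max())   # pixel values now lie in [0, 1]
print(y_onehot.shape)                   # one row of length 10 per label
print(y_onehot.argmax(axis=1))          # argmax recovers the original labels
```

Indexing `np.eye(10)` with an integer array selects one identity row per label, so each output row has a single 1 at the label's position.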

Mathematical Foundations

ToyNet implements core neural network mathematics from scratch:

Forward Propagation

h = σ(Wx + b)

Where:

  • W: Weight matrix
  • x: Input vector
  • b: Bias vector
  • σ: Activation function
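As a minimal sketch of the formula above, the snippet below evaluates h = σ(Wx + b) for one layer with ReLU as σ. It uses the column-vector convention of the formula; ToyNet's internal array layout may differ, and the numbers are made up for illustration.

```python
import numpy as np


def relu(z: np.ndarray) -> np.ndarray:
    """ReLU activation: element-wise max(z, 0)."""
    return np.maximum(z, 0.0)


# A layer mapping 2 inputs to 3 outputs.
W = np.array([[1.0, -2.0],
              [0.5, 0.5],
              [-1.0, 1.0]])       # weight matrix, shape (3, 2)
x = np.array([2.0, 1.0])          # input vector, shape (2,)
b = np.array([0.0, -1.0, 0.5])    # bias vector, shape (3,)

z = W @ x + b   # pre-activation
h = relu(z)     # layer output

print(z)
print(h)
```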

Backpropagation

Automatic gradient computation using the chain rule, where z = Wx + b is the pre-activation and h = σ(z):

∂L/∂W = ∂L/∂h × ∂h/∂z × ∂z/∂W

Gradients are computed and stored in layer objects during backpropagation.
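The chain-rule factors can be checked numerically. The sketch below is not ToyNet's implementation: it assumes a single layer with a sigmoid activation and a squared-error loss L = ½‖h − t‖², computes ∂L/∂W analytically, and compares it against central finite differences.

```python
import numpy as np


def sigmoid(z: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-z))


def loss(W: np.ndarray, b: np.ndarray, x: np.ndarray, t: np.ndarray) -> float:
    """Squared-error loss of a single sigmoid layer."""
    h = sigmoid(W @ x + b)
    return 0.5 * float(np.sum((h - t) ** 2))


rng = np.random.default_rng(0)
W = rng.normal(size=(3, 2))
b = rng.normal(size=3)
x = np.array([0.5, -1.0])
t = np.array([1.0, 0.0, 1.0])   # made-up target

# Analytic gradient, one chain-rule factor at a time.
z = W @ x + b
h = sigmoid(z)
dL_dh = h - t                 # dL/dh for the squared-error loss
dh_dz = h * (1.0 - h)         # derivative of the sigmoid
delta = dL_dh * dh_dz         # dL/dz
grad_W = np.outer(delta, x)   # dL/dW, since dz_i/dW_ij = x_j
grad_b = delta                # dL/db, since dz/db is the identity

# Numerical gradient via central differences.
eps = 1e-6
num_grad_W = np.zeros_like(W)
for i in range(W.shape[0]):
    for j in range(W.shape[1]):
        W_plus, W_minus = W.copy(), W.copy()
        W_plus[i, j] += eps
        W_minus[i, j] -= eps
        num_grad_W[i, j] = (loss(W_plus, b, x, t) - loss(W_minus, b, x, t)) / (2 * eps)

print(np.max(np.abs(grad_W - num_grad_W)))   # should be close to zero
```

A finite-difference check like this is a common way to validate a hand-written backward pass before trusting it in training.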

Weight Updates

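  • GD variants: W ← W - η ∂L/∂W, where η is the learning rate
  • Adam: Adaptive per-parameter step sizes that combine momentum (a running mean of gradients) with RMSprop-style scaling (a running mean of squared gradients)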
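Both update rules can be sketched in a few lines of NumPy. The functions `gd_step` and `adam_step` are hypothetical names for illustration, not ToyNet's optimizer API. The Adam hyperparameter defaults are the commonly used ones and are assumed here, not taken from ToyNet.

```python
import numpy as np


def gd_step(W: np.ndarray, grad: np.ndarray, lr: float = 0.1) -> np.ndarray:
    """Plain gradient descent: W <- W - lr * grad."""
    return W - lr * grad


def adam_step(W, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad         # running mean of gradients (momentum)
    v = beta2 * v + (1 - beta2) * grad ** 2    # running mean of squared gradients (RMSprop)
    m_hat = m / (1 - beta1 ** t)               # bias correction for the zero initialization
    v_hat = v / (1 - beta2 ** t)
    W = W - lr * m_hat / (np.sqrt(v_hat) + eps)
    return W, m, v


# Toy problem: minimize f(W) = sum(W**2), whose gradient is 2W.
W_start = np.array([1.0, -2.0])
W_gd = W_start.copy()
W_adam = W_start.copy()
m = np.zeros_like(W_adam)
v = np.zeros_like(W_adam)

for t in range(1, 301):
    W_gd = gd_step(W_gd, 2 * W_gd, lr=0.1)
    W_adam, m, v = adam_step(W_adam, 2 * W_adam, m, v, t, lr=0.05)

print("start loss:", np.sum(W_start ** 2))
print("GD loss:   ", np.sum(W_gd ** 2))
print("Adam loss: ", np.sum(W_adam ** 2))
```

Both optimizers drive the loss toward zero on this toy problem. Adam keeps two extra state arrays (`m` and `v`) per parameter.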

Development

Running tests and code checks

After cloning the repository and setting up a virtual environment:

# Install development dependencies
pip install -e ".[dev]"

# Run all tests with coverage
pytest --cov=toynet --cov-report=html

# Type checking
mypy --install-types --config-file pyproject.toml ./src

# Code formatting
ruff format
ruff check --fix

Project Structure

toynet/
├── src/toynet/              # Main package
│   ├── data_loaders/        # Data loading utilities
│   ├── functions/           # Activation and loss functions
│   ├── initializers/        # Weight initialization
│   ├── layers/              # Layer implementations
│   ├── networks/            # Neural network architectures
│   ├── optimizers/          # Gradient descent algorithms
│   ├── policies/            # Training policies
│   └── config.py            # Configuration settings
│
└── tests/                   # Test suite
    ├── uts/                 # Unit tests
    ├── integration/         # Integration tests
    └── e2e/                 # End-to-end tests

License

This project is licensed under the MIT License - see the LICENSE file for details.
