An educational but robust deep learning framework implemented from scratch using NumPy.
Data loading is handled with Pandas.
ToyNet provides a clean, extensible architecture for building and training neural networks without external ML dependencies. It is perfect for understanding the mathematical fundamentals of deep learning or for experimenting with custom architectures on small to medium datasets.
## Table of Contents

- Features
- Installation
- Quick Start
- Architecture Overview
- Comprehensive Example
- Mathematical Foundations
- Development
- License
## Features

- Pure NumPy implementation - No external ML libraries, transparent mathematical operations
- Educational focus - Clear, readable code designed for learning and understanding
- Modular architecture - Easily extensible with custom components
- Type safety - Full mypy compatibility with comprehensive type hints
- Comprehensive tests - Unit, integration, and end-to-end tests with high coverage
- Data loaders - Support for CSV files and in-memory arrays
- Preprocessing - Configurable feature scaling and train/validation splits built into the data loaders
- Batch processing - Efficient mini-batch training with configurable batch sizes
- Layers - Customizable activation functions and weight initialization methods
- Activation and loss functions - Classic activations and losses for various tasks, plus protocols for defining custom ones (see the sketch after this list)
- Optimizers - Classic gradient descent variants or Adam with adaptive learning rates
- Training policies - Early stopping, learning rate scheduling, model checkpointing, and a protocol for custom policies
- Model persistence - Save/load trained models in NumPy format
- Logging - Comprehensive training progress tracking
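As a taste of extensibility, here is a minimal sketch of a custom activation. The actual protocol is defined in `toynet.functions`; the static `forward`/`derivative` method names below are assumptions for illustration, not ToyNet's confirmed API:

```python
import numpy as np

# Hypothetical custom activation. ToyNet's real activation protocol lives in
# toynet.functions and may use different method names; this only sketches the idea.
class LeakyReLU:
    """Leaky ReLU: pass positives through, scale negatives by a small slope."""

    @staticmethod
    def forward(x: np.ndarray) -> np.ndarray:
        # Elementwise: x for positive inputs, 0.01 * x otherwise
        return np.where(x > 0, x, 0.01 * x)

    @staticmethod
    def derivative(x: np.ndarray) -> np.ndarray:
        # Elementwise gradient: 1 for positive inputs, 0.01 otherwise
        return np.where(x > 0, 1.0, 0.01)
```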
## Installation

```bash
pip install toynet-ml
```

## Quick Start

### XOR classification

```python
import numpy as np
from toynet import MultiLayerPerceptron, Dense, BasicDataLoader
from toynet.functions import ReLU, Sigmoid, BinaryCrossEntropy
from toynet.optimizers import Adam

# Create XOR dataset
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]]).reshape(4, 1, 2)
y = np.array([[0], [1], [1], [0]]).reshape(4, 1, 1)

# Build network
network = MultiLayerPerceptron(
    layers=[
        Dense(2, 8, ReLU),
        Dense(8, 1, Sigmoid)
    ],
    loss_function=BinaryCrossEntropy,
    optimizer=Adam(learning_rate=0.01)
)

# Train
data_loader = BasicDataLoader(X, y, batch_size=2)
network.train(data_loader, epochs=1000)

# Make batch predictions
predictions = network(X)
predictions_rounded = np.round(predictions, 2)
print(f"Input: \n{X.reshape(4, 2)}")
print(f"Predictions: {predictions_rounded.flatten()}")
```

### Regression

```python
import numpy as np
from toynet import MultiLayerPerceptron, Dense, BasicDataLoader
from toynet.functions import ReLU, Identity, MeanSquaredError
from toynet.optimizers import Adam
# Create regression dataset
X = np.array([[-4], [-3], [-2], [-1], [0], [1], [2], [3], [4], [5], [6]]).reshape(11, 1, 1)
y = np.array([[-7], [-5], [-3], [-1], [1], [3], [5], [7], [9], [11], [13]]).reshape(11, 1, 1)
# Build network
network = MultiLayerPerceptron(
    layers=[
        Dense(1, 8, ReLU),
        Dense(8, 8, ReLU),
        Dense(8, 1, Identity)
    ],
    loss_function=MeanSquaredError,
    optimizer=Adam(learning_rate=0.01)
)
# Train
data_loader = BasicDataLoader(X, y, batch_size=4)
network.train(data_loader, epochs=1000)
# Make unseen data prediction
prediction = network(np.array([9]).reshape(1, 1)) # 2x + 1 = 19
print(f"Predictions: {prediction.flatten()}") ToyNet follows a modular, object-oriented design:
MultiLayerPerceptron
├── Layers (Dense)
│   ├── Input/Output dimensions
│   ├── Activation Functions (ReLU, Sigmoid, etc.)
│   └── Weight Initializers (He (default), Xavier)
├── Loss Function (CrossEntropy, MSE)
└── Optimizer (GD variants, Adam)

Training
├── Data Loader (Basic, CSV)
├── Fixed epochs
└── Training Policies (EarlyStopping, LR Scheduling, etc.)
```
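Training policies hook into the training loop to observe progress and optionally adjust or stop training. A hypothetical custom policy might look like the sketch below; the real protocol is defined in `toynet.policies`, and the `on_epoch_end` hook name and signature here are assumptions for illustration:

```python
# Hypothetical policy sketch; ToyNet's actual policy protocol lives in
# toynet.policies and may differ in hook names and signatures.
class LossLogger:
    """Record the training loss reported after every epoch."""

    def __init__(self) -> None:
        self.history: list[float] = []

    def on_epoch_end(self, epoch: int, train_loss: float) -> bool:
        self.history.append(train_loss)
        return False  # returning False means "do not stop training"
```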
## Comprehensive Example

An MNIST-style digit classifier trained from a Kaggle CSV (`train.csv`). The `transform` callable scales pixel values to [0, 1] and one-hot encodes the digit labels:

```python
import numpy as np
from toynet import Adam, CSVDataloader, Dense, MultiLayerPerceptron
from toynet.functions import (
    CategoricalCrossEntropy,
    ReLU,
    Softmax,
)
from toynet.policies import ReduceLROnPlateau, SaveBestModel, ValidationLossEarlyStop
if __name__ == "__main__":
    data_loader = CSVDataloader(
        "train.csv",
        batch_size=128,
        label_cols=["label"],
        validation_split=0.2,
        transform=lambda X, y: (X / 255.0, np.eye(10)[y.astype(int)]),
    )
    nnet = MultiLayerPerceptron(
        [
            Dense(784, 256, ReLU),
            Dense(256, 128, ReLU),
            Dense(128, 64, ReLU),
            Dense(64, 10, Softmax),
        ],
        loss_function=CategoricalCrossEntropy,
        optimizer=Adam(
            learning_rate=0.01,
        ),
    )
    nnet.train(
        data_loader,
        epochs=250,
        policies=[
            ValidationLossEarlyStop("mnist.npz", patience=6),
            ReduceLROnPlateau(factor=0.1, patience=4, min_lr=1e-6),
            SaveBestModel("mnist_best_checkpoint.npz", save_grace_period=20),
        ],
    )
```

Results:

- Training time: ~40 minutes on a modern CPU
- Best Kaggle test accuracy achieved: 96.6%
For production workloads, use PyTorch or TensorFlow, which offer GPU acceleration and distributed training.
## Mathematical Foundations

ToyNet implements the core neural network mathematics from scratch.

### Forward propagation

Each layer computes:

h = σ(Wx + b)
Where:
- W: weight matrix
- x: input vector
- b: bias vector
- σ: activation function
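As a plain NumPy illustration of that formula (not ToyNet's internal code), a dense layer's forward pass is a single affine transform followed by an elementwise nonlinearity:

```python
import numpy as np

def dense_forward(W, b, x, activation):
    """Compute h = activation(W @ x + b) for one dense layer."""
    z = W @ x + b          # affine pre-activation
    return activation(z)   # elementwise nonlinearity

# Example: an 8-unit layer on a 2-feature input with ReLU
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 2))
b = np.zeros(8)
x = np.array([0.5, -1.0])
h = dense_forward(W, b, x, lambda z: np.maximum(z, 0.0))
```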
### Backpropagation

Gradients are computed automatically using the chain rule:
∂L/∂W = ∂L/∂h × ∂h/∂z × ∂z/∂W
Each layer computes its local gradients during the backward pass and stores them for the optimizer to apply.
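A minimal NumPy sketch of that chain rule for one dense layer with an elementwise activation (illustrative only; these function names are not ToyNet's API):

```python
import numpy as np

def dense_backward(W, x, z, grad_h, activation_derivative):
    """Backprop through h = activation(z), z = W @ x + b.

    grad_h is dL/dh flowing in from the next layer.
    Returns (dL/dW, dL/db, dL/dx).
    """
    grad_z = grad_h * activation_derivative(z)  # dL/dz = dL/dh * dh/dz
    grad_W = np.outer(grad_z, x)                # dL/dW = dL/dz * dz/dW
    grad_b = grad_z                             # dz/db is the identity
    grad_x = W.T @ grad_z                       # gradient passed to the previous layer
    return grad_W, grad_b, grad_x
```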
### Optimization

- GD variants: W ← W - η∇W
- Adam: adaptive per-parameter learning rates combining momentum and RMSprop-style second-moment scaling
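For reference, a single Adam parameter update in plain NumPy looks like this (a sketch of the standard algorithm, not ToyNet's exact implementation):

```python
import numpy as np

def adam_step(W, grad, m, v, t, lr=0.01, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update for parameter W; m, v are running moment estimates, t >= 1."""
    m = b1 * m + (1 - b1) * grad         # first moment (momentum)
    v = b2 * v + (1 - b2) * grad**2      # second moment (RMSprop-style)
    m_hat = m / (1 - b1**t)              # bias correction for early steps
    v_hat = v / (1 - b2**t)
    W = W - lr * m_hat / (np.sqrt(v_hat) + eps)
    return W, m, v
```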
## Development

After cloning the repository and setting up a virtual environment:

```bash
# Install development dependencies
pip install -e ".[dev]"
# Run all tests with coverage
pytest --cov=toynet --cov-report=html
# Type checking
mypy --install-types --config-file pyproject.toml ./src
# Code formatting
ruff format
ruff check --fix
```

### Project structure

```text
toynet/
├── src/toynet/            # Main package
│   ├── data_loaders/      # Data loading utilities
│   ├── functions/         # Activation and loss functions
│   ├── initializers/      # Weight initialization
│   ├── layers/            # Layer implementations
│   ├── networks/          # Neural network architectures
│   ├── optimizers/        # Gradient descent algorithms
│   ├── policies/          # Training policies
│   └── config.py          # Configuration settings
│
└── tests/                 # Test suite
    ├── uts/               # Unit tests
    ├── integration/       # Integration tests
    └── e2e/               # End-to-end tests
```
## License

This project is licensed under the MIT License - see the LICENSE file for details.