Implementation of a generative model using Variational Auto-Decoders trained on Fashion MNIST. Includes latent space learning, t-SNE visualization, and interpolation to demonstrate class separation and generative capabilities.

sagarmandiya/Variational-AutoDecoder

Variational Auto-Decoder for Fashion MNIST

This project implements and experiments with Variational Auto-Decoders (VAD) and related architectures for generative modeling.
The goal is to explore how VADs can learn expressive latent spaces and generate realistic samples, as well as visualize and interpolate between them.

Overview

Variational Auto-Decoders are a variant of Variational Autoencoders (VAEs) that remove the encoder network entirely.
Instead, they directly optimize a learnable latent variable for each training sample, decoded by a shared decoder network.
This approach can still produce an organized latent space while avoiding the overhead of training an encoder.
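
To make the idea concrete, here is a minimal, self-contained sketch (not the repository's code): each sample owns a learnable latent vector, and a deliberately simple linear decoder stands in for the real network, with both optimized jointly by plain gradient descent.

```python
import numpy as np

# Toy auto-decoder: every sample owns a learnable latent vector,
# optimized jointly with a (here: linear) decoder. No encoder exists.
rng = np.random.default_rng(0)
N, D, K = 100, 16, 4                      # samples, data dim, latent dim
X = rng.normal(size=(N, D))               # stand-in for flattened images

Z = rng.normal(scale=0.1, size=(N, K))    # one latent row per sample
W = rng.normal(scale=0.1, size=(K, D))    # linear "decoder" weights

lr, losses = 1.0, []
for _ in range(1000):
    X_hat = Z @ W                         # decode all latents at once
    err = X_hat - X
    losses.append((err ** 2).mean())      # mean squared reconstruction error
    dW = 2 * Z.T @ err / (N * D)          # gradient w.r.t. decoder weights
    dZ = 2 * err @ W.T / (N * D)          # gradient w.r.t. the latents
    W -= lr * dW
    Z -= lr * dZ
```

The same pattern scales up in the project: the linear map becomes a deep decoder, and the manual gradients become an optimizer stepping over both the decoder parameters and the per-sample latent table.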

In this project:

  • An Auto-Decoder (AD) is first implemented to reconstruct images from the Fashion MNIST dataset.
  • The model is extended into a Variational Auto-Decoder (VAD) to encourage a well-structured latent space using KL-divergence regularization.
  • Multiple latent distributions (Gaussian, Laplace, Uniform) are tested.
  • t-SNE visualizations are used to analyze latent space structure.
  • Interpolation experiments are conducted to demonstrate smooth transitions between classes.

Features

  • Auto-Decoder Implementation — learns a latent vector per sample, optimized jointly with the decoder.
  • Variational Auto-Decoder Implementation — latent variables modeled as parameterized distributions.
  • Support for Multiple Priors — Gaussian, Laplace, and Uniform distributions using the reparameterization trick.
  • t-SNE Visualization — shows how latent space structure improves with variational training.
  • Latent Interpolation — generates hybrid images between two distinct samples.
  • Evaluation Pipeline — measures reconstruction quality on train/test sets.
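
The multi-prior support rests on reparameterized sampling: noise is drawn from a fixed base distribution and then shifted and scaled by learnable parameters, so gradients can flow into those parameters. A hedged sketch (the function name and signature are illustrative, not the repository's API):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_latent(mu, log_scale, dist="gaussian", rng=rng):
    """Reparameterized sample: z = mu + scale * eps, where eps comes
    from a fixed base distribution with no learnable parameters."""
    scale = np.exp(log_scale)
    if dist == "gaussian":
        eps = rng.standard_normal(mu.shape)
    elif dist == "laplace":
        # Inverse-CDF sampling of a unit Laplace from Uniform(-0.5, 0.5)
        u = rng.uniform(-0.5, 0.5, size=mu.shape)
        eps = -np.sign(u) * np.log1p(-2 * np.abs(u))
    elif dist == "uniform":
        eps = rng.uniform(-1.0, 1.0, size=mu.shape)
    else:
        raise ValueError(f"unknown prior: {dist}")
    return mu + scale * eps

z = sample_latent(np.zeros((8, 32)), np.zeros((8, 32)), dist="laplace")
```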

Dataset

The project uses a 1,000-sample subset of the Fashion MNIST dataset, containing 28×28 grayscale images of clothing items (e.g., shirts, shoes, bags).

Project Structure

├── constants.py
├── dataset
│   ├── fashion-mnist_train.csv
│   └── fashion-mnist_test.csv
├── evaluate.py
├── main.py
├── training.py
├── Utils
│   ├── TrainingUtils.py
│   └── utils.py
├── VariationalAutoDecoder.py
└── README.md


⚙️ Installation

  1. Clone the repository:
     git clone https://github.com/yourusername/vad-class-separation.git
     cd VARIATIONAL_AUTODECODER
  2. Install dependencies:
     pip install -r requirements.txt
  3. Prepare the dataset:
     Place fashion-mnist_train.csv and fashion-mnist_test.csv inside the dataset/ folder.

🖥️ Usage

Train the model:

python main.py --data_path dataset --epochs 200

t-SNE Visualization:
After training, the following plots will be generated:

  • tsne_train_plot.png
  • tsne_test_plot.png
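
These plots come from the training pipeline; a minimal stand-alone sketch of the same projection with scikit-learn (random vectors standing in for the learned latents) could look like:

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
# Stand-in for the learned latent table: 50 latent vectors of dim 32
latents = rng.normal(size=(50, 32))

# Project to 2-D; perplexity must be smaller than the sample count
emb = TSNE(n_components=2, perplexity=10, init="random",
           random_state=0).fit_transform(latents)
# `emb` can then be scattered, colored by class label, and saved,
# e.g. as tsne_train_plot.png
```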

🧠 Model Details

  • Latent Size: Configurable (default: 32)
  • Hidden Layers: Fully-connected with LayerNorm, LeakyReLU, Dropout
  • Loss Function:
    • Reconstruction Loss (MSE)
    • KL Divergence (weighted by β)
    • Class Separation Loss

Example hyperparameters:

LATENT_SIZE = 32
HIDDEN_SIZE = 512
LEARNING_RATE = 1e-4
BATCH_SIZE = 128
BETA = 0.01
CLASS_SEP_WEIGHT = 1.0
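
Putting the pieces together, the combined objective might be sketched as below. The KL term uses the standard closed form for a Gaussian prior; the class-separation term here (spreading class centroids apart in latent space) is a hypothetical formulation for illustration, and the repository's exact loss may differ.

```python
import numpy as np

def vad_loss(x, x_hat, mu, logvar, labels, beta=0.01, sep_weight=1.0):
    # Reconstruction loss: mean squared error
    recon = ((x_hat - x) ** 2).mean()
    # Closed-form KL divergence between N(mu, sigma^2) and N(0, 1)
    kl = -0.5 * (1 + logvar - mu ** 2 - np.exp(logvar)).sum(axis=1).mean()
    # Hypothetical class-separation term: push class centroids apart
    centroids = np.stack([mu[labels == c].mean(axis=0)
                          for c in np.unique(labels)])
    diffs = centroids[:, None, :] - centroids[None, :, :]
    sep = -np.sqrt((diffs ** 2).sum(-1) + 1e-8).mean()
    return recon + beta * kl + sep_weight * sep
```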

📊 Results

  • Train t-SNE: Well-clustered, distinct class groupings.
  • Test t-SNE: Improved separation compared to standard VAE baselines.
  • Latent Interpolation: Produces smooth transitions between classes.

t-SNE plot for Gaussian distribution

t-SNE plot for Laplace distribution
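
Interpolation itself is simple once the latents are learned: walk a straight line between two latent vectors and decode each intermediate point. A sketch (the decoder call is omitted):

```python
import numpy as np

def interpolate_latents(z_a, z_b, steps=8):
    """Linearly interpolate between two latent vectors."""
    alphas = np.linspace(0.0, 1.0, steps)
    return np.stack([(1 - a) * z_a + a * z_b for a in alphas])

path = interpolate_latents(np.zeros(32), np.ones(32), steps=8)
# Decoding each row of `path` yields a smooth morph between the
# two images whose latents were interpolated.
```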


🏆 Applications

  • Latent space analysis
  • Representation learning research
  • Generative model exploration
  • Academic projects or portfolio demonstrations

📜 License

This project is licensed under the MIT License — see the LICENSE file for details.
