CVAE - Attribute-Conditioned Face Generation

This repository contains a PyTorch implementation of a Conditional Variational Autoencoder (CVAE), trained on the CelebA dataset. The model generates 64×64 human face images with controllable semantic attributes.

🎯 Target Attributes

  • Gender: Male / Female
  • Glasses: Present / Absent
  • Beard: Present / Absent (derived from Goatee and No_Beard)
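The Beard condition is not a raw CelebA label; it is derived from the Goatee and No_Beard attributes. A minimal sketch of one plausible derivation (the exact rule used by the repo may differ; `derive_conditions` and its column logic are illustrative assumptions):

```python
import torch

def derive_conditions(attrs, attr_names):
    """Map CelebA's 40 binary attributes (values in {0, 1} after
    torchvision's loading, or {-1, 1} in the raw annotation file)
    to the three conditioning bits used by the model."""
    idx = {name: i for i, name in enumerate(attr_names)}
    male = attrs[:, idx["Male"]]
    glasses = attrs[:, idx["Eyeglasses"]]
    # Assumed derivation: a beard counts as present when Goatee is set
    # or No_Beard is cleared; the repo may combine these differently.
    beard = ((attrs[:, idx["Goatee"]] == 1) | (attrs[:, idx["No_Beard"]] == 0)).long()
    return torch.stack([male, glasses, beard], dim=1).float()
```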

🧠 Model Architecture

Encoder

Encodes the input image x together with the conditions c into a probabilistic latent representation z.

  • Input: RGB image + condition maps (spatially broadcasted)
  • Backbone: Residual downsampling blocks (Conv → ReLU → MaxPool → Conv)
  • Output: μ and log(σ²) vectors parameterizing the latent distribution
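The encoder interface above can be sketched as follows. Channel widths are illustrative and the residual connections are omitted; only the spatial broadcasting of conditions and the (μ, log σ²) heads reflect the description:

```python
import torch
import torch.nn as nn

class EncoderSketch(nn.Module):
    """Minimal encoder sketch: image + spatially broadcast condition
    maps -> (mu, logvar). Layer sizes are assumptions, not the repo's
    actual configuration."""
    def __init__(self, n_cond=3, latent_size=128):
        super().__init__()
        def block(cin, cout):
            # Block shape from the description: Conv -> ReLU -> MaxPool -> Conv
            return nn.Sequential(
                nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(),
                nn.MaxPool2d(2),
                nn.Conv2d(cout, cout, 3, padding=1), nn.ReLU(),
            )
        self.backbone = nn.Sequential(
            block(3 + n_cond, 32),   # 64 -> 32
            block(32, 64),           # 32 -> 16
            block(64, 128),          # 16 -> 8
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.mu = nn.Linear(128, latent_size)
        self.logvar = nn.Linear(128, latent_size)

    def forward(self, x, c):
        # Broadcast each condition bit to a constant H x W plane and
        # concatenate it with the RGB channels.
        maps = c[:, :, None, None].expand(-1, -1, x.shape[2], x.shape[3])
        h = self.backbone(torch.cat([x, maps], dim=1))
        return self.mu(h), self.logvar(h)
```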

Decoder (Hierarchical Gated Conditioning)

Injects attributes at multiple spatial resolutions using Conditioning Paths:

  • Conditioning Path: Each attribute uses a dedicated network taking z and c, with gating mechanism: Out = f(z, c) * σ(g(z, c))
  • Hierarchical Injection:
    • Gender: Low resolution (res0 block) – defines overall facial structure
    • Glasses: Medium resolution (res3 block)
    • Beard: High resolution (res4 block) – fine texture details

Upsampling is performed with custom residual UpSamplingBlocks.
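The gating rule Out = f(z, c) * σ(g(z, c)) on a Conditioning Path can be sketched as below. The linear form of f and g and the tensor shapes are assumptions; in the repo each path's output is injected into the named residual block of the decoder:

```python
import torch
import torch.nn as nn

class ConditioningPath(nn.Module):
    """Sketch of one gated conditioning path:
    out = f(z, c) * sigmoid(g(z, c))."""
    def __init__(self, latent_size, n_cond, out_dim):
        super().__init__()
        self.f = nn.Linear(latent_size + n_cond, out_dim)
        self.g = nn.Linear(latent_size + n_cond, out_dim)

    def forward(self, z, c):
        zc = torch.cat([z, c], dim=1)
        # The sigmoid gate controls how strongly the attribute-specific
        # features are injected at this decoder resolution.
        return self.f(zc) * torch.sigmoid(self.g(zc))
```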

📂 Project Structure

checkpoint/          # Model weights (.pt)
images/              # Generated images during training
result/              # Final images
Utils/
  ├── checkpoint_manager.py  # Save/load utilities
  ├── const.py               # Hyperparameters and paths
  └── image_manager.py       # Image grid utilities
net.py               # CVAE architecture
train.py             # Training script
test.py              # Inference and test grid generation
README.md

⚙️ Setup & Installation

Clone the repository and install the dependencies:

pip install torch torchvision matplotlib tqdm pillow

Dataset: Auto-downloads CelebA to ../data (configurable in const.py). Ensure internet and sufficient disk space.

🚀 Usage

Training

python train.py
  • 300 epochs by default
  • Debug images saved to ./images
  • Checkpoints saved to ./checkpoint
  • Modify hyperparameters (BATCH_SIZE, LEARNING_RATE, LATENT_SIZE) in Utils/const.py

Testing & Generation

python test.py
  • Loads the latest checkpoint from ./checkpoint
  • Generates 8×8 grid of all attribute combinations
  • Saves results to images
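The 8 attribute combinations shown in the grid are simply every assignment of the three binary conditions; they can be enumerated like this (a sketch, not the repo's actual test code):

```python
import itertools
import torch

# One conditioning vector (gender, glasses, beard) per grid row:
# all 2^3 = 8 combinations of the three binary attributes.
combos = torch.tensor(list(itertools.product([0.0, 1.0], repeat=3)))
print(combos.shape)  # torch.Size([8, 3])
```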

📊 Loss Function

Optimizes the VAE objective (ELBO):

L = L_recon(x, x_hat) + β * D_KL(q(z|x,c) || p(z))
  • L_recon: MSE between original and reconstructed images
  • D_KL: Kullback-Leibler divergence to regularize latent space
  • β: Weighting factor (default BETA_KL = 1.0)
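The objective can be written out directly; the reduction choices (per-batch mean of summed terms) are an assumption, while the closed-form KL term is standard for a Gaussian posterior against a standard normal prior:

```python
import torch
import torch.nn.functional as F

def cvae_loss(x, x_hat, mu, logvar, beta=1.0):
    """ELBO-style objective: pixel-space MSE plus beta-weighted
    KL(q(z|x,c) || N(0, I)). beta defaults to the repo's BETA_KL = 1.0."""
    recon = F.mse_loss(x_hat, x, reduction="sum") / x.shape[0]
    # Closed-form KL for a diagonal Gaussian vs. the standard normal,
    # summed over latent dimensions, averaged over the batch.
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp()) / x.shape[0]
    return recon + beta * kl
```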

🖼️ Results

Conditional Generation

Random faces (z ~ N(0, I)) conditioned on 8 attribute combinations. Each row represents a different attribute set.

Result 2
