
QuantDiff: Efficient Mixed-Precision Quantization for Stable Diffusion

A framework for mixed-precision quantization that automatically finds optimal bit-width allocation to minimize computational cost while preserving image quality.

Overview

Uniform quantization severely degrades image quality, while manual mixed-precision configuration is infeasible for models with hundreds of layers. QuantDiff automates this process using a three-phase approach:

  1. FLOPs Analysis - Profile layer-wise computational cost in bit operations (BOPs)
  2. Sensitivity Analysis - Quantify layer sensitivity using FID and CLIP metrics
  3. Optimization - Allocate bits via a greedy Bang-for-Buck algorithm
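The Bang-for-Buck allocation in phase 3 can be sketched as a greedy loop: start every layer at the lowest precision, then repeatedly promote whichever layer buys the most sensitivity reduction per extra BOP, until the budget runs out. This is an illustrative sketch of the general technique, not the repository's actual implementation; the table shapes and names (`sens`, `cost`) are assumptions.

```python
def bang_for_buck(sens, cost, budget, bits=(4, 8, 16)):
    """Greedy bit allocation sketch (not the repo's code).

    sens[layer][b] -- quality penalty of layer at bit-width b (lower = better)
    cost[layer][b] -- BOPs of layer at bit-width b
    budget         -- total BOPs allowed
    """
    # Start every layer at the cheapest precision.
    alloc = {layer: bits[0] for layer in sens}
    total = sum(cost[layer][b] for layer, b in alloc.items())
    while True:
        best, best_ratio = None, 0.0
        for layer, b in alloc.items():
            if b == bits[-1]:          # already at highest precision
                continue
            nb = bits[bits.index(b) + 1]
            gain = sens[layer][b] - sens[layer][nb]   # penalty removed
            extra = cost[layer][nb] - cost[layer][b]  # BOPs added
            if extra <= 0 or total + extra > budget:
                continue
            ratio = gain / extra                      # "bang for buck"
            if ratio > best_ratio:
                best, best_ratio = (layer, nb, extra), ratio
        if best is None:               # no affordable upgrade left
            return alloc
        layer, nb, extra = best
        alloc[layer], total = nb, total + extra
```

With a tight budget the upgrades concentrate on the most sensitive layers, which is the intended cost/quality tradeoff.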

Key features:

  • Automated mixed-precision quantization for Stable Diffusion U-Net
  • Budget-constrained optimization (control cost vs. quality tradeoff)
  • Support for 4/8/16-bit precision levels
  • Comprehensive sensitivity scoring (FID + CLIP metrics)

Paper

The research paper describing this methodology is available at: https://github.com/federicobrancasi/quantdiff-paper

Installation

# Clone the repository
git clone https://github.com/federicobrancasi/quantdiff.git
cd quantdiff

# Install dependencies
pip install -r requirements.txt

Quick Start

Phase 1: Analyze Computational Costs

python main.py --eval_flops --device cuda
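A standard way to compute per-layer BOPs, which this phase presumably follows, is MACs scaled by the product of weight and activation bit-widths; the function below is an illustrative sketch with made-up layer dimensions, not the repository's profiler.

```python
def conv_bops(c_in, c_out, k, h_out, w_out, w_bits=8, a_bits=8):
    """Bit operations for a k x k conv layer: MACs * w_bits * a_bits
    (a common BOPs definition; illustrative, not the repo's code)."""
    macs = c_in * c_out * k * k * h_out * w_out
    return macs * w_bits * a_bits

# Example: dropping a layer from 16-bit to 8-bit weights and
# activations cuts its BOPs by 4x.
full = conv_bops(320, 320, 3, 64, 64, w_bits=16, a_bits=16)
quant = conv_bops(320, 320, 3, 64, 64, w_bits=8, a_bits=8)
```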

Phase 2: Measure Layer Sensitivity

python main.py --eval_sensitivity --device cuda --num_prompts 100
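How the FID and CLIP measurements are folded into one sensitivity score is internal to the framework; one plausible formulation (purely illustrative, with hypothetical normalization constants) treats both a FID increase and a CLIP-similarity drop as raising a layer's sensitivity:

```python
def sensitivity_score(fid_delta, clip_delta, fid_scale=10.0, clip_scale=0.05):
    """Illustrative combined score (not the repo's formula).

    fid_delta  -- change in FID when the layer is quantized (+ = worse)
    clip_delta -- change in CLIP similarity (- = worse)
    The scale constants are hypothetical normalizers.
    """
    quality_loss = max(fid_delta, 0.0) / fid_scale
    alignment_loss = max(-clip_delta, 0.0) / clip_scale
    return quality_loss + alignment_loss
```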

Phase 3: Generate Optimal Mixed-Precision Config

python main.py --optimize_mixed_precision --budget_multiplier 0.5
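The optimizer's output is a per-layer bit assignment. The snippet below shows a hypothetical shape such a config might take (the layer names follow diffusers-style U-Net naming and the format is an assumption, not the repository's actual output):

```python
import json

# Hypothetical mixed-precision config: module name -> bit-width.
# Layer names and structure are illustrative only.
config = {
    "down_blocks.0.resnets.0.conv1": 8,
    "down_blocks.0.attentions.0.proj_in": 4,
    "mid_block.resnets.0.conv2": 16,
}
print(json.dumps(config, indent=2))
```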

Documentation

Detailed documentation is available in the docs/ folder.

Requirements

  • Python 3.8+
  • PyTorch 2.0+
  • CUDA GPU recommended (also supports CPU)

Citation

If you use this software in your research, please cite:

@software{brancasi2026quantdiff,
  title={QuantDiff: Efficient Mixed-Precision Quantization for Stable Diffusion via Sensitivity-Driven Optimization},
  author={Brancasi, Federico and Pierini, Maurizio and Segal, Shai and Janco, Roy and Klempner, Anat and Radiano, Eyal},
  year={2026},
  url={https://github.com/federicobrancasi/quantdiff}
}

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Contributing

Contributions are welcome. Please see CONTRIBUTING.md for guidelines.
