
QuantDiff: Efficient Mixed-Precision Quantization for Stable Diffusion

A framework for mixed-precision quantization that automatically finds optimal bit-width allocation to minimize computational cost while preserving image quality.

Overview

Uniform quantization severely degrades image quality, while manual mixed-precision configuration is infeasible for models with hundreds of layers. QuantDiff automates this process using a three-phase approach:

  1. FLOPs Analysis - Profile layer-wise computational cost in bit operations (BOPs)
  2. Sensitivity Analysis - Quantify layer sensitivity using FID and CLIP metrics
  3. Optimization - Allocate bits via a greedy Bang-for-Buck algorithm
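The Bang-for-Buck allocation in phase 3 can be sketched as a greedy loop: start every layer at the lowest precision, then repeatedly promote whichever layer buys the most sensitivity reduction per extra BOP, until the budget runs out. This is an illustrative sketch of the general technique, not the repository's actual implementation; the table shapes and names (`sens`, `cost`) are assumptions.

```python
def bang_for_buck(sens, cost, budget, bits=(4, 8, 16)):
    """Greedy bit allocation sketch (not the repo's code).

    sens[layer][b] -- quality penalty of layer at bit-width b (lower = better)
    cost[layer][b] -- BOPs of layer at bit-width b
    budget         -- total BOPs allowed
    """
    # Start every layer at the cheapest precision.
    alloc = {layer: bits[0] for layer in sens}
    total = sum(cost[layer][b] for layer, b in alloc.items())
    while True:
        best, best_ratio = None, 0.0
        for layer, b in alloc.items():
            if b == bits[-1]:          # already at highest precision
                continue
            nb = bits[bits.index(b) + 1]
            gain = sens[layer][b] - sens[layer][nb]   # penalty removed
            extra = cost[layer][nb] - cost[layer][b]  # BOPs added
            if extra <= 0 or total + extra > budget:
                continue
            ratio = gain / extra                      # "bang for buck"
            if ratio > best_ratio:
                best, best_ratio = (layer, nb, extra), ratio
        if best is None:               # no affordable upgrade left
            return alloc
        layer, nb, extra = best
        alloc[layer], total = nb, total + extra
```

With a tight budget the upgrades concentrate on the most sensitive layers, which is the intended cost/quality tradeoff.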

Key features:

  • Automated mixed-precision quantization for Stable Diffusion U-Net
  • Budget-constrained optimization (control cost vs. quality tradeoff)
  • Support for 4/8/16-bit precision levels
  • Comprehensive sensitivity scoring (FID + CLIP metrics)

Paper

The research paper describing this methodology is available at: https://github.com/federicobrancasi/quantdiff-paper

Installation

# Clone the repository
git clone https://github.com/federicobrancasi/quantdiff.git
cd quantdiff

# Install dependencies
pip install -r requirements.txt

Quick Start

Phase 1: Analyze Computational Costs

python main.py --eval_flops --device cuda
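A standard way to compute per-layer BOPs, which this phase presumably follows, is MACs scaled by the product of weight and activation bit-widths; the function below is an illustrative sketch with made-up layer dimensions, not the repository's profiler.

```python
def conv_bops(c_in, c_out, k, h_out, w_out, w_bits=8, a_bits=8):
    """Bit operations for a k x k conv layer: MACs * w_bits * a_bits
    (a common BOPs definition; illustrative, not the repo's code)."""
    macs = c_in * c_out * k * k * h_out * w_out
    return macs * w_bits * a_bits

# Example: dropping a layer from 16-bit to 8-bit weights and
# activations cuts its BOPs by 4x.
full = conv_bops(320, 320, 3, 64, 64, w_bits=16, a_bits=16)
quant = conv_bops(320, 320, 3, 64, 64, w_bits=8, a_bits=8)
```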

Phase 2: Measure Layer Sensitivity

python main.py --eval_sensitivity --device cuda --num_prompts 100
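How the FID and CLIP measurements are folded into one sensitivity score is internal to the framework; one plausible formulation (purely illustrative, with hypothetical normalization constants) treats both a FID increase and a CLIP-similarity drop as raising a layer's sensitivity:

```python
def sensitivity_score(fid_delta, clip_delta, fid_scale=10.0, clip_scale=0.05):
    """Illustrative combined score (not the repo's formula).

    fid_delta  -- change in FID when the layer is quantized (+ = worse)
    clip_delta -- change in CLIP similarity (- = worse)
    The scale constants are hypothetical normalizers.
    """
    quality_loss = max(fid_delta, 0.0) / fid_scale
    alignment_loss = max(-clip_delta, 0.0) / clip_scale
    return quality_loss + alignment_loss
```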

Phase 3: Generate Optimal Mixed-Precision Config

python main.py --optimize_mixed_precision --budget_multiplier 0.5
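The optimizer's output is a per-layer bit assignment. The snippet below shows a hypothetical shape such a config might take (the layer names follow diffusers-style U-Net naming and the format is an assumption, not the repository's actual output):

```python
import json

# Hypothetical mixed-precision config: module name -> bit-width.
# Layer names and structure are illustrative only.
config = {
    "down_blocks.0.resnets.0.conv1": 8,
    "down_blocks.0.attentions.0.proj_in": 4,
    "mid_block.resnets.0.conv2": 16,
}
print(json.dumps(config, indent=2))
```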

Documentation

Detailed documentation is available in the docs/ folder.

Requirements

  • Python 3.8+
  • PyTorch 2.0+
  • CUDA GPU recommended (also supports CPU)

Citation

If you use this software in your research, please cite:

@software{brancasi2026quantdiff,
  title={QuantDiff: Efficient Mixed-Precision Quantization for Stable Diffusion via Sensitivity-Driven Optimization},
  author={Brancasi, Federico and Pierini, Maurizio and Segal, Shai and Janco, Roy and Klempner, Anat and Radiano, Eyal},
  year={2026},
  url={https://github.com/federicobrancasi/quantdiff}
}

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Contributing

Contributions are welcome. Please see CONTRIBUTING.md for guidelines.
