Measure which layers are most sensitive to quantization using COCO prompts and image quality metrics (FID and CLIP).
```bash
# Full analysis (all 314 layers)
python main.py --eval_sensitivity --device cuda --sensitivity_bits 4

# Custom configuration
python main.py --eval_sensitivity \
    --device cuda \
    --sensitivity_bits 8 \
    --num_prompts 100 \
    --batch_size 50

# Analyze a subset of layers (for testing)
python main.py --eval_sensitivity --device cuda --max_layers 10
```

| Flag | Type | Default | Description |
|---|---|---|---|
| `--max_layers` | int | None (all) | Maximum layers to analyze |
| `--sensitivity_bits` | int | 4 | Quantization bits for testing (4, 8, or 16) |
| `--num_prompts` | int | 100 | Number of COCO prompts to use |
| `--coco_path` | str | None | Path to COCO prompts file |
| `--prompt_seed` | int | 42 | Random seed for prompt shuffling |
| `--batch_size` | int | 1 | Batch size for generation |
```text
Sensitivity Analysis Summary
============================================================
Model: CompVis/stable-diffusion-v1-4
Quantization: 4-bit
Test prompts: 5
Layers analyzed: 314

Top 10 Most Sensitive Layers:
Rank  Layer Name                                         Sensitivity
-------------------------------------------------------------------
1     conv_in                                            0.9845
2     time_embedding.linear_2                            0.8723
3     down_blocks.0.attentions.0.transformer_blocks...   0.7654
...
```

Results are written to `results/sensitivity_analysis_<model>_<bits>bit.json`.
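For downstream analysis, the result file can be loaded with a few lines of Python. Note this sketch assumes a flat layer-name-to-score JSON layout, which is an assumption about the file format; adjust the keys to match the actual output.

```python
import json

def top_sensitive_layers(path, k=10):
    """Return the k layers with the highest sensitivity scores.

    Assumed (hypothetical) JSON layout: {"layer_name": score, ...}.
    """
    with open(path) as f:
        data = json.load(f)
    # Sort descending by score so the most sensitive layers come first.
    ranked = sorted(data.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:k]
```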
Sensitivity is computed by min-max normalizing two weighted metrics:
- FID (60%): distribution similarity (a higher FID increase means more degradation)
- CLIP (40%): semantic consistency (a larger CLIP-score drop means more degradation)

Scores fall in [0.0, 1.0] after normalization:
- 0.0 - 0.2: Low sensitivity (safe for 4-bit quantization)
- 0.2 - 0.5: Medium sensitivity (consider 8-bit)
- 0.5 - 0.8: High sensitivity (keep at 8-bit or 16-bit)
- 0.8 - 1.0: Critical sensitivity (keep at 16-bit)
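The weighted scoring described above can be sketched as follows. Function and argument names are illustrative, not the tool's actual API.

```python
def min_max(values):
    """Min-max normalize a list of values to [0.0, 1.0]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def sensitivity_scores(fid_deltas, clip_drops, w_fid=0.6, w_clip=0.4):
    """Combine per-layer FID increase and CLIP-score drop into one score.

    A larger FID increase and a larger CLIP drop both indicate more
    degradation, so both are normalized to [0, 1] and weighted
    (60% FID, 40% CLIP) before being summed.
    """
    fid_n = min_max(fid_deltas)
    clip_n = min_max(clip_drops)
    return [w_fid * f + w_clip * c for f, c in zip(fid_n, clip_n)]
```

A layer whose quantization caused the largest FID increase and the largest CLIP drop in the batch scores 1.0; the least affected layer scores 0.0.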
Typical observations across layer types:
- Input/output layers: highly sensitive (conv_in, conv_out)
- Time embeddings: critical for diffusion quality
- Attention layers: variable sensitivity (some robust, others critical)
- Residual blocks: generally moderate sensitivity
Interpreting the scores:
- Higher sensitivity means the layer needs higher precision: keep it at 8-bit or 16-bit
- Lower sensitivity means the layer tolerates lower precision: safe to quantize to 4-bit
- Use the ranking for mixed-precision planning: allocate bits accordingly in Phase 3
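A minimal bit-allocation policy based on the sensitivity ranges above might look like this. This is an illustrative sketch, not the tool's built-in planner; the 0.2-0.8 band is treated conservatively as 8-bit.

```python
def plan_precision(layer_scores):
    """Build a {layer_name: bit_width} plan from normalized sensitivity.

    Thresholds follow the ranges above:
      score <  0.2 -> 4-bit (low sensitivity)
      score <  0.8 -> 8-bit (medium/high sensitivity)
      score >= 0.8 -> 16-bit (critical sensitivity)
    """
    def bits(score):
        if score >= 0.8:
            return 16
        if score >= 0.2:
            return 8
        return 4
    return {name: bits(score) for name, score in layer_scores.items()}
```

The resulting plan can then feed the Phase 3 mixed-precision optimization, e.g. `plan_precision({"conv_in": 0.98, "attn": 0.3, "res": 0.1})`.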
Next steps:
- Review the sensitivity ranking to identify critical layers
- Use these results as input for mixed-precision optimization (Phase 3)
- Try different bit widths to find the sweet spot for your use case