YOLOv7-tiny Selective Quantization — Phase 1

Produces a fully INT8-quantized TFLite model, ready to deploy on the
NXP i.MX 8M Plus NPU through a TFLite delegate.

Folder Structure

phase1/
├── run_pipeline.py              ← run everything in one command
├── config.yaml                  ← all settings
├── requirements.txt
├── yolov7-tiny.pt               ← PUT THIS HERE (download below)
├── yolov7-main/                 ← PUT THIS HERE (from Nick's files)
├── calibration_images/          ← PUT IMAGES HERE (copy from yolov7-main/inference/images/)
├── scripts/
│   ├── utils.py                 ← shared helpers (auto-used, don't run directly)
│   ├── step1_sensitivity.py
│   ├── step2_selective_ptq.py
│   ├── step3_export_onnx.py
│   ├── step4_export_tflite.py
│   └── step5_benchmark.py
├── results/                     ← auto-created
├── quantized_models/            ← auto-created
└── benchmark_reports/           ← auto-created

Setup — Do This Once

1. Install Python packages

pip install -r requirements.txt

Python 3.9, 3.10, or 3.11 recommended.

2. Place yolov7-main folder

Copy the yolov7-main folder (from Nick's files) into phase1/:

phase1/
└── yolov7-main/
    ├── models/
    ├── utils/
    └── ...

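3. Fix one line in yolov7-main (required for PyTorch 2.6+)

PyTorch 2.6 changed the default of torch.load's weights_only argument to True, which refuses to unpickle the full model object stored in the YOLOv7 checkpoint. Open yolov7-main/models/experimental.py and find line ~252: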

ckpt = torch.load(w, map_location=map_location)

Change it to:

ckpt = torch.load(w, map_location=map_location, weights_only=False)

Save and close.
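
If you would rather not edit the file by hand, the same one-line change can be scripted. This is a minimal sketch, not part of the pipeline: the helper name patch_torch_load is hypothetical, and it assumes the line in your copy of experimental.py matches the text shown above exactly. If the line differs, the script changes nothing and you should apply the fix manually.

```python
from pathlib import Path

# Exact text of the original call and of the patched call (from the step above).
OLD = "torch.load(w, map_location=map_location)"
NEW = "torch.load(w, map_location=map_location, weights_only=False)"


def patch_torch_load(path):
    """Apply the one-line fix in place; return True if the file was changed."""
    source_path = Path(path)
    text = source_path.read_text(encoding="utf-8")
    if OLD not in text:
        # Already patched, or the line is written differently in this copy.
        return False
    source_path.write_text(text.replace(OLD, NEW), encoding="utf-8")
    return True


if __name__ == "__main__":
    # Assumes the script is run from inside phase1/.
    target = Path("yolov7-main/models/experimental.py")
    if target.exists():
        print("patched" if patch_torch_load(target) else "nothing to change")
    else:
        print(f"{target} not found; run this from phase1/")
```

Running it a second time is harmless, because the patched line no longer contains the original text.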

4. Download yolov7-tiny.pt weights

# Option A — command line
wget https://github.com/WongKinYiu/yolov7/releases/download/v0.1/yolov7-tiny.pt

# Option B — browser
# Go to: https://github.com/WongKinYiu/yolov7/releases/tag/v0.1
# Download yolov7-tiny.pt
# Place it in phase1/

5. Add calibration images

# Copy the 6 sample images (enough for Phase 1)
cp yolov7-main/inference/images/* calibration_images/
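
Before running the pipeline, it can help to confirm that the setup steps above are complete. The following is a small sketch of such a check, not a script shipped with the repo: check_setup and the list of image suffixes are assumptions made for illustration, and the paths are the ones from the folder structure above.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match your calibration images.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def check_setup(root="."):
    """Return a list of problems found under root; an empty list means ready."""
    root = Path(root)
    problems = []

    if not (root / "yolov7-tiny.pt").is_file():
        problems.append("yolov7-tiny.pt is missing (Setup step 4)")

    if not (root / "yolov7-main" / "models").is_dir():
        problems.append("yolov7-main/models/ is missing (Setup step 2)")

    calib = root / "calibration_images"
    images = []
    if calib.is_dir():
        images = [p for p in calib.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES]
    if not images:
        problems.append("calibration_images/ has no images (Setup step 5)")

    return problems


if __name__ == "__main__":
    issues = check_setup()
    for issue in issues:
        print("MISSING:", issue)
    if not issues:
        print("Setup looks complete.")
```

Run it from inside phase1/ so the relative paths resolve.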

Run the Pipeline

# Make sure you are in the phase1/ folder
cd phase1

# Run all 5 steps
python run_pipeline.py

# Or run from a specific step (useful if one step fails)
python run_pipeline.py 2    # start from step 2
python run_pipeline.py 4    # start from step 4 (TFLite)

Or run steps individually:

python scripts/step1_sensitivity.py
python scripts/step2_selective_ptq.py
python scripts/step3_export_onnx.py
python scripts/step4_export_tflite.py
python scripts/step5_benchmark.py
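
The "start from step N" behaviour can be pictured as slicing an ordered list of scripts and running each one in turn. The real run_pipeline.py is not shown in this README, so treat the following as a sketch under that assumption; scripts_from and run_from are hypothetical names.

```python
import subprocess
import sys

# The five step scripts, in the order listed above.
STEP_SCRIPTS = [
    "scripts/step1_sensitivity.py",
    "scripts/step2_selective_ptq.py",
    "scripts/step3_export_onnx.py",
    "scripts/step4_export_tflite.py",
    "scripts/step5_benchmark.py",
]


def scripts_from(start_step=1):
    """Return the scripts to run when starting at the given 1-based step."""
    if not 1 <= start_step <= len(STEP_SCRIPTS):
        raise ValueError(f"start_step must be between 1 and {len(STEP_SCRIPTS)}")
    return STEP_SCRIPTS[start_step - 1:]


def run_from(start_step=1):
    """Run each remaining step with the current interpreter; stop on failure."""
    for script in scripts_from(start_step):
        print(f"running {script}")
        # check=True raises CalledProcessError if a step exits non-zero,
        # so later steps never run on top of a failed one.
        subprocess.run([sys.executable, script], check=True)
```

Under this sketch, run_from(4) corresponds to python run_pipeline.py 4: it runs the TFLite export and then the benchmark.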

What Each Step Does

| Step | Script | Time | Output |
|------|--------|------|--------|
| 1 | step1_sensitivity.py | ~5-10 min | layer_sensitivity_report.csv, selective_quant_plan.csv |
| 2 | step2_selective_ptq.py | ~1 min | yolov7_tiny_selective_int8.pt, ptq_layer_summary.json |
| 3 | step3_export_onnx.py | ~1 min | yolov7_tiny_fp32.onnx |
| 4 | step4_export_tflite.py | ~2-3 min | yolov7_tiny_fp32.tflite, yolov7_tiny_int8.tflite |
| 5 | step5_benchmark.py | ~2 min | benchmark_report.json, benchmark_report.txt |

Final Outputs

results/
├── layer_sensitivity_report.csv   ← every layer scored: keep_fp32 / quantize
├── selective_quant_plan.csv       ← the plan applied in Step 2
├── quantization_summary.json      ← top fragile layers, stats
└── ptq_layer_summary.json         ← which layers were actually quantized

quantized_models/
├── yolov7_tiny_fp32.onnx          ← FP32 ONNX
├── yolov7_tiny_fp32.tflite        ← FP32 TFLite (baseline)
└── yolov7_tiny_int8.tflite        ← INT8 TFLite — deploy this on NXP

benchmark_reports/
├── benchmark_report.json
└── benchmark_report.txt           ← human-readable before/after summary

Reading the Results

layer_sensitivity_report.csv

| Column | What it means |
|--------|---------------|
| `recommendation` | `keep_fp32` = too fragile to quantize / `quantize_candidate` = safe to quantize |
| `proxy_cosine` | How similar the layer output is after fake-quantizing. 1.0 = no change |
| `proxy_relative_mae` | Relative mean absolute error of the output after quantization. Below 2% = safe |
| `sensitivity_score` | Higher = more dangerous to quantize |
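
To make the two proxy columns concrete, here is a pure-Python sketch of how such metrics can be computed. The exact formulas used by step1_sensitivity.py are not given in this README, so the definitions below are assumptions: symmetric per-tensor INT8 fake quantization, cosine similarity between the original and round-tripped values, and relative MAE as mean absolute error divided by mean absolute value. The function names are hypothetical.

```python
import math


def fake_quantize_int8(values):
    """Symmetric per-tensor INT8 round trip: float -> int8 grid -> float."""
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return list(values)
    scale = max_abs / 127.0
    # Round to the nearest grid point, clamp to [-127, 127], then dequantize.
    return [max(-127, min(127, round(v / scale))) * scale for v in values]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 1.0


def relative_mae(reference, approx):
    """Mean absolute error divided by the mean absolute reference value."""
    err = sum(abs(r - q) for r, q in zip(reference, approx)) / len(reference)
    mag = sum(abs(r) for r in reference) / len(reference)
    return err / mag if mag else 0.0


# Toy stand-in for one layer's output (an assumption, not real activations).
activations = [math.sin(0.1 * i) for i in range(200)]
quantized = fake_quantize_int8(activations)
print("proxy_cosine      :", cosine_similarity(activations, quantized))
print("proxy_relative_mae:", relative_mae(activations, quantized))
```

A smooth, well-scaled signal like this one survives the round trip with little error. Layers whose outputs have a few large outliers lose more resolution under a single per-tensor scale, which is the kind of layer a `keep_fp32` recommendation is meant to protect.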

benchmark_report.txt

Size reduction: how much smaller the INT8 TFLite model is than the FP32 one
Latency: measured on your workstation CPU. Expect lower latency on the NXP NPU
Output cosine similarity: proxy for output quality. 0.999+ = near-identical
Detection delta: difference in the number of detections between the FP32 and INT8 models
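
Two of these figures are simple arithmetic and can be checked by hand. The sketch below assumes size reduction is reported as a percentage of the FP32 file size and detection delta as INT8 count minus FP32 count; the way step5_benchmark.py actually computes them is not shown here, and both function names are hypothetical.

```python
import os


def size_reduction_percent(fp32_path, int8_path):
    """Percentage by which the INT8 file is smaller than the FP32 file."""
    fp32_bytes = os.path.getsize(fp32_path)
    int8_bytes = os.path.getsize(int8_path)
    return 100.0 * (1.0 - int8_bytes / fp32_bytes)


def detection_delta(fp32_count, int8_count):
    """Signed change in detection count when moving from FP32 to INT8."""
    return int8_count - fp32_count
```

For example, size_reduction_percent("quantized_models/yolov7_tiny_fp32.tflite", "quantized_models/yolov7_tiny_int8.tflite") gives the reduction for the two models produced in Step 4. A detection delta of 0 only says the counts match; it does not guarantee that the boxes themselves agree.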

If TensorFlow Won't Install (Step 4)

TensorFlow is a large package. If you can't install it locally, run Step 4 on Google Colab:

1. Upload quantized_models/yolov7_tiny_fp32.onnx and calibration_images/ to Colab
2. In Colab:

!pip install tensorflow onnx-tf onnx
# upload step4_export_tflite.py and config.yaml
# run it

3. Download yolov7_tiny_int8.tflite and place it in quantized_models/
4. Run Step 5 locally

Common Errors

No module named 'models'
→ yolov7_repo_dir in config.yaml is wrong. Must point to the folder containing models/

UnpicklingError or WeightsOnly error on torch.load
→ Apply the weights_only=False fix in Step 3 of Setup above

No images found in calibration_images/
→ Copy images into phase1/calibration_images/

ONNX not found when running step4
→ Run step3 first

qnnpack backend error on Windows
→ Change backend: "qnnpack" to backend: "fbgemm" in config.yaml
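
The first error above, No module named 'models', comes down to Python not finding the YOLOv7 repo on its import path. How scripts/utils.py resolves yolov7_repo_dir is not shown in this README, so the following is only a sketch of the idea, with a hypothetical helper name: validate that the configured folder really contains models/, then put it on sys.path.

```python
import sys
from pathlib import Path


def add_yolov7_repo_to_path(repo_dir):
    """Put the YOLOv7 repo on sys.path so `import models` can succeed."""
    repo = Path(repo_dir).resolve()
    if not (repo / "models").is_dir():
        # Fail early with a message that points at the config setting.
        raise FileNotFoundError(
            f"{repo} has no models/ folder; check yolov7_repo_dir in config.yaml"
        )
    if str(repo) not in sys.path:
        sys.path.insert(0, str(repo))
    return repo
```

Calling add_yolov7_repo_to_path("yolov7-main") from inside phase1/ is a quick way to test whether the path you put in config.yaml is the right one.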
