[quantization] Introduce wrapper for Qwen3VLForConditionalGeneration#605

Draft
dvsav wants to merge 1 commit into Samsung:main from dvsav:quant_for_conditional_generation

dvsav commented Apr 2, 2026

This change introduces the QuantQwen3VLForConditionalGeneration wrapper to support post-training quantization (PTQ) of the Qwen3VLForConditionalGeneration module.

Why?

Qwen3VLForConditionalGeneration is an essential part of the Qwen model.
Trying to quantize Qwen3VLForConditionalGeneration via PTQ currently raises an exception: "PTQQuantizer: no quantization wrapper for Qwen3VLForConditionalGeneration".

What

This change introduces:

  • Class QuantQwen3VLForConditionalGeneration (tico/quantization/wrapq/wrappers/qwen_vl/quant_for_conditional_generation.py).
  • Unit tests: class TestQuantQwen3VLForConditionalGeneration (test/quantization/wrapq/wrappers/qwen_vl/test_quant_for_conditional_generation.py).
  • New entry in _CORE_MODULES (tico/quantization/wrapq/wrappers/registry.py).
  • Example of Qwen3VLForConditionalGeneration quantization and conversion to Circle (tico/quantization/wrapq/examples/qwen/quantize_for_conditional_generation.py).
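
The registry entry is what removes the exception quoted in the Why? section: PTQ looks up a wrapper class for each module type and fails when none is registered. A minimal, framework-free sketch of that lookup pattern follows. The names `_WRAPPERS`, `register`, and `lookup`, the exception type, and the stand-in classes are hypothetical and only illustrate the idea; they are not TICO's actual registry API.

```python
from typing import Callable, Dict, Type

# Hypothetical registry: maps a float module class to its quantization wrapper.
_WRAPPERS: Dict[Type, Type] = {}


def register(fp_cls: Type) -> Callable[[Type], Type]:
    """Class decorator that records the decorated class as the wrapper for fp_cls."""
    def decorator(wrapper_cls: Type) -> Type:
        _WRAPPERS[fp_cls] = wrapper_cls
        return wrapper_cls
    return decorator


def lookup(fp_cls: Type) -> Type:
    """Return the registered wrapper, or fail with the message quoted in this PR."""
    try:
        return _WRAPPERS[fp_cls]
    except KeyError:
        # The exception type is an assumption; only the message text is from the PR.
        raise NotImplementedError(
            f"PTQQuantizer: no quantization wrapper for {fp_cls.__name__}"
        ) from None


class Qwen3VLForConditionalGeneration:
    """Stand-in for the real model class (assumption, no real weights)."""


def missing_wrapper_message() -> str:
    """Try the lookup and return the error text, or '' if a wrapper exists."""
    try:
        lookup(Qwen3VLForConditionalGeneration)
    except NotImplementedError as exc:
        return str(exc)
    return ""


# Before registration the lookup fails.
message_before = missing_wrapper_message()


@register(Qwen3VLForConditionalGeneration)
class QuantQwen3VLForConditionalGeneration:
    """Stand-in wrapper that simply holds the wrapped float module."""

    def __init__(self, fp_module: Qwen3VLForConditionalGeneration) -> None:
        self.module = fp_module


# After registration the same lookup succeeds.
wrapper_cls = lookup(Qwen3VLForConditionalGeneration)
message_after = missing_wrapper_message()
```

In this sketch the only difference between the failing and the succeeding lookup is the presence of the registry entry, which is the role the new _CORE_MODULES entry plays in this change.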

Unit Tests

Unit test results, run under coverage:

$ coverage run -m pytest test/quantization/wrapq/wrappers/qwen_vl/test_quant_for_conditional_generation.py -v
================================================================== test session starts ==================================================================
platform linux -- Python 3.10.12, pytest-8.4.0, pluggy-1.6.0 -- /home/d.savchenkov/myenv/bin/python3
cachedir: .pytest_cache
rootdir: /home/d.savchenkov/TICO
configfile: pyproject.toml
plugins: anyio-4.12.0, mock-3.15.1, xdist-3.7.0, cov-6.2.1
collected 7 items                                                                                                                                       

test/quantization/wrapq/wrappers/qwen_vl/test_quant_for_conditional_generation.py::TestQuantQwen3VLForConditionalGeneration::test_forward_text_only                   PASSED [ 14%]
test/quantization/wrapq/wrappers/qwen_vl/test_quant_for_conditional_generation.py::TestQuantQwen3VLForConditionalGeneration::test_forward_with_both_images_and_videos PASSED [ 28%]
test/quantization/wrapq/wrappers/qwen_vl/test_quant_for_conditional_generation.py::TestQuantQwen3VLForConditionalGeneration::test_forward_with_images                 PASSED [ 42%]
test/quantization/wrapq/wrappers/qwen_vl/test_quant_for_conditional_generation.py::TestQuantQwen3VLForConditionalGeneration::test_forward_with_videos                 PASSED [ 57%]
test/quantization/wrapq/wrappers/qwen_vl/test_quant_for_conditional_generation.py::TestQuantQwen3VLForConditionalGeneration::test_mode_transitions                    PASSED [ 71%]
test/quantization/wrapq/wrappers/qwen_vl/test_quant_for_conditional_generation.py::TestQuantQwen3VLForConditionalGeneration::test_registration_in_registry            PASSED [ 85%]
test/quantization/wrapq/wrappers/qwen_vl/test_quant_for_conditional_generation.py::TestQuantQwen3VLForConditionalGeneration::test_wraps_submodules                    PASSED [100%]

============================================================= 7 passed, 2 warnings in 8.48s =============================================================

Coverage info (irrelevant files skipped):

$ coverage report -m
Name                                                                           Stmts   Miss  Cover   Missing
------------------------------------------------------------------------------------------------------------
...
tico/quantization/wrapq/wrappers/nn/quant_linear.py                               29      0   100%
...
tico/quantization/wrapq/wrappers/qwen_vl/quant_for_conditional_generation.py      23      0   100%
tico/quantization/wrapq/wrappers/qwen_vl/quant_model.py                          215     52    76%   114, 120, 163, 199, 277-281, 348-363, 427-436, 499-576, 620-625
tico/quantization/wrapq/wrappers/qwen_vl/quant_text_attn.py                      136      5    96%   196-197, 201-203
tico/quantization/wrapq/wrappers/qwen_vl/quant_text_decoder_layer.py              42      0   100%
tico/quantization/wrapq/wrappers/qwen_vl/quant_text_mlp.py                        43      0   100%
tico/quantization/wrapq/wrappers/qwen_vl/quant_text_model.py                     130      8    94%   248, 254-256, 260, 278, 282, 285-286
tico/quantization/wrapq/wrappers/qwen_vl/quant_vision_attn.py                    105      0   100%
tico/quantization/wrapq/wrappers/qwen_vl/quant_vision_block.py                    42      0   100%
tico/quantization/wrapq/wrappers/qwen_vl/quant_vision_mlp.py                      33      0   100%
tico/quantization/wrapq/wrappers/qwen_vl/quant_vision_model.py                   173      6    97%   166, 173, 180, 195, 279, 452
tico/quantization/wrapq/wrappers/qwen_vl/quant_vision_patch_embed.py              25      0   100%
tico/quantization/wrapq/wrappers/qwen_vl/quant_vision_patch_merger.py             36      0   100%
tico/quantization/wrapq/wrappers/registry.py                                      36      1    97%   260
...
------------------------------------------------------------------------------------------------------------
TOTAL                                                                          11720   6838    42%

Script for testing quantization and conversion to Circle

$ python tico/quantization/wrapq/examples/qwen/quantize_for_conditional_generation.py
┌───────────── Quantization Error Summary ─────────────
│ Mean |diff|: 0.022036
│ PEIR       : 16.040346 %
└──────────────────────────────────────────────────────
     ┌───────────────────────────────────────────┐
 0.72┤                                           │
     │                                    •••••  │
 0.48┤                                 •••••••   │
     │                              •••••••      │
     │                          •••••••••        │
 0.24┤                     • ••••••••••          │
     │                     ••••••••••            │
-0.00┤                  •••••••••••              │
     │             •••••••••••• •                │
     │            ••••••••••  •                  │
-0.24┤          •••••••••••                      │
     │      • ••••••••                           │
-0.48┤      ••••••••                             │
     │    ••••••                                 │
     │  ••••                                     │
-0.72┤                                           │
     └┬──────────┬─────────┬──────────┬─────────┬┘
    -0.72      -0.36     -0.00      0.36     0.72 

[QuantCheck] WARNING: 34 nodes without qparam detected (see logs).
Circle model saved as 'qwen3vl_for_conditional_generation.q.circle'
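
For readers unfamiliar with the two numbers in the error summary, here is a rough sketch of how such metrics can be computed. It assumes that PEIR stands for peak-error-to-interval ratio, i.e. the largest absolute difference divided by the value range of the reference output; this description does not define it, so treat the formula as an assumption. The function names and sample values are illustrative only and are not taken from the example script.

```python
from typing import Sequence


def mean_abs_diff(ref: Sequence[float], out: Sequence[float]) -> float:
    """Average of |ref_i - out_i| over all elements."""
    if len(ref) != len(out) or len(ref) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(r - o) for r, o in zip(ref, out)) / len(ref)


def peir(ref: Sequence[float], out: Sequence[float]) -> float:
    """Assumed PEIR: max |ref_i - out_i| divided by (max(ref) - min(ref))."""
    if len(ref) != len(out) or len(ref) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    peak_error = max(abs(r - o) for r, o in zip(ref, out))
    interval = max(ref) - min(ref)
    if interval == 0:
        # Constant reference: the ratio is undefined unless the error is zero.
        return 0.0 if peak_error == 0 else float("inf")
    return peak_error / interval


# Illustrative values only: a float reference and a quantized output.
reference = [-0.5, 0.0, 0.5, 1.0]
quantized = [-0.5, 0.25, 0.5, 1.0]

mad = mean_abs_diff(reference, quantized)
ratio = peir(reference, quantized)
print(f"Mean |diff|: {mad:.6f}")
print(f"PEIR       : {ratio * 100:.6f} %")
```

Under this definition a single outlier can dominate PEIR while barely moving the mean, which would be consistent with a small mean difference appearing alongside a much larger PEIR.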

This change introduces the QuantQwen3VLForConditionalGeneration wrapper to support post-training quantization of the Qwen3VLForConditionalGeneration module.

TICO-DCO-1.0-Signed-off-by: d.savchenkov <d.savchenkov@partner.samsung.com>