Add TinyViT and MobileViT #3
Conversation
Pull request overview
Adds TinyViT and MobileViT model support to the Onnx4Deeploy exporters, focusing on cleaner ONNX graphs by removing dynamic-shape operations and propagating fixed dimensions through transformer components.
Changes:
- Added TinyViT exporter + CLI registration and updated project exports (`__init__.py`).
- Refactored TinyViT and MobileViT PyTorch implementations to use fixed-dimension reshapes and ONNX-friendly attention blocks.
- Updated changelog and Mamba export documentation references.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| onnx4deeploy/models/tinyvit_exporter.py | Adjusts TinyViT defaults and passes fixed batch_size into model creation. |
| onnx4deeploy/models/pytorch_models/tinyvit/tinyvit.py | Introduces fixed-dimension TinyViT implementation details (CLS token handling, fixed attention reshapes). |
| onnx4deeploy/models/pytorch_models/mobilevit/mobilevit.py | Reworks MobileViT blocks/attention to avoid dynamic shapes and uses fixed patch dimensions for reshapes. |
| onnx4deeploy/models/mobilevit_exporter.py | Passes fixed batch_size/img_size to MobileViT variant constructors. |
| onnx4deeploy/models/__init__.py | Exposes TinyViTExporter from the models package. |
| docs/MAMBA_CLEAN_EXPORT.md | Updates Mamba custom-op documentation to point at the current implementation file. |
| Onnx4Deeploy.py | Registers TinyViT variants in the CLI model list and usage examples. |
| CHANGELOG.md | Documents newly added models and ONNX export optimizations. |
```python
self.cls_selector = nn.Parameter(torch.zeros(1, num_patches + 1), requires_grad=False)
self.cls_selector.data[0, 0] = 1.0  # Select only the first token (CLS)
```
cls_selector is stored as an nn.Parameter and initialized via .data[...] = 1.0. Since it's a fixed constant used only for selection, it should be a registered buffer (so it isn't treated like a weight), and the .data assignment should be avoided: either assign inside a `with torch.no_grad():` block, or construct the tensor with the 1.0 already set before registering it. This keeps the state_dict clean and avoids unsafe .data usage.
Suggested change:

```diff
-self.cls_selector = nn.Parameter(torch.zeros(1, num_patches + 1), requires_grad=False)
-self.cls_selector.data[0, 0] = 1.0  # Select only the first token (CLS)
+cls_selector = torch.zeros(1, num_patches + 1)
+cls_selector[0, 0] = 1.0  # Select only the first token (CLS)
+self.register_buffer("cls_selector", cls_selector)
```
```diff
@@ -265,19 +378,21 @@ def tiny_vit_11m(num_classes: int = 1000, img_size: int = 224) -> TinyViT:
         img_size=img_size,
         patch_size=16,
         embed_dim=256,
-        depth=12,
+        depth=1,  # Reduced from 12 for faster testing
         num_heads=4,
         num_classes=num_classes,
+        batch_size=batch_size,
     )
```
tiny_vit_11m() hardcodes depth=1 but is labeled as the ~11M-parameter variant. This no longer corresponds to the intended model configuration. Restore the correct depth for the 11M variant (or make depth configurable and adjust naming/docs accordingly).
```python
# Compute spatial dimensions at each stage (for MobileViT blocks)
# After stem (stride=2):  H/2, W/2 = 128x128
# After mv2_2 (stride=2): H/4, W/4 = 64x64
# After mv2_4 (stride=2): H/8, W/8 = 32x32   <- MobileViT block 1
# After mv2_6 (stride=2): H/16, W/16 = 16x16 <- MobileViT block 2
# After mv2_8 (stride=2): H/32, W/32 = 8x8   <- MobileViT block 3
self.mvit_patch_dims = [
    (self.image_h // 8, self.image_w // 8),    # MobileViT block 1: 32x32
    (self.image_h // 16, self.image_w // 16),  # MobileViT block 2: 16x16
    (self.image_h // 32, self.image_w // 32),  # MobileViT block 3: 8x8
]
```
mvit_patch_dims uses integer division (H//8, H//16, H//32) to derive the fixed spatial sizes used later for reshapes. If image_size is not divisible by 32 (or differs from the assumed downsampling pattern), this will silently compute incorrect patch sizes and cause reshape errors or incorrect exports. Consider validating image_size at init time (e.g., assert H % 32 == 0 and W % 32 == 0, and optionally that H,W are positive) since these are static deployment constraints anyway.
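The validation suggested above can be sketched as a small stdlib-only check (the function name and error messages are hypothetical, not part of the PR):

```python
def validate_image_size(image_h: int, image_w: int, max_stride: int = 32) -> None:
    """Fail fast when the fixed //8, //16, //32 reshapes would silently truncate."""
    if image_h <= 0 or image_w <= 0:
        raise ValueError(f"image size must be positive, got {image_h}x{image_w}")
    if image_h % max_stride != 0 or image_w % max_stride != 0:
        raise ValueError(
            f"image size {image_h}x{image_w} must be divisible by {max_stride} "
            "so mvit_patch_dims matches the real downsampled feature maps"
        )

validate_image_size(256, 256)  # 256 % 32 == 0, so this passes silently
```

Calling this once in `__init__` turns a confusing downstream reshape failure into an immediate, explicit error at construction time.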
```diff
 def forward(self, x: torch.Tensor) -> torch.Tensor:
-    """Forward pass."""
+    """
+    Forward pass with detailed dimension tracking.
+
+    Args:
+        x: Input tensor [B, 3, 256, 256]
+
+    Returns:
+        Output logits [B, num_classes]
```
The forward docstring hardcodes the input as [B, 3, 256, 256], but the constructor accepts image_size and the exporter passes a configurable img_size. To avoid documentation drift, update the docstring to reflect a generic fixed input size (e.g., [B, 3, H, W] with H/W from image_size).
```diff
 def __init__(
     self,
     batch_size: int = 1,
     image_size: Tuple[int, int] = (256, 256),
     num_classes: int = 1000,
-    dims: list = [64, 80, 96],
+    dims: list = [96, 120, 144],
     channels: list = [16, 32, 48, 48, 64, 64, 80, 80, 96, 96, 384],
 ):
```
dims and channels use mutable list defaults in the MobileViT constructor. Even if they aren’t mutated today, this is a common Python footgun because the same list instance is shared across calls. Prefer using tuples, or defaulting to None and creating the lists inside __init__.
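A minimal illustration of the None-default pattern the comment recommends (the class name is a hypothetical stand-in for the real constructor):

```python
from typing import Optional, Sequence

class MobileViTConfig:  # hypothetical stand-in for the real MobileViT __init__
    def __init__(
        self,
        dims: Optional[Sequence[int]] = None,
        channels: Optional[Sequence[int]] = None,
    ):
        # A fresh list is built per instance, so no state is shared across calls.
        self.dims = list(dims) if dims is not None else [96, 120, 144]
        self.channels = (
            list(channels)
            if channels is not None
            else [16, 32, 48, 48, 64, 64, 80, 80, 96, 96, 384]
        )

a, b = MobileViTConfig(), MobileViTConfig()
a.dims.append(0)
assert b.dims == [96, 120, 144]  # b is unaffected by mutating a
```

With a mutable list default, `a` and `b` would share the same list object; the `None` default sidesteps that entirely.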
| "TinyViT-5M": { | ||
| "class": TinyViTExporter, | ||
| "description": "TinyViT-5M (Compact Vision Transformer, ~5M params)", | ||
| "input_shape": "(B, 3, 64, 64)", | ||
| "classes": 10, | ||
| "config": {"variant": "tiny_vit_5m", "img_size": 64, "num_classes": 10}, | ||
| }, | ||
| "TinyViT-11M": { | ||
| "class": TinyViTExporter, | ||
| "description": "TinyViT-11M (Compact Vision Transformer, ~11M params)", | ||
| "input_shape": "(B, 3, 64, 64)", | ||
| "classes": 10, | ||
| "config": {"variant": "tiny_vit_11m", "img_size": 64, "num_classes": 10}, | ||
| }, | ||
| "TinyViT-21M": { | ||
| "class": TinyViTExporter, | ||
| "description": "TinyViT-21M (Compact Vision Transformer, ~21M params)", | ||
| "input_shape": "(B, 3, 64, 64)", | ||
| "classes": 10, | ||
| "config": {"variant": "tiny_vit_21m", "img_size": 64, "num_classes": 10}, | ||
| }, |
The TinyViT-* entries rely on the per-model "config" overrides (variant/img_size/num_classes), but the current generate_model() flow updates exporter.config and then calls exporter.export_inference(), which reloads config via load_config() and discards those overrides. As a result, selecting "TinyViT-11M" / "TinyViT-21M" (and similarly MobileViT-XXS/-S) will still export the exporter defaults. Consider changing the exporter API to honor a pre-set exporter.config (e.g., skip load_config() when self.config is already set) or add a supported way to pass overrides into export_* so these model presets actually take effect.
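One way to honor pre-set overrides is sketched below. The class and method bodies are hypothetical stand-ins (the real exporter API may differ); the point is the guard that only reloads defaults when no preset config was applied:

```python
class ExporterSketch:
    """Hypothetical exporter skeleton illustrating the config-override guard."""

    def __init__(self):
        self.config = None

    def load_config(self):
        # Stand-in for the real load_config(): returns exporter defaults.
        return {"variant": "tiny_vit_5m", "img_size": 224, "num_classes": 1000}

    def export_inference(self):
        # Only reload defaults when the CLI did not pre-set a model preset,
        # so per-model "config" overrides survive into the export.
        if self.config is None:
            self.config = self.load_config()
        return self.config

exporter = ExporterSketch()
exporter.config = {"variant": "tiny_vit_11m", "img_size": 64, "num_classes": 10}
assert exporter.export_inference()["variant"] == "tiny_vit_11m"  # preset survives
```

Without the `if self.config is None` guard, the unconditional `load_config()` call would overwrite the preset, which is exactly the bug described above.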
```python
    self,
    dim: int,
    num_heads: int = 8,
    qkv_bias: bool = False,
```
Attention.__init__ now defaults qkv_bias to False, but Block does not pass this flag, so this silently changes the model’s behavior compared to the previous default (and typical TinyViT/ViT implementations). If the bias removal is only for export cleanliness, consider keeping the default True and making the ONNX-friendly choice explicit via a constructor arg passed from the exporter/config.
Suggested change:

```diff
-    qkv_bias: bool = False,
+    qkv_bias: bool = True,
```
```diff
@@ -244,19 +355,21 @@ def tiny_vit_5m(num_classes: int = 1000, img_size: int = 224) -> TinyViT:
         img_size=img_size,
         patch_size=16,
         embed_dim=192,
-        depth=12,
+        depth=1,  # Reduced from 12 for faster testing
         num_heads=3,
         num_classes=num_classes,
+        batch_size=batch_size,
     )
```
tiny_vit_5m() hardcodes depth=1 while the function name/docstring claims this is the ~5M-parameter TinyViT variant. With depth reduced from 12 to 1, the architecture/parameter count no longer matches the advertised model and will likely break expectations for accuracy and benchmarking. Suggest restoring the canonical depth for the variant (or exposing depth as a configurable argument and renaming this helper to avoid implying it matches the published 5M model).
```diff
@@ -286,7 +401,8 @@ def tiny_vit_21m(num_classes: int = 1000, img_size: int = 224) -> TinyViT:
         img_size=img_size,
         patch_size=16,
         embed_dim=384,
-        depth=12,
+        depth=1,  # Reduced from 12 for faster testing
         num_heads=6,
         num_classes=num_classes,
+        batch_size=batch_size,
     )
```
tiny_vit_21m() hardcodes depth=1 but is labeled as the ~21M-parameter variant. This helper no longer produces the advertised architecture/parameter count. Restore the intended depth for this variant (or make the simplification explicit via naming/config).
```diff
@@ -26,4 +27,5 @@
     "MobileViTExporter",
     "MambaExporter",
     "SleepConViTExporter",
+    "TinyViTExporter",
```
There is existing model-export test coverage under tests/models/ (e.g., test_cct.py) but no analogous tests added for the new/updated MobileViTExporter and the new TinyViTExporter. Adding basic inference export + ONNX Runtime execution tests would help catch shape/variant regressions (especially since these models rely on fixed reshape dimensions).
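A basic export + runtime test along the lines suggested might look like the following. The exporter usage (constructor, config keys, and `export_inference()` returning the .onnx path) is an assumption modeled on the descriptions above, not the verified API; the test skips itself cleanly when the dependencies are unavailable:

```python
import importlib.util
import unittest

# Skip rather than fail when the heavy dependencies are not installed.
HAS_DEPS = all(
    importlib.util.find_spec(m) is not None
    for m in ("torch", "onnxruntime", "onnx4deeploy")
)

@unittest.skipUnless(HAS_DEPS, "torch/onnxruntime/onnx4deeploy not installed")
class TestTinyViTExport(unittest.TestCase):
    def test_inference_export_runs(self):
        import numpy as np
        import onnxruntime as ort
        from onnx4deeploy.models import TinyViTExporter

        # Hypothetical usage: assumes export_inference() honors a pre-set
        # config and returns the path of the exported .onnx file.
        exporter = TinyViTExporter()
        exporter.config = {"variant": "tiny_vit_5m", "img_size": 64, "num_classes": 10}
        onnx_path = exporter.export_inference()

        # Run the exported graph to catch fixed-reshape/variant regressions.
        session = ort.InferenceSession(str(onnx_path))
        x = np.random.randn(1, 3, 64, 64).astype(np.float32)
        (logits,) = session.run(None, {session.get_inputs()[0].name: x})
        self.assertEqual(logits.shape, (1, 10))
```

Parametrizing this over the TinyViT and MobileViT variants would also exercise the per-model config overrides discussed earlier in the review.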
Added
Changed
- `view(-1, ...)` with static `reshape(batch_size, ...)`