microsoft · zhenchaoni · Jun 26, 2026
@@ -21,8 +21,8 @@ $ winml quantize [options]
 |---|---|---|---|---|
 | `--model` | `-m` | path | *(required)* | Input ONNX model file. |
 | `--output` | `-o` | path | `{input}_qdq.onnx` | Output path for the quantized model. |
-| `--task` | | string | — | Task name (e.g., `image-classification`, `text-classification`) used to select a task-appropriate calibration dataset. Pair with `--model-name` so the dataset is preprocessed exactly the way the model expects. Without `--task`, calibration falls back to synthetic random data. |
-| `--model-name` | | string | — | HuggingFace model ID (e.g., `microsoft/resnet-50`) used to load the matching preprocessor/tokenizer for calibration. Only used when `--task` is provided. |
+| `--task` | | string | — | Task name (e.g., `image-classification`, `text-classification`) used to select a task-appropriate calibration dataset. Pair with `--model-id` so the dataset is preprocessed exactly the way the model expects. Without `--task`, calibration falls back to synthetic random data. |
+| `--model-id` | | string | — | HuggingFace model ID (e.g., `microsoft/resnet-50`) used to load the matching preprocessor/tokenizer for calibration. Only used when `--task` is provided. |
 | `--precision` | `-p` | string | `None` | Precision shorthand: `int8`, `int16`, or mixed-precision like `w8a16`. Overridden by explicit `--weight-type` / `--activation-type`. |
 | `--samples` | | integer | `10` | Number of calibration samples used to compute quantization ranges. |
 | `--method` | | choice | `minmax` | Calibration algorithm: `minmax`, `entropy`, or `percentile`. |
@@ -44,14 +44,14 @@ Precision can be set at a coarse level with `--precision` or tuned per tensor
 type with `--weight-type` and `--activation-type`; explicit type flags always
 override `--precision`.
 
-Calibration data is selected from `--task` and `--model-name`. For a supported
+Calibration data is selected from `--task` and `--model-id`. For a supported
 task, a built-in default calibration dataset is loaded and preprocessed through
 the model's own tokenizer or image processor, so the calibration tensors match
 what the model will see at inference time. For an unsupported task — or when
 `--task` is omitted entirely — calibration falls back to synthetic random data
 synthesized from the ONNX input specification. Random-data calibration is fast
 and always works, but the resulting scales are typically less accurate than
-dataset-driven calibration, so always provide `--task` and `--model-name` when
+dataset-driven calibration, so always provide `--task` and `--model-id` when
 the model task is supported.
 
 ## Examples
@@ -79,7 +79,7 @@ Total time: 4.31s
 
 ```bash
 # Task-aware calibration: real samples preprocessed through the model's own image processor
-winml quantize -m resnet50.onnx --task image-classification --model-name microsoft/resnet-50 --samples 128
+winml quantize -m resnet50.onnx --task image-classification --model-id microsoft/resnet-50 --samples 128
 ```
 
 ```bash
@@ -104,7 +104,7 @@ winml quantize -m bert-base-uncased.onnx --precision int16
 
 ## Common pitfalls
 
-- **Calibration uses synthetic random data by default.** Without `--task` and `--model-name`, scales and zero-points are computed from random tensors synthesized from the ONNX input specification — the model never sees realistic activations, so accuracy after quantization can degrade noticeably. Always pass `--task` and `--model-name` for supported tasks (e.g., `--task image-classification --model-name microsoft/resnet-50`) so calibration runs on real samples preprocessed through the model's own tokenizer or image processor.
+- **Calibration uses synthetic random data by default.** Without `--task` and `--model-id`, scales and zero-points are computed from random tensors synthesized from the ONNX input specification — the model never sees realistic activations, so accuracy after quantization can degrade noticeably. Always pass `--task` and `--model-id` for supported tasks (e.g., `--task image-classification --model-id microsoft/resnet-50`) so calibration runs on real samples preprocessed through the model's own tokenizer or image processor.
 - **`--weight-type` / `--activation-type` silently override `--precision`.** If you pass both, the explicit type flags win. Omit `--precision` when setting types explicitly to avoid confusion.
 - **Low sample counts can hurt accuracy.** The default of 10 samples is sufficient for quick testing, but production models typically need 64–256 representative samples for good calibration.
 - **`--per-channel` increases model size.** Per-channel quantization stores a separate scale and zero-point per output channel; this can noticeably inflate the model file size compared to per-tensor mode.
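
The first pitfall above can be illustrated with a small, self-contained sketch of min/max calibration. This is not winml's implementation: the helper names, the uint8 range, and both sample sets are assumptions chosen only to show how a range calibrated on random tensors clips activations it never saw.

```python
import random


def minmax_qparams(samples, qmin=0, qmax=255):
    """Asymmetric uint8 scale and zero-point from an observed value range."""
    lo = min(min(samples), 0.0)  # widen the range so 0.0 is exactly representable
    hi = max(max(samples), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # fall back to 1.0 for constant input
    zero_point = round(qmin - lo / scale)
    return scale, max(qmin, min(qmax, zero_point))


def quantize(x, scale, zero_point, qmin=0, qmax=255):
    """Map a float to the integer grid, clamping values outside the calibrated range."""
    return max(qmin, min(qmax, round(x / scale) + zero_point))


def dequantize(q, scale, zero_point):
    """Map a quantized integer back to its float approximation."""
    return (q - zero_point) * scale


rng = random.Random(0)
# Hypothetical stand-ins: uniform noise in [0, 1) vs. activations spread over roughly [-6, 6].
random_calib = [rng.random() for _ in range(128)]
realistic_calib = [rng.uniform(-6.0, 6.0) for _ in range(128)]

activation = 4.2  # a value the model might produce at inference time
for name, calib in (("random", random_calib), ("realistic", realistic_calib)):
    scale, zp = minmax_qparams(calib)
    restored = dequantize(quantize(activation, scale, zp), scale, zp)
    print(f"{name:9s} scale={scale:.5f} zero_point={zp} 4.2 -> {restored:.3f}")
```

With the range taken from uniform noise, the activation is clamped to the top of the calibrated range; with the wider, representative range it round-trips to within one quantization step.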

@@ -98,7 +98,7 @@ Set to `null` to skip quantization.
 | `per_channel` | `bool` | `false` | Per-channel quantization. |
 | `symmetric` | `bool` | `false` | Symmetric quantization. |
 | `task` | `str \| null` | `null` | Task for dataset-aware calibration. |
-| `model_name` | `str \| null` | `null` | Model ID for calibration dataset resolution. |
+| `model_id` | `str \| null` | `null` | HuggingFace model ID used to load the matching preprocessor/tokenizer for calibration. |
 | `dataset_name` | `str \| null` | `null` | Override calibration dataset. |
 | `distribution` | `str` | `"uniform"` | Random distribution for dummy data. |
 | `seed` | `int \| null` | `null` | Random seed for reproducibility. |
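
For config files written before this rename, a one-off migration might look like the sketch below. It is a minimal example, not part of winml: `rename_model_name` is a hypothetical helper, the `"quantize"` section name is assumed for illustration, and the function assumes no other section uses `model_name` with a different meaning.

```python
import json


def rename_model_name(node):
    """Return a copy of a parsed JSON value with every `model_name` key renamed to `model_id`."""
    if isinstance(node, dict):
        renamed = {}
        for key, value in node.items():
            new_key = "model_id" if key == "model_name" else key
            renamed[new_key] = rename_model_name(value)
        return renamed
    if isinstance(node, list):
        return [rename_model_name(item) for item in node]
    return node  # scalars (str, int, bool, None) pass through unchanged


# Hypothetical config fragment; the "quantize" section name is assumed for illustration.
old_config = json.loads("""
{
  "quantize": {
    "samples": 10,
    "calibration_method": "minmax",
    "task": "text-classification",
    "model_name": "bert-base-uncased"
  },
  "compile": null
}
""")

new_config = rename_model_name(old_config)
print(json.dumps(new_config, indent=2))
```

The function builds a new structure rather than mutating its input, so the original config stays available for comparison before anything is written back to disk.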

@@ -39,7 +39,7 @@ This writes a `WinMLBuildConfig` JSON file to `bert_config.json`. The file captu
     "samples": 10,
     "calibration_method": "minmax",
     "task": "text-classification",
-    "model_name": "bert-base-uncased"
+    "model_id": "bert-base-uncased"
     ... // truncated: per_channel, symmetric, distribution, ...
   },
   "compile": null

@@ -72,7 +72,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "sentence-similarity",
-    "model_name": "BAAI/bge-large-en-v1.5"
+    "model_id": "BAAI/bge-large-en-v1.5"
   },
   "loader": {
     "task": "sentence-similarity",

@@ -60,7 +60,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "feature-extraction",
-    "model_name": "BAAI/bge-m3"
+    "model_id": "BAAI/bge-m3"
   },
   "compile": null,
   "loader": {

@@ -60,7 +60,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "sentence-similarity",
-    "model_name": "BAAI/bge-m3"
+    "model_id": "BAAI/bge-m3"
   },
   "compile": null,
   "loader": {

@@ -60,7 +60,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "fill-mask",
-    "model_name": "FacebookAI/roberta-base"
+    "model_id": "FacebookAI/roberta-base"
   },
   "loader": {
     "task": "fill-mask",

@@ -60,7 +60,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "fill-mask",
-    "model_name": "FacebookAI/roberta-large"
+    "model_id": "FacebookAI/roberta-large"
   },
   "compile": null,
   "loader": {

@@ -60,7 +60,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "fill-mask",
-    "model_name": "FacebookAI/xlm-roberta-base"
+    "model_id": "FacebookAI/xlm-roberta-base"
   },
   "loader": {
     "task": "fill-mask",

@@ -58,7 +58,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "token-classification",
-    "model_name": "Isotonic/distilbert_finetuned_ai4privacy_v2"
+    "model_id": "Isotonic/distilbert_finetuned_ai4privacy_v2"
   },
   "compile": null,
   "loader": {

@@ -48,7 +48,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "image-feature-extraction",
-    "model_name": "StanfordAIMI/dinov2-base-xray-224"
+    "model_id": "StanfordAIMI/dinov2-base-xray-224"
   },
   "compile": null,
   "loader": {

@@ -73,7 +73,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "question-answering",
-    "model_name": "ahotrod/electra_large_discriminator_squad2_512"
+    "model_id": "ahotrod/electra_large_discriminator_squad2_512"
   },
   "compile": null,
   "loader": {

@@ -48,7 +48,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "image-classification",
-    "model_name": "apple/mobilevit-small"
+    "model_id": "apple/mobilevit-small"
   },
   "compile": null,
   "loader": {

@@ -60,7 +60,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "text-classification",
-    "model_name": "cardiffnlp/twitter-roberta-base-sentiment-latest"
+    "model_id": "cardiffnlp/twitter-roberta-base-sentiment-latest"
   },
   "loader": {
     "task": "text-classification",

@@ -72,7 +72,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "token-classification",
-    "model_name": "dbmdz/bert-large-cased-finetuned-conll03-english"
+    "model_id": "dbmdz/bert-large-cased-finetuned-conll03-english"
   },
   "compile": null,
   "loader": {

@@ -75,7 +75,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "question-answering",
-    "model_name": "deepset/bert-large-uncased-whole-word-masking-squad2"
+    "model_id": "deepset/bert-large-uncased-whole-word-masking-squad2"
   },
   "loader": {
     "task": "question-answering",

@@ -63,7 +63,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "question-answering",
-    "model_name": "deepset/roberta-base-squad2"
+    "model_id": "deepset/roberta-base-squad2"
   },
   "loader": {
     "task": "question-answering",

@@ -63,7 +63,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "question-answering",
-    "model_name": "deepset/tinyroberta-squad2"
+    "model_id": "deepset/tinyroberta-squad2"
   },
   "loader": {
     "task": "question-answering",

@@ -48,7 +48,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "image-classification",
-    "model_name": "dima806/fairface_age_image_detection"
+    "model_id": "dima806/fairface_age_image_detection"
   },
   "compile": null,
   "loader": {

@@ -61,7 +61,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "question-answering",
-    "model_name": "distilbert/distilbert-base-cased-distilled-squad"
+    "model_id": "distilbert/distilbert-base-cased-distilled-squad"
   },
   "compile": null,
   "loader": {

@@ -61,7 +61,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "question-answering",
-    "model_name": "distilbert/distilbert-base-uncased-distilled-squad"
+    "model_id": "distilbert/distilbert-base-uncased-distilled-squad"
   },
   "compile": null,
   "loader": {

@@ -58,7 +58,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "text-classification",
-    "model_name": "distilbert/distilbert-base-uncased-finetuned-sst-2-english"
+    "model_id": "distilbert/distilbert-base-uncased-finetuned-sst-2-english"
   },
   "compile": null,
   "loader": {

@@ -58,7 +58,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "fill-mask",
-    "model_name": "distilbert/distilbert-base-uncased"
+    "model_id": "distilbert/distilbert-base-uncased"
   },
   "compile": null,
   "loader": {

@@ -48,7 +48,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "image-feature-extraction",
-    "model_name": "facebook/dino-vitb16"
+    "model_id": "facebook/dino-vitb16"
   },
   "loader": {
     "task": "image-feature-extraction",

@@ -48,7 +48,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "image-feature-extraction",
-    "model_name": "facebook/dino-vits16"
+    "model_id": "facebook/dino-vits16"
   },
   "loader": {
     "task": "image-feature-extraction",

@@ -48,7 +48,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "image-feature-extraction",
-    "model_name": "facebook/dinov2-base"
+    "model_id": "facebook/dinov2-base"
   },
   "loader": {
     "task": "image-feature-extraction",

@@ -48,7 +48,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "image-feature-extraction",
-    "model_name": "facebook/dinov2-large"
+    "model_id": "facebook/dinov2-large"
   },
   "loader": {
     "task": "image-feature-extraction",

@@ -48,7 +48,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "image-feature-extraction",
-    "model_name": "facebook/dinov2-small"
+    "model_id": "facebook/dinov2-small"
   },
   "loader": {
     "task": "image-feature-extraction",

@@ -72,7 +72,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "fill-mask",
-    "model_name": "google-bert/bert-base-multilingual-cased"
+    "model_id": "google-bert/bert-base-multilingual-cased"
   },
   "compile": null,
   "loader": {

@@ -72,7 +72,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "fill-mask",
-    "model_name": "google-bert/bert-base-multilingual-uncased"
+    "model_id": "google-bert/bert-base-multilingual-uncased"
   },
   "loader": {
     "task": "fill-mask",

@@ -72,7 +72,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "fill-mask",
-    "model_name": "google-bert/bert-base-uncased"
+    "model_id": "google-bert/bert-base-uncased"
   },
   "loader": {
     "task": "fill-mask",

@@ -75,7 +75,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "question-answering",
-    "model_name": "google-bert/bert-large-uncased-whole-word-masking-finetuned-squad"
+    "model_id": "google-bert/bert-large-uncased-whole-word-masking-finetuned-squad"
   },
   "compile": null,
   "loader": {

@@ -48,7 +48,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "image-feature-extraction",
-    "model_name": "google/vit-base-patch16-224-in21k"
+    "model_id": "google/vit-base-patch16-224-in21k"
   },
   "loader": {
     "task": "image-feature-extraction",

@@ -48,7 +48,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "image-classification",
-    "model_name": "google/vit-base-patch16-224"
+    "model_id": "google/vit-base-patch16-224"
   },
   "compile": null,
   "loader": {

@@ -60,7 +60,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "zero-shot-classification",
-    "model_name": "joeddav/xlm-roberta-large-xnli"
+    "model_id": "joeddav/xlm-roberta-large-xnli"
   },
   "compile": null,
   "loader": {

@@ -66,7 +66,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "feature-extraction",
-    "model_name": "laion/CLIP-ViT-B-32-laion2B-s34B-b79K"
+    "model_id": "laion/CLIP-ViT-B-32-laion2B-s34B-b79K"
   },
   "loader": {
     "task": "feature-extraction",

@@ -48,7 +48,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "image-segmentation",
-    "model_name": "mattmdjaga/segformer_b2_clothes"
+    "model_id": "mattmdjaga/segformer_b2_clothes"
   },
   "compile": null,
   "loader": {

@@ -48,7 +48,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "image-feature-extraction",
-    "model_name": "microsoft/rad-dino"
+    "model_id": "microsoft/rad-dino"
   },
   "loader": {
     "task": "image-feature-extraction",

@@ -48,7 +48,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "image-classification",
-    "model_name": "microsoft/resnet-18"
+    "model_id": "microsoft/resnet-18"
   },
   "compile": null,
   "loader": {

@@ -48,7 +48,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "image-classification",
-    "model_name": "microsoft/resnet-50"
+    "model_id": "microsoft/resnet-50"
   },
   "compile": null,
   "loader": {