diff --git a/docs/commands/quantize.md b/docs/commands/quantize.md
index 51128a046..8364a5574 100644
--- a/docs/commands/quantize.md
+++ b/docs/commands/quantize.md
@@ -21,8 +21,8 @@ $ winml quantize [options]
 |---|---|---|---|---|
 | `--model` | `-m` | path | *(required)* | Input ONNX model file. |
 | `--output` | `-o` | path | `{input}_qdq.onnx` | Output path for the quantized model. |
-| `--task` | | string | — | Task name (e.g., `image-classification`, `text-classification`) used to select a task-appropriate calibration dataset. Pair with `--model-name` so the dataset is preprocessed exactly the way the model expects. Without `--task`, calibration falls back to synthetic random data. |
-| `--model-name` | | string | — | HuggingFace model ID (e.g., `microsoft/resnet-50`) used to load the matching preprocessor/tokenizer for calibration. Only used when `--task` is provided. |
+| `--task` | | string | — | Task name (e.g., `image-classification`, `text-classification`) used to select a task-appropriate calibration dataset. Pair with `--model-id` so the dataset is preprocessed exactly the way the model expects. Without `--task`, calibration falls back to synthetic random data. |
+| `--model-id` | | string | — | HuggingFace model ID (e.g., `microsoft/resnet-50`) used to load the matching preprocessor/tokenizer for calibration. Only used when `--task` is provided. |
 | `--precision` | `-p` | string | `None` | Precision shorthand: `int8`, `int16`, or mixed-precision like `w8a16`. Overridden by explicit `--weight-type` / `--activation-type`. |
 | `--samples` | | integer | `10` | Number of calibration samples used to compute quantization ranges. |
 | `--method` | | choice | `minmax` | Calibration algorithm: `minmax`, `entropy`, or `percentile`. |
@@ -44,14 +44,14 @@ Precision can be set at a coarse level with `--precision` or tuned per tensor
 type with `--weight-type` and `--activation-type`; explicit type flags always
 override `--precision`.
 
-Calibration data is selected from `--task` and `--model-name`. For a supported
+Calibration data is selected from `--task` and `--model-id`. For a supported
 task, a built-in default calibration dataset is loaded and preprocessed through
 the model's own tokenizer or image processor, so the calibration tensors match
 what the model will see at inference time. For an unsupported task — or when
 `--task` is omitted entirely — calibration falls back to synthetic random data
 synthesized from the ONNX input specification. Random-data calibration is fast
 and always works, but the resulting scales are typically less accurate than
-dataset-driven calibration, so always provide `--task` and `--model-name` when
+dataset-driven calibration, so always provide `--task` and `--model-id` when
 the model task is supported.
 
 ## Examples
@@ -79,7 +79,7 @@ Total time: 4.31s
 
 ```bash
 # Task-aware calibration: real samples preprocessed through the model's own image processor
-winml quantize -m resnet50.onnx --task image-classification --model-name microsoft/resnet-50 --samples 128
+winml quantize -m resnet50.onnx --task image-classification --model-id microsoft/resnet-50 --samples 128
 ```
 
 ```bash
@@ -104,7 +104,7 @@ winml quantize -m bert-base-uncased.onnx --precision int16
 
 ## Common pitfalls
 
-- **Calibration uses synthetic random data by default.** Without `--task` and `--model-name`, scales and zero-points are computed from random tensors synthesized from the ONNX input specification — the model never sees realistic activations, so accuracy after quantization can degrade noticeably. Always pass `--task` and `--model-name` for supported tasks (e.g., `--task image-classification --model-name microsoft/resnet-50`) so calibration runs on real samples preprocessed through the model's own tokenizer or image processor.
+- **Calibration uses synthetic random data by default.** Without `--task` and `--model-id`, scales and zero-points are computed from random tensors synthesized from the ONNX input specification — the model never sees realistic activations, so accuracy after quantization can degrade noticeably. Always pass `--task` and `--model-id` for supported tasks (e.g., `--task image-classification --model-id microsoft/resnet-50`) so calibration runs on real samples preprocessed through the model's own tokenizer or image processor.
 - **`--weight-type` / `--activation-type` silently override `--precision`.** If you pass both, the explicit type flags win. Omit `--precision` when setting types explicitly to avoid confusion.
 - **Low sample counts can hurt accuracy.** The default of 10 samples is sufficient for quick testing, but production models typically need 64–256 representative samples for good calibration.
 - **`--per-channel` increases model size.** Per-channel quantization stores a separate scale and zero-point per output channel; this can noticeably inflate the model file size compared to per-tensor mode.
diff --git a/docs/reference/index.md b/docs/reference/index.md
index 3c57085b3..e6742f0b5 100644
--- a/docs/reference/index.md
+++ b/docs/reference/index.md
@@ -98,7 +98,7 @@ Set to `null` to skip quantization.
 | `per_channel` | `bool` | `false` | Per-channel quantization. |
 | `symmetric` | `bool` | `false` | Symmetric quantization. |
 | `task` | `str \| null` | `null` | Task for dataset-aware calibration. |
-| `model_name` | `str \| null` | `null` | Model ID for calibration dataset resolution. |
+| `model_id` | `str \| null` | `null` | Model ID for calibration dataset resolution. |
 | `dataset_name` | `str \| null` | `null` | Override calibration dataset. |
 | `distribution` | `str` | `"uniform"` | Random distribution for dummy data. |
 | `seed` | `int \| null` | `null` | Random seed for reproducibility. |
diff --git a/docs/samples/bert-config-build.md b/docs/samples/bert-config-build.md
index e3b25c6e3..5f4af4851 100644
--- a/docs/samples/bert-config-build.md
+++ b/docs/samples/bert-config-build.md
@@ -39,7 +39,7 @@ This writes a `WinMLBuildConfig` JSON file to `bert_config.json`. The file captu
     "samples": 10,
     "calibration_method": "minmax",
     "task": "text-classification",
-    "model_name": "bert-base-uncased"
+    "model_id": "bert-base-uncased"
     ... // truncated: per_channel, symmetric, distribution, ...
   },
   "compile": null
diff --git a/examples/recipes/BAAI_bge-large-en-v1.5/sentence-similarity_w8a16_config.json b/examples/recipes/BAAI_bge-large-en-v1.5/sentence-similarity_w8a16_config.json
index 96abec426..c4a92752a 100644
--- a/examples/recipes/BAAI_bge-large-en-v1.5/sentence-similarity_w8a16_config.json
+++ b/examples/recipes/BAAI_bge-large-en-v1.5/sentence-similarity_w8a16_config.json
@@ -72,7 +72,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "sentence-similarity",
-    "model_name": "BAAI/bge-large-en-v1.5"
+    "model_id": "BAAI/bge-large-en-v1.5"
   },
   "loader": {
     "task": "sentence-similarity",
diff --git a/examples/recipes/BAAI_bge-m3/feature-extraction_w8a16_config.json b/examples/recipes/BAAI_bge-m3/feature-extraction_w8a16_config.json
index cc1ad56c9..2f3e0f561 100644
--- a/examples/recipes/BAAI_bge-m3/feature-extraction_w8a16_config.json
+++ b/examples/recipes/BAAI_bge-m3/feature-extraction_w8a16_config.json
@@ -60,7 +60,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "feature-extraction",
-    "model_name": "BAAI/bge-m3"
+    "model_id": "BAAI/bge-m3"
   },
   "compile": null,
   "loader": {
diff --git a/examples/recipes/BAAI_bge-m3/sentence-similarity_w8a16_config.json b/examples/recipes/BAAI_bge-m3/sentence-similarity_w8a16_config.json
index 06a124129..3a731783e 100644
--- a/examples/recipes/BAAI_bge-m3/sentence-similarity_w8a16_config.json
+++ b/examples/recipes/BAAI_bge-m3/sentence-similarity_w8a16_config.json
@@ -60,7 +60,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "sentence-similarity",
-    "model_name": "BAAI/bge-m3"
+    "model_id": "BAAI/bge-m3"
   },
   "compile": null,
   "loader": {
diff --git a/examples/recipes/FacebookAI_roberta-base/fill-mask_w8a16_config.json b/examples/recipes/FacebookAI_roberta-base/fill-mask_w8a16_config.json
index 6ed6b6001..0a54a23b1 100644
--- a/examples/recipes/FacebookAI_roberta-base/fill-mask_w8a16_config.json
+++ b/examples/recipes/FacebookAI_roberta-base/fill-mask_w8a16_config.json
@@ -60,7 +60,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "fill-mask",
-    "model_name": "FacebookAI/roberta-base"
+    "model_id": "FacebookAI/roberta-base"
   },
   "loader": {
     "task": "fill-mask",
diff --git a/examples/recipes/FacebookAI_roberta-large/fill-mask_w8a16_config.json b/examples/recipes/FacebookAI_roberta-large/fill-mask_w8a16_config.json
index 27ba8f9fb..f065bb1c0 100644
--- a/examples/recipes/FacebookAI_roberta-large/fill-mask_w8a16_config.json
+++ b/examples/recipes/FacebookAI_roberta-large/fill-mask_w8a16_config.json
@@ -60,7 +60,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "fill-mask",
-    "model_name": "FacebookAI/roberta-large"
+    "model_id": "FacebookAI/roberta-large"
   },
   "compile": null,
   "loader": {
diff --git a/examples/recipes/FacebookAI_xlm-roberta-base/fill-mask_w8a16_config.json b/examples/recipes/FacebookAI_xlm-roberta-base/fill-mask_w8a16_config.json
index 6a1bdfddd..c1aa66746 100644
--- a/examples/recipes/FacebookAI_xlm-roberta-base/fill-mask_w8a16_config.json
+++ b/examples/recipes/FacebookAI_xlm-roberta-base/fill-mask_w8a16_config.json
@@ -60,7 +60,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "fill-mask",
-    "model_name": "FacebookAI/xlm-roberta-base"
+    "model_id": "FacebookAI/xlm-roberta-base"
   },
   "loader": {
     "task": "fill-mask",
diff --git a/examples/recipes/Isotonic_distilbert_finetuned_ai4privacy_v2/token-classification_w8a16_config.json b/examples/recipes/Isotonic_distilbert_finetuned_ai4privacy_v2/token-classification_w8a16_config.json
index 24755346a..b7ade0e48 100644
--- a/examples/recipes/Isotonic_distilbert_finetuned_ai4privacy_v2/token-classification_w8a16_config.json
+++ b/examples/recipes/Isotonic_distilbert_finetuned_ai4privacy_v2/token-classification_w8a16_config.json
@@ -58,7 +58,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "token-classification",
-    "model_name": "Isotonic/distilbert_finetuned_ai4privacy_v2"
+    "model_id": "Isotonic/distilbert_finetuned_ai4privacy_v2"
   },
   "compile": null,
   "loader": {
diff --git a/examples/recipes/StanfordAIMI_dinov2-base-xray-224/image-feature-extraction_w8a16_config.json b/examples/recipes/StanfordAIMI_dinov2-base-xray-224/image-feature-extraction_w8a16_config.json
index 77a6331c7..58055c4a0 100644
--- a/examples/recipes/StanfordAIMI_dinov2-base-xray-224/image-feature-extraction_w8a16_config.json
+++ b/examples/recipes/StanfordAIMI_dinov2-base-xray-224/image-feature-extraction_w8a16_config.json
@@ -48,7 +48,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "image-feature-extraction",
-    "model_name": "StanfordAIMI/dinov2-base-xray-224"
+    "model_id": "StanfordAIMI/dinov2-base-xray-224"
   },
   "compile": null,
   "loader": {
diff --git a/examples/recipes/ahotrod_electra_large_discriminator_squad2_512/question-answering_w8a16_config.json b/examples/recipes/ahotrod_electra_large_discriminator_squad2_512/question-answering_w8a16_config.json
index 41a80cc35..db0948cdd 100644
--- a/examples/recipes/ahotrod_electra_large_discriminator_squad2_512/question-answering_w8a16_config.json
+++ b/examples/recipes/ahotrod_electra_large_discriminator_squad2_512/question-answering_w8a16_config.json
@@ -73,7 +73,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "question-answering",
-    "model_name": "ahotrod/electra_large_discriminator_squad2_512"
+    "model_id": "ahotrod/electra_large_discriminator_squad2_512"
   },
   "compile": null,
   "loader": {
diff --git a/examples/recipes/apple_mobilevit-small/image-classification_w8a16_config.json b/examples/recipes/apple_mobilevit-small/image-classification_w8a16_config.json
index 034af6e9a..fa07b05de 100644
--- a/examples/recipes/apple_mobilevit-small/image-classification_w8a16_config.json
+++ b/examples/recipes/apple_mobilevit-small/image-classification_w8a16_config.json
@@ -48,7 +48,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "image-classification",
-    "model_name": "apple/mobilevit-small"
+    "model_id": "apple/mobilevit-small"
   },
   "compile": null,
   "loader": {
diff --git a/examples/recipes/cardiffnlp_twitter-roberta-base-sentiment-latest/text-classification_w8a16_config.json b/examples/recipes/cardiffnlp_twitter-roberta-base-sentiment-latest/text-classification_w8a16_config.json
index f6b9ea686..eeb1fff26 100644
--- a/examples/recipes/cardiffnlp_twitter-roberta-base-sentiment-latest/text-classification_w8a16_config.json
+++ b/examples/recipes/cardiffnlp_twitter-roberta-base-sentiment-latest/text-classification_w8a16_config.json
@@ -60,7 +60,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "text-classification",
-    "model_name": "cardiffnlp/twitter-roberta-base-sentiment-latest"
+    "model_id": "cardiffnlp/twitter-roberta-base-sentiment-latest"
   },
   "loader": {
     "task": "text-classification",
diff --git a/examples/recipes/dbmdz_bert-large-cased-finetuned-conll03-english/token-classification_w8a16_config.json b/examples/recipes/dbmdz_bert-large-cased-finetuned-conll03-english/token-classification_w8a16_config.json
index 195a9ddb6..eb73731b9 100644
--- a/examples/recipes/dbmdz_bert-large-cased-finetuned-conll03-english/token-classification_w8a16_config.json
+++ b/examples/recipes/dbmdz_bert-large-cased-finetuned-conll03-english/token-classification_w8a16_config.json
@@ -72,7 +72,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "token-classification",
-    "model_name": "dbmdz/bert-large-cased-finetuned-conll03-english"
+    "model_id": "dbmdz/bert-large-cased-finetuned-conll03-english"
   },
   "compile": null,
   "loader": {
diff --git a/examples/recipes/deepset_bert-large-uncased-whole-word-masking-squad2/question-answering_w8a16_config.json b/examples/recipes/deepset_bert-large-uncased-whole-word-masking-squad2/question-answering_w8a16_config.json
index ff515ccdc..eb65f28cb 100644
--- a/examples/recipes/deepset_bert-large-uncased-whole-word-masking-squad2/question-answering_w8a16_config.json
+++ b/examples/recipes/deepset_bert-large-uncased-whole-word-masking-squad2/question-answering_w8a16_config.json
@@ -75,7 +75,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "question-answering",
-    "model_name": "deepset/bert-large-uncased-whole-word-masking-squad2"
+    "model_id": "deepset/bert-large-uncased-whole-word-masking-squad2"
   },
   "loader": {
     "task": "question-answering",
diff --git a/examples/recipes/deepset_roberta-base-squad2/question-answering_w8a16_config.json b/examples/recipes/deepset_roberta-base-squad2/question-answering_w8a16_config.json
index 53deef516..4a645ee95 100644
--- a/examples/recipes/deepset_roberta-base-squad2/question-answering_w8a16_config.json
+++ b/examples/recipes/deepset_roberta-base-squad2/question-answering_w8a16_config.json
@@ -63,7 +63,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "question-answering",
-    "model_name": "deepset/roberta-base-squad2"
+    "model_id": "deepset/roberta-base-squad2"
   },
   "loader": {
     "task": "question-answering",
diff --git a/examples/recipes/deepset_tinyroberta-squad2/question-answering_w8a16_config.json b/examples/recipes/deepset_tinyroberta-squad2/question-answering_w8a16_config.json
index 38969b7ab..81e01462c 100644
--- a/examples/recipes/deepset_tinyroberta-squad2/question-answering_w8a16_config.json
+++ b/examples/recipes/deepset_tinyroberta-squad2/question-answering_w8a16_config.json
@@ -63,7 +63,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "question-answering",
-    "model_name": "deepset/tinyroberta-squad2"
+    "model_id": "deepset/tinyroberta-squad2"
   },
   "loader": {
     "task": "question-answering",
diff --git a/examples/recipes/dima806_fairface_age_image_detection/image-classification_w8a16_config.json b/examples/recipes/dima806_fairface_age_image_detection/image-classification_w8a16_config.json
index 379bc4caf..882c369ad 100644
--- a/examples/recipes/dima806_fairface_age_image_detection/image-classification_w8a16_config.json
+++ b/examples/recipes/dima806_fairface_age_image_detection/image-classification_w8a16_config.json
@@ -48,7 +48,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "image-classification",
-    "model_name": "dima806/fairface_age_image_detection"
+    "model_id": "dima806/fairface_age_image_detection"
   },
   "compile": null,
   "loader": {
diff --git a/examples/recipes/distilbert_distilbert-base-cased-distilled-squad/question-answering_w8a16_config.json b/examples/recipes/distilbert_distilbert-base-cased-distilled-squad/question-answering_w8a16_config.json
index 784ccc775..3c0656499 100644
--- a/examples/recipes/distilbert_distilbert-base-cased-distilled-squad/question-answering_w8a16_config.json
+++ b/examples/recipes/distilbert_distilbert-base-cased-distilled-squad/question-answering_w8a16_config.json
@@ -61,7 +61,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "question-answering",
-    "model_name": "distilbert/distilbert-base-cased-distilled-squad"
+    "model_id": "distilbert/distilbert-base-cased-distilled-squad"
   },
   "compile": null,
   "loader": {
diff --git a/examples/recipes/distilbert_distilbert-base-uncased-distilled-squad/question-answering_w8a16_config.json b/examples/recipes/distilbert_distilbert-base-uncased-distilled-squad/question-answering_w8a16_config.json
index 23216becf..04d2166cf 100644
--- a/examples/recipes/distilbert_distilbert-base-uncased-distilled-squad/question-answering_w8a16_config.json
+++ b/examples/recipes/distilbert_distilbert-base-uncased-distilled-squad/question-answering_w8a16_config.json
@@ -61,7 +61,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "question-answering",
-    "model_name": "distilbert/distilbert-base-uncased-distilled-squad"
+    "model_id": "distilbert/distilbert-base-uncased-distilled-squad"
   },
   "compile": null,
   "loader": {
diff --git a/examples/recipes/distilbert_distilbert-base-uncased-finetuned-sst-2-english/text-classification_w8a16_config.json b/examples/recipes/distilbert_distilbert-base-uncased-finetuned-sst-2-english/text-classification_w8a16_config.json
index 443f3eb7e..7347853f7 100644
--- a/examples/recipes/distilbert_distilbert-base-uncased-finetuned-sst-2-english/text-classification_w8a16_config.json
+++ b/examples/recipes/distilbert_distilbert-base-uncased-finetuned-sst-2-english/text-classification_w8a16_config.json
@@ -58,7 +58,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "text-classification",
-    "model_name": "distilbert/distilbert-base-uncased-finetuned-sst-2-english"
+    "model_id": "distilbert/distilbert-base-uncased-finetuned-sst-2-english"
   },
   "compile": null,
   "loader": {
diff --git a/examples/recipes/distilbert_distilbert-base-uncased/fill-mask_w8a16_config.json b/examples/recipes/distilbert_distilbert-base-uncased/fill-mask_w8a16_config.json
index 6748df0de..e62590fe1 100644
--- a/examples/recipes/distilbert_distilbert-base-uncased/fill-mask_w8a16_config.json
+++ b/examples/recipes/distilbert_distilbert-base-uncased/fill-mask_w8a16_config.json
@@ -58,7 +58,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "fill-mask",
-    "model_name": "distilbert/distilbert-base-uncased"
+    "model_id": "distilbert/distilbert-base-uncased"
   },
   "compile": null,
   "loader": {
diff --git a/examples/recipes/facebook_dino-vitb16/image-feature-extraction_w8a16_config.json b/examples/recipes/facebook_dino-vitb16/image-feature-extraction_w8a16_config.json
index f750bc8b1..9af2aec35 100644
--- a/examples/recipes/facebook_dino-vitb16/image-feature-extraction_w8a16_config.json
+++ b/examples/recipes/facebook_dino-vitb16/image-feature-extraction_w8a16_config.json
@@ -48,7 +48,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "image-feature-extraction",
-    "model_name": "facebook/dino-vitb16"
+    "model_id": "facebook/dino-vitb16"
   },
   "loader": {
     "task": "image-feature-extraction",
diff --git a/examples/recipes/facebook_dino-vits16/image-feature-extraction_w8a16_config.json b/examples/recipes/facebook_dino-vits16/image-feature-extraction_w8a16_config.json
index 3da2c7432..369890144 100644
--- a/examples/recipes/facebook_dino-vits16/image-feature-extraction_w8a16_config.json
+++ b/examples/recipes/facebook_dino-vits16/image-feature-extraction_w8a16_config.json
@@ -48,7 +48,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "image-feature-extraction",
-    "model_name": "facebook/dino-vits16"
+    "model_id": "facebook/dino-vits16"
   },
   "loader": {
     "task": "image-feature-extraction",
diff --git a/examples/recipes/facebook_dinov2-base/image-feature-extraction_w8a16_config.json b/examples/recipes/facebook_dinov2-base/image-feature-extraction_w8a16_config.json
index 95915049a..a92a5f9e0 100644
--- a/examples/recipes/facebook_dinov2-base/image-feature-extraction_w8a16_config.json
+++ b/examples/recipes/facebook_dinov2-base/image-feature-extraction_w8a16_config.json
@@ -48,7 +48,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "image-feature-extraction",
-    "model_name": "facebook/dinov2-base"
+    "model_id": "facebook/dinov2-base"
   },
   "loader": {
     "task": "image-feature-extraction",
diff --git a/examples/recipes/facebook_dinov2-large/image-feature-extraction_w8a16_config.json b/examples/recipes/facebook_dinov2-large/image-feature-extraction_w8a16_config.json
index 2d2c0022c..f2d726f65 100644
--- a/examples/recipes/facebook_dinov2-large/image-feature-extraction_w8a16_config.json
+++ b/examples/recipes/facebook_dinov2-large/image-feature-extraction_w8a16_config.json
@@ -48,7 +48,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "image-feature-extraction",
-    "model_name": "facebook/dinov2-large"
+    "model_id": "facebook/dinov2-large"
   },
   "loader": {
     "task": "image-feature-extraction",
diff --git a/examples/recipes/facebook_dinov2-small/image-feature-extraction_w8a16_config.json b/examples/recipes/facebook_dinov2-small/image-feature-extraction_w8a16_config.json
index 800258542..3da8acbaf 100644
--- a/examples/recipes/facebook_dinov2-small/image-feature-extraction_w8a16_config.json
+++ b/examples/recipes/facebook_dinov2-small/image-feature-extraction_w8a16_config.json
@@ -48,7 +48,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "image-feature-extraction",
-    "model_name": "facebook/dinov2-small"
+    "model_id": "facebook/dinov2-small"
   },
   "loader": {
     "task": "image-feature-extraction",
diff --git a/examples/recipes/google-bert_bert-base-multilingual-cased/fill-mask_w8a16_config.json b/examples/recipes/google-bert_bert-base-multilingual-cased/fill-mask_w8a16_config.json
index 553074f04..ed49fb222 100644
--- a/examples/recipes/google-bert_bert-base-multilingual-cased/fill-mask_w8a16_config.json
+++ b/examples/recipes/google-bert_bert-base-multilingual-cased/fill-mask_w8a16_config.json
@@ -72,7 +72,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "fill-mask",
-    "model_name": "google-bert/bert-base-multilingual-cased"
+    "model_id": "google-bert/bert-base-multilingual-cased"
   },
   "compile": null,
   "loader": {
diff --git a/examples/recipes/google-bert_bert-base-multilingual-uncased/fill-mask_w8a16_config.json b/examples/recipes/google-bert_bert-base-multilingual-uncased/fill-mask_w8a16_config.json
index 7b71cc76c..838e34748 100644
--- a/examples/recipes/google-bert_bert-base-multilingual-uncased/fill-mask_w8a16_config.json
+++ b/examples/recipes/google-bert_bert-base-multilingual-uncased/fill-mask_w8a16_config.json
@@ -72,7 +72,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "fill-mask",
-    "model_name": "google-bert/bert-base-multilingual-uncased"
+    "model_id": "google-bert/bert-base-multilingual-uncased"
   },
   "loader": {
     "task": "fill-mask",
diff --git a/examples/recipes/google-bert_bert-base-uncased/fill-mask_w8a16_config.json b/examples/recipes/google-bert_bert-base-uncased/fill-mask_w8a16_config.json
index 94669e9e9..de8fd46a1 100644
--- a/examples/recipes/google-bert_bert-base-uncased/fill-mask_w8a16_config.json
+++ b/examples/recipes/google-bert_bert-base-uncased/fill-mask_w8a16_config.json
@@ -72,7 +72,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "fill-mask",
-    "model_name": "google-bert/bert-base-uncased"
+    "model_id": "google-bert/bert-base-uncased"
   },
   "loader": {
     "task": "fill-mask",
diff --git a/examples/recipes/google-bert_bert-large-uncased-whole-word-masking-finetuned-squad/question-answering_w8a16_config.json b/examples/recipes/google-bert_bert-large-uncased-whole-word-masking-finetuned-squad/question-answering_w8a16_config.json
index db79310ba..f306fd2cf 100644
--- a/examples/recipes/google-bert_bert-large-uncased-whole-word-masking-finetuned-squad/question-answering_w8a16_config.json
+++ b/examples/recipes/google-bert_bert-large-uncased-whole-word-masking-finetuned-squad/question-answering_w8a16_config.json
@@ -75,7 +75,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "question-answering",
-    "model_name": "google-bert/bert-large-uncased-whole-word-masking-finetuned-squad"
+    "model_id": "google-bert/bert-large-uncased-whole-word-masking-finetuned-squad"
   },
   "compile": null,
   "loader": {
diff --git a/examples/recipes/google_vit-base-patch16-224-in21k/image-feature-extraction_w8a16_config.json b/examples/recipes/google_vit-base-patch16-224-in21k/image-feature-extraction_w8a16_config.json
index fa44cce05..ddb1416f5 100644
--- a/examples/recipes/google_vit-base-patch16-224-in21k/image-feature-extraction_w8a16_config.json
+++ b/examples/recipes/google_vit-base-patch16-224-in21k/image-feature-extraction_w8a16_config.json
@@ -48,7 +48,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "image-feature-extraction",
-    "model_name": "google/vit-base-patch16-224-in21k"
+    "model_id": "google/vit-base-patch16-224-in21k"
   },
   "loader": {
     "task": "image-feature-extraction",
diff --git a/examples/recipes/google_vit-base-patch16-224/image-classification_w8a16_config.json b/examples/recipes/google_vit-base-patch16-224/image-classification_w8a16_config.json
index d1458e451..f1552eec8 100644
--- a/examples/recipes/google_vit-base-patch16-224/image-classification_w8a16_config.json
+++ b/examples/recipes/google_vit-base-patch16-224/image-classification_w8a16_config.json
@@ -48,7 +48,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "image-classification",
-    "model_name": "google/vit-base-patch16-224"
+    "model_id": "google/vit-base-patch16-224"
   },
   "compile": null,
   "loader": {
diff --git a/examples/recipes/joeddav_xlm-roberta-large-xnli/zero-shot-classification_w8a16_config.json b/examples/recipes/joeddav_xlm-roberta-large-xnli/zero-shot-classification_w8a16_config.json
index 5a1abda9e..4003a5ecf 100644
--- a/examples/recipes/joeddav_xlm-roberta-large-xnli/zero-shot-classification_w8a16_config.json
+++ b/examples/recipes/joeddav_xlm-roberta-large-xnli/zero-shot-classification_w8a16_config.json
@@ -60,7 +60,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "zero-shot-classification",
-    "model_name": "joeddav/xlm-roberta-large-xnli"
+    "model_id": "joeddav/xlm-roberta-large-xnli"
   },
   "compile": null,
   "loader": {
diff --git a/examples/recipes/laion_CLIP-ViT-B-32-laion2B-s34B-b79K/feature-extraction_w8a16_config.json b/examples/recipes/laion_CLIP-ViT-B-32-laion2B-s34B-b79K/feature-extraction_w8a16_config.json
index c3b498a73..9ee9fe027 100644
--- a/examples/recipes/laion_CLIP-ViT-B-32-laion2B-s34B-b79K/feature-extraction_w8a16_config.json
+++ b/examples/recipes/laion_CLIP-ViT-B-32-laion2B-s34B-b79K/feature-extraction_w8a16_config.json
@@ -66,7 +66,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "feature-extraction",
-    "model_name": "laion/CLIP-ViT-B-32-laion2B-s34B-b79K"
+    "model_id": "laion/CLIP-ViT-B-32-laion2B-s34B-b79K"
   },
   "loader": {
     "task": "feature-extraction",
diff --git a/examples/recipes/mattmdjaga_segformer_b2_clothes/image-segmentation_w8a16_config.json b/examples/recipes/mattmdjaga_segformer_b2_clothes/image-segmentation_w8a16_config.json
index 919689732..07490735d 100644
--- a/examples/recipes/mattmdjaga_segformer_b2_clothes/image-segmentation_w8a16_config.json
+++ b/examples/recipes/mattmdjaga_segformer_b2_clothes/image-segmentation_w8a16_config.json
@@ -48,7 +48,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "image-segmentation",
-    "model_name": "mattmdjaga/segformer_b2_clothes"
+    "model_id": "mattmdjaga/segformer_b2_clothes"
   },
   "compile": null,
   "loader": {
diff --git a/examples/recipes/microsoft_rad-dino/image-feature-extraction_w8a16_config.json b/examples/recipes/microsoft_rad-dino/image-feature-extraction_w8a16_config.json
index 5e037c79c..10dc898f5 100644
--- a/examples/recipes/microsoft_rad-dino/image-feature-extraction_w8a16_config.json
+++ b/examples/recipes/microsoft_rad-dino/image-feature-extraction_w8a16_config.json
@@ -48,7 +48,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "image-feature-extraction",
-    "model_name": "microsoft/rad-dino"
+    "model_id": "microsoft/rad-dino"
   },
   "loader": {
     "task": "image-feature-extraction",
diff --git a/examples/recipes/microsoft_resnet-18/image-classification_w8a16_config.json b/examples/recipes/microsoft_resnet-18/image-classification_w8a16_config.json
index 6e2a421c4..25a4e1ab1 100644
--- a/examples/recipes/microsoft_resnet-18/image-classification_w8a16_config.json
+++ b/examples/recipes/microsoft_resnet-18/image-classification_w8a16_config.json
@@ -48,7 +48,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "image-classification",
-    "model_name": "microsoft/resnet-18"
+    "model_id": "microsoft/resnet-18"
   },
   "compile": null,
   "loader": {
diff --git a/examples/recipes/microsoft_resnet-50/image-classification_w8a16_config.json b/examples/recipes/microsoft_resnet-50/image-classification_w8a16_config.json
index 17a0831ac..30ed89f00 100644
--- a/examples/recipes/microsoft_resnet-50/image-classification_w8a16_config.json
+++ b/examples/recipes/microsoft_resnet-50/image-classification_w8a16_config.json
@@ -48,7 +48,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "image-classification",
-    "model_name": "microsoft/resnet-50"
+    "model_id": "microsoft/resnet-50"
   },
   "compile": null,
   "loader": {
diff --git a/examples/recipes/microsoft_swin-large-patch4-window7-224/image-classification_w8a16_config.json b/examples/recipes/microsoft_swin-large-patch4-window7-224/image-classification_w8a16_config.json
index 4f5349f3e..b4f44aeb5 100644
--- a/examples/recipes/microsoft_swin-large-patch4-window7-224/image-classification_w8a16_config.json
+++ b/examples/recipes/microsoft_swin-large-patch4-window7-224/image-classification_w8a16_config.json
@@ -48,7 +48,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "image-classification",
-    "model_name": "microsoft/swin-large-patch4-window7-224"
+    "model_id": "microsoft/swin-large-patch4-window7-224"
   },
   "compile": null,
   "loader": {
diff --git a/examples/recipes/monologg_koelectra-small-v2-distilled-korquad-384/question-answering_w8a16_config.json b/examples/recipes/monologg_koelectra-small-v2-distilled-korquad-384/question-answering_w8a16_config.json
index 4af062dc0..0b9904b22 100644
--- a/examples/recipes/monologg_koelectra-small-v2-distilled-korquad-384/question-answering_w8a16_config.json
+++ b/examples/recipes/monologg_koelectra-small-v2-distilled-korquad-384/question-answering_w8a16_config.json
@@ -73,7 +73,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "question-answering",
-    "model_name": "monologg/koelectra-small-v2-distilled-korquad-384"
+    "model_id": "monologg/koelectra-small-v2-distilled-korquad-384"
   },
   "compile": null,
   "loader": {
diff --git a/examples/recipes/nvidia_segformer-b1-finetuned-ade-512-512/image-segmentation_w8a16_config.json b/examples/recipes/nvidia_segformer-b1-finetuned-ade-512-512/image-segmentation_w8a16_config.json
index 924497cdd..29f4fef13 100644
--- a/examples/recipes/nvidia_segformer-b1-finetuned-ade-512-512/image-segmentation_w8a16_config.json
+++ b/examples/recipes/nvidia_segformer-b1-finetuned-ade-512-512/image-segmentation_w8a16_config.json
@@ -48,7 +48,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "image-segmentation",
-    "model_name": "nvidia/segformer-b1-finetuned-ade-512-512"
+    "model_id": "nvidia/segformer-b1-finetuned-ade-512-512"
   },
   "compile": null,
   "loader": {
diff --git a/examples/recipes/nvidia_segformer-b2-finetuned-ade-512-512/image-segmentation_w8a16_config.json b/examples/recipes/nvidia_segformer-b2-finetuned-ade-512-512/image-segmentation_w8a16_config.json
index 0848f1fc8..b20372d0b 100644
--- a/examples/recipes/nvidia_segformer-b2-finetuned-ade-512-512/image-segmentation_w8a16_config.json
+++ b/examples/recipes/nvidia_segformer-b2-finetuned-ade-512-512/image-segmentation_w8a16_config.json
@@ -48,7 +48,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "image-segmentation",
-    "model_name": "nvidia/segformer-b2-finetuned-ade-512-512"
+    "model_id": "nvidia/segformer-b2-finetuned-ade-512-512"
   },
   "compile": null,
   "loader": {
diff --git a/examples/recipes/nvidia_segformer-b5-finetuned-ade-640-640/image-segmentation_w8a16_config.json b/examples/recipes/nvidia_segformer-b5-finetuned-ade-640-640/image-segmentation_w8a16_config.json
index dd5c11a96..55401dac6 100644
--- a/examples/recipes/nvidia_segformer-b5-finetuned-ade-640-640/image-segmentation_w8a16_config.json
+++ b/examples/recipes/nvidia_segformer-b5-finetuned-ade-640-640/image-segmentation_w8a16_config.json
@@ -48,7 +48,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "image-segmentation",
-    "model_name": "nvidia/segformer-b5-finetuned-ade-640-640"
+    "model_id": "nvidia/segformer-b5-finetuned-ade-640-640"
   },
   "compile": null,
   "loader": {
diff --git a/examples/recipes/openai_clip-vit-base-patch16/feature-extraction_w8a16_config.json b/examples/recipes/openai_clip-vit-base-patch16/feature-extraction_w8a16_config.json
index 5745e9ca0..5cb2623b1 100644
--- a/examples/recipes/openai_clip-vit-base-patch16/feature-extraction_w8a16_config.json
+++ b/examples/recipes/openai_clip-vit-base-patch16/feature-extraction_w8a16_config.json
@@ -66,7 +66,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "feature-extraction",
-    "model_name": "openai/clip-vit-base-patch16"
+    "model_id": "openai/clip-vit-base-patch16"
   },
   "loader": {
     "task": "feature-extraction",
diff --git a/examples/recipes/openai_clip-vit-base-patch32/feature-extraction_w8a16_config.json b/examples/recipes/openai_clip-vit-base-patch32/feature-extraction_w8a16_config.json
index 0dde45b25..ce8c32376 100644
--- a/examples/recipes/openai_clip-vit-base-patch32/feature-extraction_w8a16_config.json
+++ b/examples/recipes/openai_clip-vit-base-patch32/feature-extraction_w8a16_config.json
@@ -66,7 +66,7 @@
     "op_types_to_quantize": null,
     "nodes_to_exclude": null,
     "task": "feature-extraction",
-    "model_name": "openai/clip-vit-base-patch32"
+    "model_id": "openai/clip-vit-base-patch32"
   },
   "loader": {
     "task": "feature-extraction",
diff --git a/examples/recipes/openai_clip-vit-large-patch14-336/zero-shot-image-classification_w8a16_config_image-encoder.json b/examples/recipes/openai_clip-vit-large-patch14-336/zero-shot-image-classification_w8a16_config_image-encoder.json
index c10bff087..54605d483 100644 --- a/examples/recipes/openai_clip-vit-large-patch14-336/zero-shot-image-classification_w8a16_config_image-encoder.json +++ b/examples/recipes/openai_clip-vit-large-patch14-336/zero-shot-image-classification_w8a16_config_image-encoder.json @@ -56,7 +56,7 @@ "op_types_to_quantize": null, "nodes_to_exclude": null, "task": "image-feature-extraction", - "model_name": "openai/clip-vit-large-patch14-336" + "model_id": "openai/clip-vit-large-patch14-336" }, "compile": null, "loader": { diff --git a/examples/recipes/openai_clip-vit-large-patch14-336/zero-shot-image-classification_w8a16_config_text-encoder.json b/examples/recipes/openai_clip-vit-large-patch14-336/zero-shot-image-classification_w8a16_config_text-encoder.json index a3591f781..84e450f6e 100644 --- a/examples/recipes/openai_clip-vit-large-patch14-336/zero-shot-image-classification_w8a16_config_text-encoder.json +++ b/examples/recipes/openai_clip-vit-large-patch14-336/zero-shot-image-classification_w8a16_config_text-encoder.json @@ -66,7 +66,7 @@ "op_types_to_quantize": null, "nodes_to_exclude": null, "task": "feature-extraction", - "model_name": "openai/clip-vit-large-patch14-336" + "model_id": "openai/clip-vit-large-patch14-336" }, "compile": null, "loader": { diff --git a/examples/recipes/openai_clip-vit-large-patch14/zero-shot-image-classification_w8a16_config_image-encoder.json b/examples/recipes/openai_clip-vit-large-patch14/zero-shot-image-classification_w8a16_config_image-encoder.json index e6236da7e..f2b0654a4 100644 --- a/examples/recipes/openai_clip-vit-large-patch14/zero-shot-image-classification_w8a16_config_image-encoder.json +++ b/examples/recipes/openai_clip-vit-large-patch14/zero-shot-image-classification_w8a16_config_image-encoder.json @@ -56,7 +56,7 @@ "op_types_to_quantize": null, "nodes_to_exclude": null, "task": "image-feature-extraction", - "model_name": "openai/clip-vit-large-patch14" + "model_id": "openai/clip-vit-large-patch14" }, "compile": null, 
"loader": { diff --git a/examples/recipes/openai_clip-vit-large-patch14/zero-shot-image-classification_w8a16_config_text-encoder.json b/examples/recipes/openai_clip-vit-large-patch14/zero-shot-image-classification_w8a16_config_text-encoder.json index 222a26f34..b09739ffd 100644 --- a/examples/recipes/openai_clip-vit-large-patch14/zero-shot-image-classification_w8a16_config_text-encoder.json +++ b/examples/recipes/openai_clip-vit-large-patch14/zero-shot-image-classification_w8a16_config_text-encoder.json @@ -66,7 +66,7 @@ "op_types_to_quantize": null, "nodes_to_exclude": null, "task": "feature-extraction", - "model_name": "openai/clip-vit-large-patch14" + "model_id": "openai/clip-vit-large-patch14" }, "compile": null, "loader": { diff --git a/examples/recipes/rizvandwiki_gender-classification/image-classification_w8a16_config.json b/examples/recipes/rizvandwiki_gender-classification/image-classification_w8a16_config.json index 2e43e5ab2..d50353492 100644 --- a/examples/recipes/rizvandwiki_gender-classification/image-classification_w8a16_config.json +++ b/examples/recipes/rizvandwiki_gender-classification/image-classification_w8a16_config.json @@ -48,7 +48,7 @@ "op_types_to_quantize": null, "nodes_to_exclude": null, "task": "image-classification", - "model_name": "rizvandwiki/gender-classification" + "model_id": "rizvandwiki/gender-classification" }, "loader": { "task": "image-classification", diff --git a/examples/recipes/sentence-transformers_all-MiniLM-L6-v2/feature-extraction_w8a16_config.json b/examples/recipes/sentence-transformers_all-MiniLM-L6-v2/feature-extraction_w8a16_config.json index 77ccc0498..21341d059 100644 --- a/examples/recipes/sentence-transformers_all-MiniLM-L6-v2/feature-extraction_w8a16_config.json +++ b/examples/recipes/sentence-transformers_all-MiniLM-L6-v2/feature-extraction_w8a16_config.json @@ -72,7 +72,7 @@ "op_types_to_quantize": null, "nodes_to_exclude": null, "task": "feature-extraction", - "model_name": 
"sentence-transformers/all-MiniLM-L6-v2" + "model_id": "sentence-transformers/all-MiniLM-L6-v2" }, "loader": { "task": "feature-extraction", diff --git a/examples/recipes/sentence-transformers_all-MiniLM-L6-v2/sentence-similarity_w8a16_config.json b/examples/recipes/sentence-transformers_all-MiniLM-L6-v2/sentence-similarity_w8a16_config.json index 28962bc74..b497e99f8 100644 --- a/examples/recipes/sentence-transformers_all-MiniLM-L6-v2/sentence-similarity_w8a16_config.json +++ b/examples/recipes/sentence-transformers_all-MiniLM-L6-v2/sentence-similarity_w8a16_config.json @@ -72,7 +72,7 @@ "op_types_to_quantize": null, "nodes_to_exclude": null, "task": "sentence-similarity", - "model_name": "sentence-transformers/all-MiniLM-L6-v2" + "model_id": "sentence-transformers/all-MiniLM-L6-v2" }, "loader": { "task": "sentence-similarity", diff --git a/examples/recipes/sentence-transformers_all-mpnet-base-v2/feature-extraction_w8a16_config.json b/examples/recipes/sentence-transformers_all-mpnet-base-v2/feature-extraction_w8a16_config.json index b88600555..5cf0286cf 100644 --- a/examples/recipes/sentence-transformers_all-mpnet-base-v2/feature-extraction_w8a16_config.json +++ b/examples/recipes/sentence-transformers_all-mpnet-base-v2/feature-extraction_w8a16_config.json @@ -58,7 +58,7 @@ "op_types_to_quantize": null, "nodes_to_exclude": null, "task": "feature-extraction", - "model_name": "sentence-transformers/all-mpnet-base-v2" + "model_id": "sentence-transformers/all-mpnet-base-v2" }, "compile": null, "loader": { diff --git a/examples/recipes/sentence-transformers_all-mpnet-base-v2/sentence-similarity_w8a16_config.json b/examples/recipes/sentence-transformers_all-mpnet-base-v2/sentence-similarity_w8a16_config.json index 345311097..06b918dbc 100644 --- a/examples/recipes/sentence-transformers_all-mpnet-base-v2/sentence-similarity_w8a16_config.json +++ b/examples/recipes/sentence-transformers_all-mpnet-base-v2/sentence-similarity_w8a16_config.json @@ -58,7 +58,7 @@ 
"op_types_to_quantize": null, "nodes_to_exclude": null, "task": "sentence-similarity", - "model_name": "sentence-transformers/all-mpnet-base-v2" + "model_id": "sentence-transformers/all-mpnet-base-v2" }, "compile": null, "loader": { diff --git a/examples/recipes/sentence-transformers_multi-qa-mpnet-base-dot-v1/feature-extraction_w8a16_config.json b/examples/recipes/sentence-transformers_multi-qa-mpnet-base-dot-v1/feature-extraction_w8a16_config.json index e00cce6f7..c20334353 100644 --- a/examples/recipes/sentence-transformers_multi-qa-mpnet-base-dot-v1/feature-extraction_w8a16_config.json +++ b/examples/recipes/sentence-transformers_multi-qa-mpnet-base-dot-v1/feature-extraction_w8a16_config.json @@ -58,7 +58,7 @@ "op_types_to_quantize": null, "nodes_to_exclude": null, "task": "feature-extraction", - "model_name": "sentence-transformers/multi-qa-mpnet-base-dot-v1" + "model_id": "sentence-transformers/multi-qa-mpnet-base-dot-v1" }, "compile": null, "loader": { diff --git a/examples/recipes/sentence-transformers_multi-qa-mpnet-base-dot-v1/sentence-similarity_w8a16_config.json b/examples/recipes/sentence-transformers_multi-qa-mpnet-base-dot-v1/sentence-similarity_w8a16_config.json index 500b5ff1a..20bb96038 100644 --- a/examples/recipes/sentence-transformers_multi-qa-mpnet-base-dot-v1/sentence-similarity_w8a16_config.json +++ b/examples/recipes/sentence-transformers_multi-qa-mpnet-base-dot-v1/sentence-similarity_w8a16_config.json @@ -58,7 +58,7 @@ "op_types_to_quantize": null, "nodes_to_exclude": null, "task": "sentence-similarity", - "model_name": "sentence-transformers/multi-qa-mpnet-base-dot-v1" + "model_id": "sentence-transformers/multi-qa-mpnet-base-dot-v1" }, "compile": null, "loader": { diff --git a/examples/recipes/sentence-transformers_paraphrase-multilingual-mpnet-base-v2/sentence-similarity_w8a16_config.json b/examples/recipes/sentence-transformers_paraphrase-multilingual-mpnet-base-v2/sentence-similarity_w8a16_config.json index 8cba69ee4..2632bd575 100644 
--- a/examples/recipes/sentence-transformers_paraphrase-multilingual-mpnet-base-v2/sentence-similarity_w8a16_config.json +++ b/examples/recipes/sentence-transformers_paraphrase-multilingual-mpnet-base-v2/sentence-similarity_w8a16_config.json @@ -60,7 +60,7 @@ "op_types_to_quantize": null, "nodes_to_exclude": null, "task": "sentence-similarity", - "model_name": "sentence-transformers/paraphrase-multilingual-mpnet-base-v2" + "model_id": "sentence-transformers/paraphrase-multilingual-mpnet-base-v2" }, "loader": { "task": "sentence-similarity", diff --git a/examples/recipes/w11wo_indonesian-roberta-base-posp-tagger/token-classification_w8a16_config.json b/examples/recipes/w11wo_indonesian-roberta-base-posp-tagger/token-classification_w8a16_config.json index d00b21020..d585a8a5e 100644 --- a/examples/recipes/w11wo_indonesian-roberta-base-posp-tagger/token-classification_w8a16_config.json +++ b/examples/recipes/w11wo_indonesian-roberta-base-posp-tagger/token-classification_w8a16_config.json @@ -60,7 +60,7 @@ "op_types_to_quantize": null, "nodes_to_exclude": null, "task": "token-classification", - "model_name": "w11wo/indonesian-roberta-base-posp-tagger" + "model_id": "w11wo/indonesian-roberta-base-posp-tagger" }, "loader": { "task": "token-classification", diff --git a/src/winml/modelkit/commands/build.py b/src/winml/modelkit/commands/build.py index 95029c00e..288ffc3fe 100644 --- a/src/winml/modelkit/commands/build.py +++ b/src/winml/modelkit/commands/build.py @@ -624,10 +624,10 @@ def _patch_device(cfg: WinMLBuildConfig) -> None: if cfg.loader is not None and cfg.loader.task: resolved_quant.task = cfg.loader.task if model_id: - resolved_quant.model_name = model_id + resolved_quant.model_id = model_id cfg.quant = resolved_quant else: - # Only update precision fields; preserve task/model_name + # Only update precision fields; preserve task/model_id # and other calibration settings from the existing config. 
cfg.quant.weight_type = resolved_quant.weight_type cfg.quant.activation_type = resolved_quant.activation_type diff --git a/src/winml/modelkit/commands/eval.py b/src/winml/modelkit/commands/eval.py index fb13096c2..1bad01f58 100644 --- a/src/winml/modelkit/commands/eval.py +++ b/src/winml/modelkit/commands/eval.py @@ -40,11 +40,8 @@ "(requires --model-id), or split-encoder role=path pairs (see --schema)." ), ) -@click.option( - "--model-id", - type=str, - default=None, - help="HuggingFace model ID when .onnx model file is provided in --model.", +@cli_utils.model_id_option( + help_text="HuggingFace model ID when .onnx model file is provided in --model.", ) @click.option( "--dataset", diff --git a/src/winml/modelkit/commands/quantize.py b/src/winml/modelkit/commands/quantize.py index 902ea3144..0e10518cc 100644 --- a/src/winml/modelkit/commands/quantize.py +++ b/src/winml/modelkit/commands/quantize.py @@ -98,11 +98,8 @@ default=None, help="Task for calibration dataset selection (e.g., 'image-classification').", ) -@click.option( - "--model-name", - type=str, - default=None, - help="HuggingFace model name (e.g., 'microsoft/resnet-50'). When provided " +@cli_utils.model_id_option( + help_text="HuggingFace model id (e.g., 'microsoft/resnet-50'). 
When provided " "with --task, enables task-aware calibration datasets using the model's preprocessor.", ) @cli_utils.build_config_option() @@ -120,7 +117,7 @@ def quantize( per_channel: bool, symmetric: bool, task: str | None, - model_name: str | None, + model_id: str | None, verbose: int, quiet: bool, config_file: Path | None, @@ -177,8 +174,8 @@ def quantize( symmetric = qc["symmetric"] if not cli_utils.is_cli_provided(ctx, "task") and "task" in qc: task = qc["task"] - if not cli_utils.is_cli_provided(ctx, "model_name") and "model_name" in qc: - model_name = qc["model_name"] + if not cli_utils.is_cli_provided(ctx, "model_id") and "model_id" in qc: + model_id = qc["model_id"] # Import quantizer (late import to speed up CLI) from ..quant import WinMLQuantizationConfig, quantize_onnx @@ -226,7 +223,7 @@ def quantize( per_channel=per_channel, symmetric=symmetric, task=task, - model_name=model_name, + model_id=model_id, ) label = "Quantization" diff --git a/src/winml/modelkit/config/build.py b/src/winml/modelkit/config/build.py index 6ca550d15..b25f6cc75 100644 --- a/src/winml/modelkit/config/build.py +++ b/src/winml/modelkit/config/build.py @@ -207,7 +207,7 @@ def validate(self) -> None: Build types: - HF build (export is not None): requires loader.task, quant.task, - quant.model_name when quant is enabled + quant.model_id when quant is enabled - ONNX build (export is None): relaxed — loader.task and quant fields are optional since the ONNX model is pre-exported @@ -229,18 +229,18 @@ def validate(self) -> None: errors.append("optim config is required") # type: ignore[unreachable] # 3. quant validation (when present) - # Exceptions: ONNX builds (export=None) don't need quant.task/model_name + # Exceptions: ONNX builds (export=None) don't need quant.task/model_id # because the ONNX model is pre-exported. Submodule builds (module_path # set) use RandomDataset which only needs the ONNX model_path. 
# Algorithms that skip calibration (fp16, rtn, dynamic) also don't - # need task/model_name since they don't generate calibration datasets. + # need task/model_id since they don't generate calibration datasets. if self.quant is not None: needs_calibration = self.quant.mode == "static" needs_quant_ids = not is_onnx_build and not is_submodule and needs_calibration if needs_quant_ids and not self.quant.task: errors.append("quant.task is required when quant is enabled for HF builds") - if needs_quant_ids and not self.quant.model_name: - errors.append("quant.model_name is required when quant is enabled for HF builds") + if needs_quant_ids and not self.quant.model_id: + errors.append("quant.model_id is required when quant is enabled for HF builds") # 4. compile validation (when present) if self.compile is not None and ( @@ -920,7 +920,7 @@ def _build_submodule_config( - Inherited model_type from parent; task intentionally omitted - module_path and model_class from sub_info - Inherited optim/compile from parent - - Quant with task=None, model_name=None (RandomDataset fallback) + - Quant with task=None, model_id=None (RandomDataset fallback) """ # Build InputTensorSpec for EACH input tensor (not just the first). @@ -962,12 +962,12 @@ def _input_name(i: int) -> str: ), optim=copy.deepcopy(parent_config.optim), # Submodule builds use RandomDataset for calibration: - # quantize_onnx() falls back to "random" when task/model_name are None, + # quantize_onnx() falls back to "random" when task/model_id are None, # and RandomDataset reads input specs from the ONNX model file. quant=WinMLQuantizationConfig( samples=1, task=None, - model_name=None, + model_id=None, ), compile=copy.deepcopy(parent_config.compile), ) @@ -1030,14 +1030,14 @@ def _assemble_config( """Assemble WinMLBuildConfig from resolved loader and export configs. Handles optim/quant/compile from the registry or defaults, - and populates quant config with task and model_name. 
+ and populates quant config with task and model_id. Args: loader_config: Resolved WinMLLoaderConfig (from resolve_loader_config). export_config: Resolved WinMLExportConfig (from registry or _resolve_export_config_from_specs). registered: Registered config from MODEL_BUILD_CONFIGS (or None). - model_id: HuggingFace model ID (for quant model_name), or None. + model_id: HuggingFace model ID (for quant model_id), or None. model_type: Parent HF model type (for quant fallback name). Returns: @@ -1061,16 +1061,16 @@ def _assemble_config( else WinMLCompileConfig() ) - # Populate quant config with task and model_name for task-aware calibration + # Populate quant config with task and model_id for task-aware calibration if quant_config: quant_config.task = loader_config.task if model_id is None and model_type is not None: logger.warning( - "Quantization model_name set to '%s' (model type). " + "Quantization model_id set to '%s' (model type). " "For calibration datasets, provide --model with a full model ID.", model_type, ) - quant_config.model_name = model_id or model_type + quant_config.model_id = model_id or model_type return WinMLBuildConfig( loader=loader_config, diff --git a/src/winml/modelkit/quant/config.py b/src/winml/modelkit/quant/config.py index 40f443917..3be32ed0a 100644 --- a/src/winml/modelkit/quant/config.py +++ b/src/winml/modelkit/quant/config.py @@ -67,7 +67,7 @@ class WinMLQuantizationConfig: # Task-aware calibration (used when calibration_data is None) task: str | None = None # e.g., "image-classification" - model_name: str | None = None # e.g., "microsoft/resnet-50" + model_id: str | None = None # e.g., "microsoft/resnet-50" dataset_name: str | None = None # Optional: override default dataset # Quantization types (static/dynamic) @@ -106,7 +106,7 @@ def to_dict(self) -> dict: Includes all fields that affect quantization behavior so that ``generate_cache_key()`` produces distinct hashes for distinct configs. 
- Optional fields (task, model_name, dataset_name) are omitted when None + Optional fields (task, model_id, dataset_name) are omitted when None to keep submodule configs clean. """ result: dict = { @@ -131,8 +131,8 @@ def to_dict(self) -> dict: } if self.task is not None: result["task"] = self.task - if self.model_name is not None: - result["model_name"] = self.model_name + if self.model_id is not None: + result["model_id"] = self.model_id if self.dataset_name is not None: result["dataset_name"] = self.dataset_name if self.mode == "rtn": @@ -165,7 +165,7 @@ def from_dict(cls, data: dict) -> WinMLQuantizationConfig: samples=data.get("samples", data.get("calibration_samples", 10)), calibration_method=data.get("calibration_method", "minmax"), task=data.get("task"), - model_name=data.get("model_name"), + model_id=data.get("model_id"), dataset_name=data.get("dataset_name"), weight_type=data.get("weight_type", "uint8"), activation_type=data.get("activation_type", "uint8"), diff --git a/src/winml/modelkit/quant/quantizer.py b/src/winml/modelkit/quant/quantizer.py index d46096164..8b9ba033f 100644 --- a/src/winml/modelkit/quant/quantizer.py +++ b/src/winml/modelkit/quant/quantizer.py @@ -281,7 +281,7 @@ def _quantize_qdq( task = config.task or "random" data_reader = DatasetCalibrationReader( - model_name=config.model_name or "random", + model_name=config.model_id or "random", task=task, max_samples=config.samples, dataset_name=config.dataset_name, diff --git a/src/winml/modelkit/utils/cli.py b/src/winml/modelkit/utils/cli.py index 312fff9cc..29ab0d9a2 100644 --- a/src/winml/modelkit/utils/cli.py +++ b/src/winml/modelkit/utils/cli.py @@ -136,6 +136,28 @@ def model_option(required: bool = True, optional_message: str | None = None) -> ) +def model_id_option(help_text: str | None = None) -> Callable[[F], F]: + """Add ``--model-id`` option for a HuggingFace model ID. + + Shared by commands (e.g. 
``quantize`` and ``eval``) that take an ONNX model + path via ``-m/--model`` and need a separate HuggingFace model ID, for example + to resolve the matching preprocessor/tokenizer or calibration datasets. + + Args: + help_text: Optional override for the help string. + + Returns: + Decorator function. + """ + help = help_text or "HuggingFace model ID (e.g., 'microsoft/resnet-50')." + return click.option( + "--model-id", + type=str, + default=None, + help=help, + ) + + def output_option(help_text: str, required: bool = False) -> Callable[[F], F]: """Add ``-o/--output`` option that accepts a file path. diff --git a/tests/e2e/test_quantize_e2e.py b/tests/e2e/test_quantize_e2e.py index 8e17b57bb..94882dde1 100644 --- a/tests/e2e/test_quantize_e2e.py +++ b/tests/e2e/test_quantize_e2e.py @@ -163,7 +163,7 @@ def onnx_imgseg(tmp_path_factory: pytest.TempPathFactory) -> Path: segmentation I/O instead, so calibration still exercises the ImageSegmentationDataset path without running a large model. The dataset itself (image processor + samples) is still loaded from the real - ``--model-name`` in the test. + ``--model-id`` in the test. 
""" d = tmp_path_factory.mktemp("fake_imgseg") p = d / "model.onnx" @@ -517,7 +517,7 @@ def test_task_image_classification_dataset( str(out), "--task", "image-classification", - "--model-name", + "--model-id", "microsoft/resnet-50", "--samples", "4", @@ -541,7 +541,7 @@ def test_task_text_classification_dataset( str(out), "--task", "text-classification", - "--model-name", + "--model-id", "Intel/bert-base-uncased-mrpc", "--samples", "4", @@ -565,7 +565,7 @@ def test_task_object_detection_dataset( str(out), "--task", "object-detection", - "--model-name", + "--model-id", "hustvl/yolos-small", "--samples", "4", @@ -589,7 +589,7 @@ def test_task_image_segmentation_dataset( str(out), "--task", "image-segmentation", - "--model-name", + "--model-id", "nvidia/segformer-b0-finetuned-ade-512-512", "--samples", "4", @@ -643,7 +643,7 @@ def test_image_feature_extraction_uses_image_dataset( str(out), "--task", "image-feature-extraction", - "--model-name", + "--model-id", "facebook/dinov2-small", "--samples", "4", diff --git a/tests/unit/build/test_hf.py b/tests/unit/build/test_hf.py index 38203af44..e59c419f4 100644 --- a/tests/unit/build/test_hf.py +++ b/tests/unit/build/test_hf.py @@ -40,7 +40,7 @@ def sample_config(): "mode": "qdq", "samples": 10, "task": "image-classification", - "model_name": "test-model", + "model_id": "test-model", }, "compile": { "execution_provider": "qnn", diff --git a/tests/unit/build/test_onnx.py b/tests/unit/build/test_onnx.py index 79fb108cf..2099ddfbf 100644 --- a/tests/unit/build/test_onnx.py +++ b/tests/unit/build/test_onnx.py @@ -39,7 +39,7 @@ def sample_onnx_config(): "mode": "qdq", "samples": 10, "task": "image-classification", - "model_name": "test-model", + "model_id": "test-model", }, "compile": { "execution_provider": "qnn", diff --git a/tests/unit/commands/test_build.py b/tests/unit/commands/test_build.py index 07981ea4b..1bc5b7c41 100644 --- a/tests/unit/commands/test_build.py +++ b/tests/unit/commands/test_build.py @@ -109,7 +109,7 @@ 
def sample_config_file(tmp_path: Path) -> Path: "mode": "qdq", "samples": 10, "task": "image-classification", - "model_name": "test", + "model_id": "test", }, "compile": {"execution_provider": "qnn"}, } @@ -463,7 +463,7 @@ def test_no_quant_clears_quant(self, tmp_path: Path, mock_run_single_build: Magi "mode": "qdq", "samples": 10, "task": "image-classification", - "model_name": "test", + "model_id": "test", }, "compile": None, } @@ -617,7 +617,7 @@ def test_precision_alone_triggers_quant_patch( "mode": "qdq", "samples": 10, "task": "image-classification", - "model_name": "test", + "model_id": "test", }, "compile": None, } diff --git a/tests/unit/commands/test_build_module.py b/tests/unit/commands/test_build_module.py index 0a7a1bb3b..916e76fe2 100644 --- a/tests/unit/commands/test_build_module.py +++ b/tests/unit/commands/test_build_module.py @@ -95,7 +95,7 @@ def test_array_config_applies_no_quant(self, tmp_path: Path) -> None: }, "export": {}, "optim": {}, - "quant": {"task": "fill-mask", "model_name": "X", "samples": 1}, + "quant": {"task": "fill-mask", "model_id": "X", "samples": 1}, }, ] ) diff --git a/tests/unit/config/test_build.py b/tests/unit/config/test_build.py index ce7426ccd..3e1d8449b 100644 --- a/tests/unit/config/test_build.py +++ b/tests/unit/config/test_build.py @@ -715,7 +715,7 @@ def test_submodule_omits_task(self, parent_config: WinMLBuildConfig) -> None: assert result.loader.module_path == "encoder.layer.0.attention" def test_quant_uses_random_for_submodules(self, parent_config: WinMLBuildConfig) -> None: - """Submodule quant uses random dataset (task=None, model_name=None).""" + """Submodule quant uses random dataset (task=None, model_id=None).""" parent_config.loader.task = "fill-mask" parent_config.loader.model_type = "bert" @@ -733,7 +733,7 @@ def test_quant_uses_random_for_submodules(self, parent_config: WinMLBuildConfig) # Submodule quant should exist with task=None (random dataset fallback) assert result.quant is not None assert 
result.quant.task is None - assert result.quant.model_name is None + assert result.quant.model_id is None assert result.quant.samples == 1 def test_submodule_config_with_quant_passes_validate( @@ -762,7 +762,7 @@ def test_submodule_quant_omits_task_in_json( self, parent_config: WinMLBuildConfig, ) -> None: - """Submodule quant serialization omits task, model_name, dataset_name when None.""" + """Submodule quant serialization omits task, model_id, dataset_name when None.""" parent_config.loader.task = "fill-mask" parent_config.loader.model_type = "bert" @@ -779,7 +779,7 @@ def test_submodule_quant_omits_task_in_json( quant_dict = result.quant.to_dict() assert "task" not in quant_dict - assert "model_name" not in quant_dict + assert "model_id" not in quant_dict assert "dataset_name" not in quant_dict def test_empty_inputs(self, parent_config: WinMLBuildConfig) -> None: @@ -1719,7 +1719,7 @@ def test_valid_config_passes(self) -> None: optim=WinMLOptimizationConfig(), quant=WinMLQuantizationConfig( task="image-classification", - model_name="microsoft/resnet-50", + model_id="microsoft/resnet-50", ), compile=WinMLCompileConfig( ep_config=EPConfig(provider="qnn"), @@ -1776,12 +1776,12 @@ def test_valid_onnx_build_no_loader_task(self) -> None: config.validate() # Should not raise def test_valid_onnx_build_with_quant_no_task(self) -> None: - """ONNX build with quant doesn't require quant.task or quant.model_name.""" + """ONNX build with quant doesn't require quant.task or quant.model_id.""" config = WinMLBuildConfig( loader=WinMLLoaderConfig(task=None), export=None, # ONNX build optim=WinMLOptimizationConfig(), - quant=WinMLQuantizationConfig(task=None, model_name=None), + quant=WinMLQuantizationConfig(task=None, model_id=None), compile=WinMLCompileConfig(), ) config.validate() # Should not raise @@ -1804,22 +1804,22 @@ def test_quant_missing_task_raises(self) -> None: loader=WinMLLoaderConfig(task="fill-mask"), export=WinMLExportConfig(), # HF build 
optim=WinMLOptimizationConfig(), - quant=WinMLQuantizationConfig(task=None, model_name="test-model"), + quant=WinMLQuantizationConfig(task=None, model_id="test-model"), compile=None, ) with pytest.raises(ValueError, match=r"quant\.task is required"): config.validate() - def test_quant_missing_model_name_raises(self) -> None: - """quant enabled but model_name=None raises ValueError for HF builds.""" + def test_quant_missing_model_id_raises(self) -> None: + """quant enabled but model_id=None raises ValueError for HF builds.""" config = WinMLBuildConfig( loader=WinMLLoaderConfig(task="fill-mask"), export=WinMLExportConfig(), # HF build optim=WinMLOptimizationConfig(), - quant=WinMLQuantizationConfig(task="fill-mask", model_name=None), + quant=WinMLQuantizationConfig(task="fill-mask", model_id=None), compile=None, ) - with pytest.raises(ValueError, match=r"quant\.model_name is required"): + with pytest.raises(ValueError, match=r"quant\.model_id is required"): config.validate() def test_compile_missing_provider_raises(self) -> None: @@ -1842,7 +1842,7 @@ def test_multiple_errors_collected_hf_build(self) -> None: loader=WinMLLoaderConfig(task=None), export=WinMLExportConfig(), # HF build (export present) optim=None, - quant=WinMLQuantizationConfig(task=None, model_name=None), + quant=WinMLQuantizationConfig(task=None, model_id=None), compile=WinMLCompileConfig(ep_config=EPConfig(provider="")), ) with pytest.raises(ValueError, match="Invalid WinMLBuildConfig") as exc_info: @@ -1852,7 +1852,7 @@ def test_multiple_errors_collected_hf_build(self) -> None: assert "loader.task is required for full model builds" in error_msg assert "optim config is required" in error_msg assert "quant.task is required when quant is enabled for HF builds" in error_msg - assert "quant.model_name is required when quant is enabled for HF builds" in error_msg + assert "quant.model_id is required when quant is enabled for HF builds" in error_msg assert "compile.ep_config.provider is required" in 
error_msg def test_multiple_errors_collected_onnx_build(self) -> None: @@ -1861,17 +1861,17 @@ def test_multiple_errors_collected_onnx_build(self) -> None: loader=WinMLLoaderConfig(task=None), export=None, # ONNX build optim=None, - quant=WinMLQuantizationConfig(task=None, model_name=None), + quant=WinMLQuantizationConfig(task=None, model_id=None), compile=WinMLCompileConfig(ep_config=EPConfig(provider="")), ) with pytest.raises(ValueError, match="Invalid WinMLBuildConfig") as exc_info: config.validate() error_msg = str(exc_info.value) - # ONNX build: loader.task NOT required, quant.task/model_name NOT required + # ONNX build: loader.task NOT required, quant.task/model_id NOT required assert "loader.task" not in error_msg assert "quant.task" not in error_msg - assert "quant.model_name" not in error_msg + assert "quant.model_id" not in error_msg # These still apply assert "optim config is required" in error_msg assert "compile.ep_config.provider is required" in error_msg @@ -1924,6 +1924,22 @@ def test_default_unchanged(self) -> None: assert config.activation_type == "uint8" +class TestQuantModelId: + """Tests for the quant model_id field (renamed from model_name).""" + + def test_to_dict_emits_model_id_key(self) -> None: + """to_dict() serializes the HF model id under the 'model_id' key.""" + config = WinMLQuantizationConfig(model_id="microsoft/resnet-50") + data = config.to_dict() + assert data["model_id"] == "microsoft/resnet-50" + assert "model_name" not in data + + def test_from_dict_reads_model_id(self) -> None: + """from_dict() reads the canonical 'model_id' key.""" + config = WinMLQuantizationConfig.from_dict({"model_id": "microsoft/resnet-50"}) + assert config.model_id == "microsoft/resnet-50" + + # ============================================================================= # TestDevicePrecisionIntegration - device/precision in generate_build_config() # =============================================================================
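Review note: in this diff, `WinMLQuantizationConfig.from_dict()` reads only the `model_id` key, so a config file that still carries `model_name` would presumably load with `model_id=None` and fall back to random calibration data. The sketch below shows one way a user could migrate an out-of-tree config. It is not part of this PR: the helper name `migrate_quant_model_key` is hypothetical, and it assumes a single build config whose quantization section is a dict under the `quant` key, as in the unit-test fixtures.

```python
import copy
import json


def migrate_quant_model_key(config: dict) -> dict:
    """Return a copy of a build config with quant.model_name renamed to quant.model_id.

    Hypothetical helper for illustration; assumes the quant section lives
    under the "quant" key and is either a dict or None.
    """
    migrated = copy.deepcopy(config)  # leave the caller's dict untouched
    quant = migrated.get("quant")
    if isinstance(quant, dict) and "model_name" in quant:
        legacy = quant.pop("model_name")
        # If both keys are present, keep the existing model_id.
        quant.setdefault("model_id", legacy)
    return migrated


old = {
    "quant": {"task": "image-classification", "model_name": "microsoft/resnet-50"},
    "compile": None,
}
new = migrate_quant_model_key(old)
print(json.dumps(new["quant"], sort_keys=True))
# {"model_id": "microsoft/resnet-50", "task": "image-classification"}
```

Working on a deep copy keeps the original dict intact, so the old and migrated configs can be compared before anything is written back to disk.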