
fix(qa): omit plugin creation if TensorRT branch is missing. #8783

Merged
mc-nv merged 5 commits into main from mchornyi/TRI-1093/models-api on May 15, 2026

Conversation

@mc-nv mc-nv commented May 15, 2026

CI (internal): [#51427542](http://tritonserver.local/ci/pipelines/51427542)

What does the PR do?

Two TensorRT 11 model-API removals broke CustomHardmax / QA model generation on R100 (Vera Rubin). This PR makes the QA model generation tolerate both API surfaces and gracefully skip plugin model generation when the upstream branch is not yet published:

  • BuilderFlag.REJECT_EMPTY_ALGORITHMS (removed in TRT 11) is now guarded with hasattr across all gen_qa_*.py callers; the flag is still applied on TRT <= 10.
  • gen_qa_trt_plugin_models.py picks the V2 or V3 plugin API at runtime (plugin_creator_list + add_plugin_v2 on TRT 10.x; all_creators + add_plugin_v3 with phase=BUILD on TRT 11).
  • gen_qa_model_repository clones the TensorRT samples from github.com/NVIDIA/TensorRT release/${TRT_BRANCH} matching the runtime TRT version. If the branch is not published yet (e.g. TRT 11 ahead of the public sync), plugin model generation is skipped with a [WARNING] instead of failing the whole job.
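The `hasattr` guard from the first bullet is a small pattern; a minimal sketch follows. The `BuilderFlagTRT10` / `BuilderFlagTRT11` classes are hypothetical stand-ins for `tensorrt.BuilderFlag` (real flag values differ), used only so the guard can be shown without TensorRT installed:

```python
# Hypothetical stand-ins for tensorrt.BuilderFlag on the two API surfaces.
class BuilderFlagTRT10:
    TF32 = 1 << 0
    REJECT_EMPTY_ALGORITHMS = 1 << 1  # present on TRT <= 10


class BuilderFlagTRT11:
    TF32 = 1 << 0  # REJECT_EMPTY_ALGORITHMS was removed in TRT 11


def builder_config_flags(builder_flag, flags=0):
    """Apply REJECT_EMPTY_ALGORITHMS only if this TRT build still defines it."""
    if hasattr(builder_flag, "REJECT_EMPTY_ALGORITHMS"):
        flags |= builder_flag.REJECT_EMPTY_ALGORITHMS
    return flags


print(builder_config_flags(BuilderFlagTRT10))  # 2: flag applied on TRT <= 10
print(builder_config_flags(BuilderFlagTRT11))  # 0: flag silently skipped on TRT 11
```

The same probe works against the real `trt.BuilderFlag` enum, which is why one uniform guard covers all `gen_qa_*.py` callers.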

Checklist

  • PR title reflects the change and is of format <commit_type>: <Title>
  • Changes are described in the pull request.
  • Related issues are referenced.
  • Populated github labels field
  • Added test plan and verified test passes.
  • Verified that the PR passes existing CI.
  • Verified copyright is correct on all changed files.
  • Added succinct git squash message before merging.
  • All template sections are filled out.
  • Optional: Additional screenshots for behavior/output changes with before/after.

Commit Type:

  • fix

Related PRs:

Where should the reviewer start?

  • qa/common/gen_qa_trt_plugin_models.py -- the version-detect dispatch in get_trt_plugin() and the add_plugin_v{2,3} selector in create_plan_modelfile().
  • qa/common/gen_qa_model_repository (the rewritten plugin block) -- public-mirror clone with release/${TRT_BRANCH} and the skip-with-warning fallback.
  • All qa/common/gen_qa_*.py files: the uniform hasattr guard around REJECT_EMPTY_ALGORITHMS.

Test plan:

  1. Run GenModels-build pipeline on a TRT 10.x target (e.g. A100): plugin generates from release/10.x V2 source, model artifacts appear under qa_trt_plugin_model_repository.
  2. Run GenModels-build pipeline on R100 (TRT 11): the plugin step skips with [WARNING] TensorRT release/11.0 not available on github.com/NVIDIA/TensorRT. The rest of the QA model repo is still produced.
  3. Once NVIDIA publishes release/11.0 with the V3 port, R100 plugin gen resumes automatically via the V3 dispatch path -- no further code change needed.
  • CI Pipeline ID: 51427542

Caveats:

  • L0_trt_plugin will report missing plan_*_CustomHardmax artifacts on R100 until release/11.0 is published publicly on github.com/NVIDIA/TensorRT (the V3 port already exists on the internal mirror).

Background

GitLab job GenModels-build--x86-r100-10.7 (job 317056972) failed on trt.BuilderFlag.REJECT_EMPTY_ALGORITHMS -- removed in TRT 11. After guarding that, job 319122413 advanced and hit a second TRT 11 removal in the plugin path (IPluginRegistry.plugin_creator_list). Both are addressed here without breaking TRT 10.x.
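The skip logic in gen_qa_model_repository boils down to mapping the runtime TRT version to a release/<major.minor> branch and bailing out with a warning when that branch is not published. The actual script is shell; the Python sketch below illustrates the mapping, with `published_branches` standing in for a hypothetical `git ls-remote --heads` check against the public repo:

```python
def trt_release_branch(trt_version: str) -> str:
    # e.g. "10.7.0.23" -> "release/10.7", matching the branch naming
    # on github.com/NVIDIA/TensorRT.
    major, minor = trt_version.split(".")[:2]
    return f"release/{major}.{minor}"


def maybe_generate_plugins(trt_version: str, published_branches: set) -> bool:
    # published_branches stands in for querying the public mirror's heads.
    branch = trt_release_branch(trt_version)
    if branch not in published_branches:
        print(f"[WARNING] TensorRT {branch} not available on "
              "github.com/NVIDIA/TensorRT, skipping plugin model generation")
        return False
    return True
```

Once NVIDIA publishes release/11.0, the branch check succeeds and plugin generation resumes with no code change, which is the behavior described in the test plan.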

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

  • Resolves: TRI-1093

mc-nv added 4 commits May 15, 2026 16:34
The upstream onnx_custom_plugin sample on github.com/NVIDIA/TensorRT
ships only the legacy IPluginV2DynamicExt implementation, which TRT 11
no longer loads -- the V2 plugin registry surface was removed. The
internal NVIDIA mirror has a V3 port cherry-picked to rel-11.0 while
rel-10.x branches keep V2.

- gen_qa_model_repository now clones the TensorRT samples from
  gitlab-master.nvidia.com/TensorRT/TensorRT at rel-${TRT_BRANCH}
  matching the runtime TRT version. If the branch is not available
  the plugin model generation is skipped with a warning so the rest
  of the QA model repository is still produced. CI_JOB_TOKEN is
  passed through the docker run that executes the TRT script.

- gen_qa_trt_plugin_models.py picks the V2 or V3 plugin API at
  runtime via hasattr (plugin_creator_list / add_plugin_v2 on TRT
  10.x with V2 source, all_creators / add_plugin_v3 with phase=BUILD
  on TRT 11 with V3 source).
Reverts the previous switch to the internal NVIDIA GitLab mirror.
The public github.com/NVIDIA/TensorRT remains the source of truth;
when a release/<major.minor> branch is not yet published for the
runtime TRT version (e.g. release/11.0 ahead of the public sync),
the plugin model generation skips with a warning instead of trying
the fallback ladder. L0_trt_plugin loses coverage for that TRT
version until the branch becomes available, but the rest of the
QA model repository is still produced.

Also drops the CI_JOB_TOKEN passthrough on the TRT docker run since
the public clone needs no authentication.
@mc-nv mc-nv self-assigned this May 15, 2026
@mc-nv mc-nv marked this pull request as ready for review May 15, 2026 19:13
@mc-nv mc-nv requested review from pskiran1, whoisj and yinggeh May 15, 2026 19:15
Review comment thread on qa/common/gen_qa_model_repository
@mc-nv mc-nv merged commit 1e69d88 into main May 15, 2026
3 checks passed
@mc-nv mc-nv deleted the mchornyi/TRI-1093/models-api branch May 15, 2026 20:02
