
fix(qa): omit plugin creation if TensorRT branch is missing. #8783

Merged
mc-nv merged 5 commits into main from mchornyi/TRI-1093/models-api on May 15, 2026

Conversation

@mc-nv mc-nv commented May 15, 2026

CI (internal): [#51427542](http://tritonserver.local/ci/pipelines/51427542)

What does the PR do?

Two TensorRT 11 model-API removals broke CustomHardmax / QA model generation on R100 (Vera Rubin). This PR makes the QA model generation tolerate both API surfaces and gracefully skip plugin model generation when the upstream branch is not yet published:

  • BuilderFlag.REJECT_EMPTY_ALGORITHMS (removed in TRT 11) is now guarded with hasattr across all gen_qa_*.py callers; the flag is still applied on TRT <= 10.
  • gen_qa_trt_plugin_models.py picks the V2 or V3 plugin API at runtime (plugin_creator_list + add_plugin_v2 on TRT 10.x; all_creators + add_plugin_v3 with phase=BUILD on TRT 11).
  • gen_qa_model_repository clones the TensorRT samples from github.com/NVIDIA/TensorRT release/${TRT_BRANCH} matching the runtime TRT version. If the branch is not published yet (e.g. TRT 11 ahead of the public sync), plugin model generation is skipped with a [WARNING] instead of failing the whole job.
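The `hasattr` guard from the first bullet is a small pattern; a minimal sketch follows. The `BuilderFlagTRT10` / `BuilderFlagTRT11` classes are hypothetical stand-ins for `tensorrt.BuilderFlag` (real flag values differ), used only so the guard can be shown without TensorRT installed:

```python
# Hypothetical stand-ins for tensorrt.BuilderFlag on the two API surfaces.
class BuilderFlagTRT10:
    TF32 = 1 << 0
    REJECT_EMPTY_ALGORITHMS = 1 << 1  # present on TRT <= 10


class BuilderFlagTRT11:
    TF32 = 1 << 0  # REJECT_EMPTY_ALGORITHMS was removed in TRT 11


def builder_config_flags(builder_flag, flags=0):
    """Apply REJECT_EMPTY_ALGORITHMS only if this TRT build still defines it."""
    if hasattr(builder_flag, "REJECT_EMPTY_ALGORITHMS"):
        flags |= builder_flag.REJECT_EMPTY_ALGORITHMS
    return flags


print(builder_config_flags(BuilderFlagTRT10))  # 2: flag applied on TRT <= 10
print(builder_config_flags(BuilderFlagTRT11))  # 0: flag silently skipped on TRT 11
```

The same probe works against the real `trt.BuilderFlag` enum, which is why one uniform guard covers all `gen_qa_*.py` callers.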

Checklist

  • PR title reflects the change and is of format <commit_type>: <Title>
  • Changes are described in the pull request.
  • Related issues are referenced.
  • Populated github labels field
  • Added test plan and verified test passes.
  • Verified that the PR passes existing CI.
  • Verified copyright is correct on all changed files.
  • Added succinct git squash message before merging.
  • All template sections are filled out.
  • Optional: Additional screenshots for behavior/output changes with before/after.

Commit Type:

  • fix

Related PRs:

Where should the reviewer start?

  • qa/common/gen_qa_trt_plugin_models.py -- the version-detect dispatch in get_trt_plugin() and the add_plugin_v{2,3} selector in create_plan_modelfile().
  • qa/common/gen_qa_model_repository (the rewritten plugin block) -- public-mirror clone with release/${TRT_BRANCH} and the skip-with-warning fallback.
  • All qa/common/gen_qa_*.py files: the uniform hasattr guard around REJECT_EMPTY_ALGORITHMS.

Test plan:

  1. Run GenModels-build pipeline on a TRT 10.x target (e.g. A100): plugin generates from release/10.x V2 source, model artifacts appear under qa_trt_plugin_model_repository.
  2. Run GenModels-build pipeline on R100 (TRT 11): the plugin step skips with [WARNING] TensorRT release/11.0 not available on github.com/NVIDIA/TensorRT. The rest of the QA model repo is still produced.
  3. Once NVIDIA publishes release/11.0 with the V3 port, R100 plugin gen resumes automatically via the V3 dispatch path -- no further code change needed.
  • CI Pipeline ID: 51427542

Caveats:

  • L0_trt_plugin will report missing plan_*_CustomHardmax artifacts on R100 until release/11.0 is published publicly on github.com/NVIDIA/TensorRT (the V3 port already exists on the internal mirror).

Background

GitLab job GenModels-build--x86-r100-10.7 (job 317056972) failed on trt.BuilderFlag.REJECT_EMPTY_ALGORITHMS -- removed in TRT 11. After guarding that, job 319122413 advanced and hit a second TRT 11 removal in the plugin path (IPluginRegistry.plugin_creator_list). Both are addressed here without breaking TRT 10.x.
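The skip logic in gen_qa_model_repository boils down to mapping the runtime TRT version to a release/<major.minor> branch and bailing out with a warning when that branch is not published. The actual script is shell; the Python sketch below illustrates the mapping, with `published_branches` standing in for a hypothetical `git ls-remote --heads` check against the public repo:

```python
def trt_release_branch(trt_version: str) -> str:
    # e.g. "10.7.0.23" -> "release/10.7", matching the branch naming
    # on github.com/NVIDIA/TensorRT.
    major, minor = trt_version.split(".")[:2]
    return f"release/{major}.{minor}"


def maybe_generate_plugins(trt_version: str, published_branches: set) -> bool:
    # published_branches stands in for querying the public mirror's heads.
    branch = trt_release_branch(trt_version)
    if branch not in published_branches:
        print(f"[WARNING] TensorRT {branch} not available on "
              "github.com/NVIDIA/TensorRT, skipping plugin model generation")
        return False
    return True
```

Once NVIDIA publishes release/11.0, the branch check succeeds and plugin generation resumes with no code change, which is the behavior described in the test plan.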

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

  • Resolves: TRI-1093

mc-nv added 4 commits May 15, 2026 16:34
The upstream onnx_custom_plugin sample on github.com/NVIDIA/TensorRT
ships only the legacy IPluginV2DynamicExt implementation, which TRT 11
no longer loads -- the V2 plugin registry surface was removed. The
internal NVIDIA mirror has a V3 port cherry-picked to rel-11.0 while
rel-10.x branches keep V2.

- gen_qa_model_repository now clones the TensorRT samples from
  gitlab-master.nvidia.com/TensorRT/TensorRT at rel-${TRT_BRANCH}
  matching the runtime TRT version. If the branch is not available
  the plugin model generation is skipped with a warning so the rest
  of the QA model repository is still produced. CI_JOB_TOKEN is
  passed through the docker run that executes the TRT script.

- gen_qa_trt_plugin_models.py picks the V2 or V3 plugin API at
  runtime via hasattr (plugin_creator_list / add_plugin_v2 on TRT
  10.x with V2 source, all_creators / add_plugin_v3 with phase=BUILD
  on TRT 11 with V3 source).
Reverts the previous switch to the internal NVIDIA GitLab mirror.
The public github.com/NVIDIA/TensorRT remains the source of truth;
when a release/<major.minor> branch is not yet published for the
runtime TRT version (e.g. release/11.0 ahead of the public sync),
the plugin model generation skips with a warning instead of trying
the fallback ladder. L0_trt_plugin loses coverage for that TRT
version until the branch becomes available, but the rest of the
QA model repository is still produced.

Also drops the CI_JOB_TOKEN passthrough on the TRT docker run since
the public clone needs no authentication.
@mc-nv mc-nv self-assigned this May 15, 2026
@mc-nv mc-nv marked this pull request as ready for review May 15, 2026 19:13
@mc-nv mc-nv requested review from pskiran1, whoisj and yinggeh May 15, 2026 19:15
Review comment thread on qa/common/gen_qa_model_repository
@mc-nv mc-nv merged commit 1e69d88 into main May 15, 2026
3 checks passed
@mc-nv mc-nv deleted the mchornyi/TRI-1093/models-api branch May 15, 2026 20:02
