fix(wheel): add platform tag, version, and PYPI_RELEASE support to tritonserver wheel #494

Closed
mc-nv wants to merge 8 commits into main from mchornyi/TRI-983-tritonserver-wheel-metadata

Conversation

@mc-nv mc-nv (Contributor) commented Apr 21, 2026

What does the PR do?

Fixes the tritonserver Python wheel so that its filename carries the correct platform tag, version, and CUDA/container-release metadata.

Previously, python -m build read the version from the TRITON_VERSION file via pyproject.toml, ignoring the VERSION env var — the composed +nv<X>.cu<Y> local-version suffix was silently dropped. This patch overwrites the copied TRITON_VERSION file in the wheel build root with the composed version before invoking python -m build.
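For illustration, a minimal sketch of that overwrite step (`write_composed_version` and `build_dir` are hypothetical names, not the PR's actual identifiers):

```python
# Hypothetical sketch; write_composed_version and build_dir are
# illustrative names, not the PR's actual identifiers.
import pathlib

def write_composed_version(build_dir: pathlib.Path, version: str) -> None:
    # Overwrite the staged TRITON_VERSION so that pyproject.toml's
    # version = {file = "TRITON_VERSION"} resolves to the composed string
    # instead of the plain upstream version.
    (build_dir / "TRITON_VERSION").write_text(version + "\n")

# Called with the composed version, e.g. "<base>+nv<X>.cu<Y>", immediately
# before invoking `python -m build` in build_dir.
```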

Changes:

  • Write the composed version (with the +nv<upstream>.cu<cuda> local-version suffix) to TRITON_VERSION in the wheel build dir before python -m build, so pyproject.toml's version = {file = "TRITON_VERSION"} picks it up.
  • Add PYPI_RELEASE env-var support: when true, strip the local-version suffix and omit the build tag for PyPI-compatible wheel names.
  • Use CI_PIPELINE_ID, falling back to NVIDIA_BUILD_ID, then BUILD_NUMBER, as the PEP 427 build tag.
  • _detect_cuda_version() reads the CUDA_VERSION env var, falling back to /usr/local/cuda/version.json.

Checklist

  • PR title reflects the change and is of format <commit_type>: <Title>
  • Changes are described in the pull request.
  • Related issues are referenced.
  • Populated github labels field
  • Added test plan and verified test passes.
  • Verified that the PR passes existing CI.
  • Verified copyright is correct on all changed files.
  • Added succinct git squash message before merging.
  • All template sections are filled out.

Commit Type:

  • fix

Related PRs:

triton-inference-server/server#8752
dl/dgx/tritonserver!1745

Where should the reviewer start?

python/build_wheel.py — the version-composition logic and the TRITON_VERSION overwrite before python -m build.

Test plan:

  • CI Pipeline ID: 49163608
  • CI (internal): [#49163608](http://tritonserver.local/ci/pipelines/49163608)

Caveats:

PYPI_RELEASE=true produces PyPI-compatible names; wheels are not yet published to PyPI (tracked separately).

Background

pyproject.toml declares version = {file = "TRITON_VERSION"}. The VERSION env var set for the setup.py path has no effect on python -m build.

Related Issues:

  • Resolves: TRI-983

@mc-nv mc-nv self-assigned this Apr 21, 2026
@mc-nv mc-nv changed the title fix: wire tritonserver wheel version, platform, auditwheel, and build number (TRI-983) fix(wheel): add platform tag, version, and PYPI_RELEASE support to tritonserver wheel Apr 22, 2026
mc-nv added 8 commits April 29, 2026 09:50
The tritonserver wheel was shipping as
"tritonserver-0.0.0-py3-none-any.whl": version fell back to the
setuptools 0.0.0 default because `dynamic = ["version"]` had no
resolver configured, and the platform tag was "none-any" despite the
wheel containing an arch-specific CPython extension
(tritonserver/_c/triton_bindings.*.so).

Changes:
- pyproject.toml: add [tool.setuptools.dynamic] version = {file =
  "TRITON_VERSION"} so the declared dynamic version resolves from the
  Triton release file already staged by the wheel build.
- python/setup.py: add a bdist_wheel override that sets
  root_is_pure = False so setuptools tags the wheel with the current
  build platform (e.g. linux_x86_64 / linux_aarch64) via its normal
  auto-detection.
- python/build_wheel.py: copy TRITON_VERSION into the wheel build
  directory so the dynamic-file version lookup succeeds at build time.

Refs: NVBug 6098081, JIRA DLIS-8648, Linear TRI-983

Raw linux_x86_64 / linux_aarch64 wheels are not accepted by canonical
PyPI — the platform tag must be manylinux_2_X_<arch>. Port the pattern
established for tritonclient in TRI-286: after the wheel is built,
run `auditwheel repair` to auto-discover the minimum manylinux tag
from the embedded triton_bindings.*.so glibc symbol dependencies,
with a `python -m wheel tags --platform-tag manylinux_2_28_<arch>`
fallback for the "no ELF" pure-Python case.

When auditwheel is not available on PATH (e.g. local non-container
builds), keep the linux_<arch> wheel and log a warning so builds do
not regress; the Poetry / pip-tools lock-file problem is already
solved by the distinct filename.

Leaves a NOTE in setup.py: the embedded binding .so is
CPython-ABI-specific, so the wheel will need cp<XY>-cp<XY>
python+abi tags once consumers are ready to gate installs on the
exact interpreter version.

Refs: NVBug 6098081, JIRA DLIS-8648, Linear TRI-983, TRI-286
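
A hedged sketch of the repair flow described above (function name and error handling are illustrative; build_wheel.py's actual code may differ):

```python
# Illustrative sketch of the post-build repair step; names and error
# handling are assumptions, not the PR's actual code.
import shutil
import subprocess
import sys

def repair_wheel(wheel_path: str, out_dir: str) -> None:
    if shutil.which("auditwheel"):
        # auditwheel inspects the glibc symbol versions the embedded
        # triton_bindings .so links against and rewrites the platform tag
        # to the minimum manylinux_2_X_<arch> that satisfies them.
        subprocess.check_call(["auditwheel", "repair", "-w", out_dir, wheel_path])
        # (the commit also describes a `python -m wheel tags
        # --platform-tag manylinux_2_28_<arch>` fallback for the
        # "no ELF" pure-Python case, omitted here)
    else:
        # Local non-container builds: keep the linux_<arch> wheel and warn
        # instead of failing; the distinct filename already avoids the
        # Poetry / pip-tools lock-file collision.
        print(f"WARNING: auditwheel not on PATH; keeping {wheel_path}",
              file=sys.stderr)
```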
Adopt PEP 427's optional build-tag slot so two wheels of the same
version (e.g. successive reruns of a CI pipeline) can coexist in the
same index without filename collision. Preferred source is GitLab's
CI_PIPELINE_ID with a BUILD_NUMBER fallback for other CI systems;
both are guaranteed to start with a digit as required by PEP 427.

The value is forwarded through `python -m build` to the setuptools
backend's `bdist_wheel --build=<N>` (alias for --build-number) via
the PEP 517 `-C--build-option=` config setting.

Matches the build-tag slot already used by the RHEL .zip artifact
naming convention in .gitlab-ci.yml. Build-arg handoff through
build.py is a separate follow-up; this change is a no-op in local
non-CI builds since neither env var is set.

Refs: NVBug 6098081, JIRA DLIS-8648, Linear TRI-983
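
A sketch of that forwarding, using the `-C--build-option=` spelling quoted in the commit (the argument assembly around it is illustrative):

```python
# Illustrative sketch of forwarding the build tag through python -m build.
import os
import sys

build_tag = os.environ.get("CI_PIPELINE_ID") or os.environ.get("BUILD_NUMBER")
cmd = [sys.executable, "-m", "build", "--wheel"]
if build_tag:
    # PEP 427 requires the build tag to start with a digit; both CI vars
    # above are numeric pipeline/build numbers. The config setting reaches
    # setuptools' bdist_wheel as --build=<N> (alias for --build-number,
    # per the commit message).
    cmd.append(f"-C--build-option=--build={build_tag}")
print(" ".join(cmd))  # the real script hands cmd to subprocess.check_call
```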
Add a _compose_version() helper in build_wheel.py that appends a
PEP 440 local-version segment to the base TRITON_VERSION when the
NVIDIA_UPSTREAM_VERSION and/or CUDA_VERSION env vars are set. This
makes the wheel filename carry the same nv<X>.cu<Y> identifiers
already used by the RHEL .zip artifact naming in .gitlab-ci.yml:

  tritonserver-<TRITON_VERSION>+nv<NV>.cu<CUDA>-<CI_PIPELINE_ID>-cp<XY>-cp<XY>-manylinux_2_28_<arch>.whl

Local-version segments are informational and do not affect pip
version comparison, so existing pins like `tritonserver==2.69.0`
continue to match. The helper is a no-op when neither env var is
set (local non-CI builds). Env-var propagation from the CI runner
into the build container is handled by the paired server PR's
change to build.py's docker-run invocation.

Refs: NVBug 6098081, JIRA DLIS-8648, Linear TRI-983, TRI-286
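
An illustrative reconstruction of the described _compose_version() behavior (the real helper may differ in detail):

```python
# Sketch of the version-composition logic described above.
import os

def _compose_version(base: str) -> str:
    nv = os.environ.get("NVIDIA_UPSTREAM_VERSION")
    cuda = os.environ.get("CUDA_VERSION")
    parts = [f"nv{nv}"] if nv else []
    if cuda:
        parts.append(f"cu{cuda}")
    # The +nv<X>.cu<Y> local-version segment is informational under PEP 440:
    # pip ignores it for version comparison, so pins like
    # tritonserver==2.69.0 still match. No-op when neither env var is set
    # (local non-CI builds).
    return f"{base}+{'.'.join(parts)}" if parts else base
```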
…TRI-983)

Mirror the server-side refinement for the tritonserver wheel:

1. Use NVIDIA_BUILD_ID (from --build-id) as the PEP 427 build-tag
   source instead of a separate CI_PIPELINE_ID / BUILD_NUMBER env
   var, aligning the wheel filename with the existing Triton
   convention already used throughout the build system.
2. Detect CUDA_VERSION from the container-local env with a
   /usr/local/cuda/version.json fallback (canonical location for
   the installed toolkit), since CUDA_VERSION cannot be propagated
   from the host — only the container has it set.

Refs: NVBug 6098081, JIRA DLIS-8648, Linear TRI-983
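
A sketch of that detection order; the version.json key layout shown is an assumption about the installed toolkit's file format, not something the commit states:

```python
# Illustrative sketch; assumes version.json looks like
# {"cuda": {"version": "12.x.y"}, ...}.
import json
import os

def _detect_cuda_version() -> str | None:
    env = os.environ.get("CUDA_VERSION")
    if env:
        return env
    try:
        with open("/usr/local/cuda/version.json") as f:
            return json.load(f)["cuda"]["version"]
    except (OSError, KeyError, ValueError):
        return None
```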
Pipeline 49141836 / job 302710609 failed with:

  auditwheel.main_repair: Invalid binary wheel, found the following
  shared library/libraries in purelib folder:
      triton_bindings.cpython-312-x86_64-linux-gnu.so
  The wheel has to be platlib compliant in order to be repaired by
  auditwheel.

The wheel's filename was correctly tagged
(tritonserver-<ver>-<build>-cp312-cp312-linux_x86_64.whl), but the
WHEEL metadata inside still declared Root-Is-Purelib: true, so
auditwheel rejected the repair because finding a .so in the purelib
tree is an inconsistent state.

Root cause: the previous bdist_wheel override (via the `wheel`
package) set `self.root_is_pure = False`, but modern setuptools
(>=70) provides its own setuptools.command.bdist_wheel and ignores
overrides registered against wheel.bdist_wheel. The platform-tag
part still worked because it is derived from the `has_ext_modules`
check on the Distribution, not from root_is_pure — but the
Root-Is-Purelib metadata flag was never flipped.

Fix: subclass Distribution and override has_ext_modules() to return
True. This is the canonical way to tell setuptools the wheel is
binary without having to declare a dummy ext_module or trigger a
compilation step. setuptools then:

  - sets WHEEL Root-Is-Purelib: false (required for auditwheel repair),
  - auto-derives cp<XY>-cp<XY>-linux_<arch> tags from the current
    interpreter and sysconfig.get_platform(), matching the filename
    we already wanted.

Drop the now-unused wheel.bdist_wheel override block and register
the BinaryDistribution via setup(distclass=...).

Refs: NVBug 6098081, JIRA DLIS-8648, Linear TRI-983
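
The distclass pattern the commit describes, reduced to a minimal sketch:

```python
# Minimal form of the fix: declare the distribution binary without a
# dummy ext_module or a compile step.
from setuptools import setup
from setuptools.dist import Distribution

class BinaryDistribution(Distribution):
    def has_ext_modules(self):
        # Makes setuptools emit WHEEL "Root-Is-Purelib: false" and derive
        # cp<XY>-cp<XY>-linux_<arch> tags from the running interpreter and
        # sysconfig.get_platform().
        return True

setup(distclass=BinaryDistribution)
```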
Mirror the tritonfrontend setup.py fix: `_compose_version()` now
consults three env sources for the +nv<X> local-version segment
(NVIDIA_UPSTREAM_VERSION, NVIDIA_TRITON_SERVER_VERSION,
TRITON_CONTAINER_VERSION) and prints the resolved inputs to stderr
so any future gap in the env propagation chain is self-announcing
in the wheel-build log rather than silently producing a wheel
without the expected local-version suffix.

NVIDIA_TRITON_SERVER_VERSION and TRITON_CONTAINER_VERSION are set
as ENV in the buildbase image itself (via the TRITON_CONTAINER_VERSION
ARG -> ENV wiring), so they survive even when the docker-run `-e
NVIDIA_UPSTREAM_VERSION=<value>` forwarding does not reach the
container (e.g. when FLAGS.upstream_container_version evaluates to
empty on the host).

Refs: Linear TRI-983

Mirror the tritonfrontend change in core/python/build_wheel.py:
prefer CI_PIPELINE_ID over NVIDIA_BUILD_ID, falling back to
BUILD_NUMBER. Filter out the "<unknown>" default that build.py emits
for local builds without --build-id, and add a stderr diagnostic.

Refs: Linear TRI-983
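
A sketch of the resulting source order and "<unknown>" filter (names are illustrative):

```python
# Illustrative sketch of the build-tag resolution described above.
import os
import sys

def _build_tag() -> str | None:
    for var in ("CI_PIPELINE_ID", "NVIDIA_BUILD_ID", "BUILD_NUMBER"):
        value = os.environ.get(var, "").strip()
        # Skip build.py's "<unknown>" placeholder from local builds
        # that were run without --build-id.
        if value and value != "<unknown>":
            print(f"build tag: {var}={value}", file=sys.stderr)
            return value
    return None
```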
@mc-nv mc-nv force-pushed the mchornyi/TRI-983-tritonserver-wheel-metadata branch from 7605c1c to 29c962c Compare April 29, 2026 16:55
@mc-nv mc-nv closed this Apr 29, 2026
@mc-nv mc-nv deleted the mchornyi/TRI-983-tritonserver-wheel-metadata branch April 29, 2026 16:57