
fix: tag tritonserver wheel with arch-specific platform tag #495

Merged
mc-nv merged 12 commits into main from mchornyi/TRI-983-wheel-tag
May 5, 2026

@mc-nv mc-nv commented Apr 29, 2026

What does the PR do?

Fix tritonserver Python wheel so it ships with a correct architecture-specific
platform tag instead of py3-none-any, and with the proper Triton release version
instead of the hard-coded 0.0.0.

Changes across 8 commits:

  • core/python/setup.py — add BinaryDistribution.has_ext_modules()=True so
    setuptools emits a platlib wheel (Root-Is-Purelib: false, cpXY-cpXY-linux_<arch>)
    rather than py3-none-any.
  • core/python/build_wheel.py — add _repair_wheel_with_auditwheel() to upgrade the
    wheel to manylinux_2_X_<arch>; add _compose_version() to append a
    +nv{release}.cu{cudaXY} local-version segment; source build tag from
    CI_PIPELINE_ID / NVIDIA_BUILD_ID / BUILD_NUMBER.
  • core/pyproject.toml — dynamic version resolved from the TRITON_VERSION file at
    build time (fixes the 0.0.0 regression).
Resolves: TRI-983
Related: triton-inference-server/server#8761

Where should the reviewer start?

core/python/build_wheel.py: the _repair_wheel_with_auditwheel() and
_compose_version() functions, then the BinaryDistribution class in
core/python/setup.py.

Test plan:

Build inside the SDK container (requires auditwheel in the image — added via
server#8743):

python3 core/python/build_wheel.py \
  --dest-dir /tmp/whl \
  --binding-path <path-to-triton_bindings.so>
ls /tmp/whl/              # manylinux wheel
ls /tmp/whl/wheel/dist/   # linux_<arch> wheel (for container install)
python3 -m wheel tags --list /tmp/whl/wheel/dist/*.whl
# Must NOT be py3-none-any
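
The same check can be scripted. The following is a minimal sketch, not part of this PR, that splits a wheel filename into its PEP 427 fields and flags the pure-Python tag; `parse_wheel_filename` and `is_pure_python_tag` are hypothetical helper names.

```python
def parse_wheel_filename(filename):
    """Split a wheel filename into its PEP 427 components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        # name-version-python-abi-platform
        name, version, py, abi, plat = parts
        build = None
    elif len(parts) == 6:
        # name-version-build-python-abi-platform
        name, version, build, py, abi, plat = parts
    else:
        raise ValueError(f"unexpected wheel filename: {filename}")
    return {
        "name": name,
        "version": version,
        "build": build,
        "python": py,
        "abi": abi,
        "platform": plat,
    }


def is_pure_python_tag(filename):
    """Return True when the filename carries the none-any tag pair."""
    info = parse_wheel_filename(filename)
    return info["abi"] == "none" and info["platform"] == "any"
```

Running `is_pure_python_tag()` over every file in the dist directory and failing on any hit would turn the "Must NOT be py3-none-any" comment into an automated gate.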
  • CI Pipeline ID:

Caveats:

PyPI publishing CI jobs (py3-tritonserver-publish) are a follow-up task.
The container image continues to receive the linux_<arch> wheel from
generic/wheel/dist/; the manylinux wheel is written to generic/ for future
PyPI upload.

Related Issues:

  • Resolves: TRI-983
  • NVBug: 6098081 / JIRA: DLIS-8648

mc-nv added 8 commits April 29, 2026 09:50
The tritonserver wheel was shipping as
"tritonserver-0.0.0-py3-none-any.whl": version fell back to the
setuptools 0.0.0 default because `dynamic = ["version"]` had no
resolver configured, and the platform tag was "none-any" despite the
wheel containing an arch-specific CPython extension
(tritonserver/_c/triton_bindings.*.so).

Changes:
- pyproject.toml: add [tool.setuptools.dynamic] version = {file =
  "TRITON_VERSION"} so the declared dynamic version resolves from the
  Triton release file already staged by the wheel build.
- python/setup.py: add a bdist_wheel override that sets
  root_is_pure = False so setuptools tags the wheel with the current
  build platform (e.g. linux_x86_64 / linux_aarch64) via its normal
  auto-detection.
- python/build_wheel.py: copy TRITON_VERSION into the wheel build
  directory so the dynamic-file version lookup succeeds at build time.

Refs: NVBug 6098081, JIRA DLIS-8648, Linear TRI-983
Raw linux_x86_64 / linux_aarch64 wheels are not accepted by canonical
PyPI — the platform tag must be manylinux_2_X_<arch>. Port the pattern
established for tritonclient in TRI-286: after the wheel is built,
run `auditwheel repair` to auto-discover the minimum manylinux tag
from the embedded triton_bindings.*.so glibc symbol dependencies,
with a `python -m wheel tags --platform-tag manylinux_2_28_<arch>`
fallback for the "no ELF" pure-Python case.

When auditwheel is not available on PATH (e.g. local non-container
builds), keep the linux_<arch> wheel and log a warning so builds do
not regress; the Poetry / pip-tools lock-file problem is already
solved by the distinct filename.

Leaves a NOTE in setup.py: the embedded binding .so is
CPython-ABI-specific, so the wheel will need cp<XY>-cp<XY>
python+abi tags once consumers are ready to gate installs on the
exact interpreter version.

Refs: NVBug 6098081, JIRA DLIS-8648, Linear TRI-983, TRI-286
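
A rough sketch of the "repair if possible, otherwise keep" flow described above. This is not the code from build_wheel.py: `repair_or_keep` and its `tool` parameter are hypothetical, and the "no ELF" fallback through `wheel tags` is left out for brevity.

```python
import glob
import os
import shutil
import subprocess
import sys


def repair_or_keep(wheel_path, dest_dir, tool="auditwheel"):
    """Run `auditwheel repair` when the tool is on PATH; otherwise keep the wheel."""
    if shutil.which(tool) is None:
        # Local non-container builds: warn and keep the linux_<arch> wheel.
        print(f"WARNING: {tool} not found; keeping {wheel_path}", file=sys.stderr)
        return [wheel_path]
    # auditwheel writes the repaired wheel(s) into dest_dir.
    subprocess.run([tool, "repair", wheel_path, "--wheel-dir", dest_dir], check=True)
    return sorted(glob.glob(os.path.join(dest_dir, "*.whl")))
```

Returning a list keeps both branches uniform for the caller, since the repaired wheel has a different filename from the input.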
Adopt PEP 427's optional build-tag slot so two wheels of the same
version (e.g. successive reruns of a CI pipeline) can coexist in the
same index without filename collision. Preferred source is GitLab's
CI_PIPELINE_ID with a BUILD_NUMBER fallback for other CI systems;
both are guaranteed to start with a digit as required by PEP 427.

The value is forwarded through `python -m build` to the setuptools
backend's `bdist_wheel --build=<N>` (alias for --build-number) via
the PEP 517 `-C--build-option=` config setting.

Matches the build-tag slot already used by the RHEL .zip artifact
naming convention in .gitlab-ci.yml. Build-arg handoff through
build.py is a separate follow-up; this change is a no-op in local
non-CI builds since neither env var is set.

Refs: NVBug 6098081, JIRA DLIS-8648, Linear TRI-983
Add a _compose_version() helper in build_wheel.py that appends a
PEP 440 local-version segment to the base TRITON_VERSION when the
NVIDIA_UPSTREAM_VERSION and/or CUDA_VERSION env vars are set. This
makes the wheel filename carry the same nv<X>.cu<Y> identifiers
already used by the RHEL .zip artifact naming in .gitlab-ci.yml:

  tritonserver-<TRITON_VERSION>+nv<NV>.cu<CUDA>-<CI_PIPELINE_ID>-cp<XY>-cp<XY>-manylinux_2_28_<arch>.whl

Local-version segments are ignored by `==` specifiers that carry no
local segment of their own, so existing pins like `tritonserver==2.69.0`
continue to match. The helper is a no-op when neither env var is
set (local non-CI builds). Env-var propagation from the CI runner
into the build container is handled by the paired server PR's
change to build.py's docker-run invocation.

Refs: NVBug 6098081, JIRA DLIS-8648, Linear TRI-983, TRI-286
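
For illustration, the composition could look roughly like this. It is a sketch under assumptions, not the actual `_compose_version()`: the function name `compose_version`, the explicit `env` mapping, and the choice of concatenating CUDA major and minor digits for the `cu<Y>` part are all guesses at the behaviour described above.

```python
def compose_version(base_version, env):
    """Append a +nv<X>.cu<Y> local-version segment when the inputs are present."""
    parts = []
    nv = env.get("NVIDIA_UPSTREAM_VERSION", "").strip()
    if nv:
        parts.append(f"nv{nv}")
    cuda = env.get("CUDA_VERSION", "").strip()
    if cuda:
        # Assumption: keep only major and minor, e.g. "13.0.1" -> "cu130".
        parts.append("cu" + "".join(cuda.split(".")[:2]))
    if not parts:
        # No-op for local non-CI builds where neither variable is set.
        return base_version
    return f"{base_version}+{'.'.join(parts)}"
```

With illustrative inputs `NVIDIA_UPSTREAM_VERSION=26.04` and `CUDA_VERSION=13.0.1`, a base of `2.69.0` would become `2.69.0+nv26.04.cu130`.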
…TRI-983)

Mirror the server-side refinement for the tritonserver wheel:

1. Use NVIDIA_BUILD_ID (from --build-id) as the PEP 427 build-tag
   source instead of a separate CI_PIPELINE_ID / BUILD_NUMBER env
   var, aligning the wheel filename with the existing Triton
   convention already used throughout the build system.
2. Detect CUDA_VERSION from the container-local env with a
   /usr/local/cuda/version.json fallback (canonical location for
   the installed toolkit), since CUDA_VERSION cannot be propagated
   from the host — only the container has it set.

Refs: NVBug 6098081, JIRA DLIS-8648, Linear TRI-983
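
A sketch of the env-then-file lookup from item 2. The helper name `detect_cuda_version` is hypothetical, and the JSON layout (`{"cuda": {"version": ...}}`) is an assumption about the toolkit's version.json that should be checked against the image in use.

```python
import json
import os


def detect_cuda_version(env=None, version_json="/usr/local/cuda/version.json"):
    """Return the CUDA version from the environment, else from version.json."""
    env = os.environ if env is None else env
    value = env.get("CUDA_VERSION", "").strip()
    if value:
        return value
    try:
        with open(version_json, encoding="utf-8") as fh:
            data = json.load(fh)
        # Assumed layout: {"cuda": {"name": ..., "version": "X.Y.Z"}, ...}
        return data["cuda"]["version"]
    except (OSError, ValueError, KeyError, TypeError):
        # Missing file, malformed JSON, or unexpected layout: no CUDA detected.
        return None
```

Returning `None` instead of raising lets the caller simply drop the `cu<Y>` part of the local-version segment.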
Pipeline 49141836 / job 302710609 failed with:

  auditwheel.main_repair: Invalid binary wheel, found the following
  shared library/libraries in purelib folder:
      triton_bindings.cpython-312-x86_64-linux-gnu.so
  The wheel has to be platlib compliant in order to be repaired by
  auditwheel.

The wheel's filename was correctly tagged
(tritonserver-<ver>-<build>-cp312-cp312-linux_x86_64.whl), but the
WHEEL metadata inside still declared Root-Is-Purelib: true, so
auditwheel rejected the repair because finding a .so in the purelib
tree is an inconsistent state.
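
The mismatch between filename and metadata can be confirmed without auditwheel. The sketch below (hypothetical helper `root_is_purelib`, standard library only) reads the `Root-Is-Purelib` field from the `.dist-info/WHEEL` file inside the archive.

```python
import zipfile
from email.parser import Parser


def root_is_purelib(wheel_path):
    """Return True when the wheel's WHEEL metadata says Root-Is-Purelib: true."""
    with zipfile.ZipFile(wheel_path) as zf:
        names = [n for n in zf.namelist() if n.endswith(".dist-info/WHEEL")]
        if not names:
            raise ValueError(f"no .dist-info/WHEEL found in {wheel_path}")
        text = zf.read(names[0]).decode("utf-8")
    # WHEEL uses RFC 822 style "Key: value" lines, so the email parser can read it.
    metadata = Parser().parsestr(text)
    return (metadata.get("Root-Is-Purelib") or "").strip().lower() == "true"
```

A wheel whose filename carries `cp312-cp312-linux_x86_64` but for which this returns `True` is in the inconsistent state described above.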

Root cause: the previous bdist_wheel override (via the `wheel`
package) set `self.root_is_pure = False`, but modern setuptools
(>=70) provides its own setuptools.command.bdist_wheel and ignores
overrides registered against wheel.bdist_wheel. The platform-tag
part still worked because it is derived from the `has_ext_modules`
check on the Distribution, not from root_is_pure — but the
Root-Is-Purelib metadata flag was never flipped.

Fix: subclass Distribution and override has_ext_modules() to return
True. This is the canonical way to tell setuptools the wheel is
binary without having to declare a dummy ext_module or trigger a
compilation step. setuptools then:

  - sets WHEEL Root-Is-Purelib: false (required for auditwheel repair),
  - auto-derives cp<XY>-cp<XY>-linux_<arch> tags from the current
    interpreter and sysconfig.get_platform(), matching the filename
    we already wanted.

Drop the now-unused wheel.bdist_wheel override block and register
the BinaryDistribution via setup(distclass=...).
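
In outline, the override amounts to the following. This is a minimal sketch of the pattern rather than the exact contents of setup.py; the `setup()` call is shown only as a comment so the snippet can be imported without triggering a build.

```python
from setuptools import Distribution


class BinaryDistribution(Distribution):
    """Distribution that always reports itself as containing extension modules."""

    def has_ext_modules(self):
        # Tells setuptools to emit a platlib wheel even though no ext_modules
        # are declared; the prebuilt .so is shipped as package data.
        return True


# In setup.py this would be registered roughly as:
#     setup(distclass=BinaryDistribution, ...)
```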

Refs: NVBug 6098081, JIRA DLIS-8648, Linear TRI-983
Mirror the tritonfrontend setup.py fix: `_compose_version()` now
consults three env sources for the +nv<X> local-version segment
(NVIDIA_UPSTREAM_VERSION, NVIDIA_TRITON_SERVER_VERSION,
TRITON_CONTAINER_VERSION) and prints the resolved inputs to stderr
so any future gap in the env propagation chain is self-announcing
in the wheel-build log rather than silently producing a wheel
without the expected local-version suffix.

NVIDIA_TRITON_SERVER_VERSION and TRITON_CONTAINER_VERSION are set
as ENV in the buildbase image itself (via the TRITON_CONTAINER_VERSION
ARG -> ENV wiring), so they survive even when the docker-run `-e
NVIDIA_UPSTREAM_VERSION=<value>` forwarding does not reach the
container (e.g. when FLAGS.upstream_container_version evaluates to
empty on the host).

Refs: Linear TRI-983
Mirror the tritonfrontend change in core/python/build_wheel.py:
prefer CI_PIPELINE_ID over NVIDIA_BUILD_ID, falling back to
BUILD_NUMBER. Filter the "<unknown>" default build.py emits for
local builds without --build-id. Added stderr diagnostic.

Refs: Linear TRI-983
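
The resulting precedence can be sketched as below. The names `resolve_build_tag` and `build_option_args` are hypothetical, and the exact spelling of the forwarded `-C--build-option=--build=<N>` argument is an assumption based on the earlier commit message, not a copy of build_wheel.py.

```python
import os
import sys


def resolve_build_tag(env=None):
    """Pick the PEP 427 build tag: CI_PIPELINE_ID, then NVIDIA_BUILD_ID, then BUILD_NUMBER."""
    env = os.environ if env is None else env
    for name in ("CI_PIPELINE_ID", "NVIDIA_BUILD_ID", "BUILD_NUMBER"):
        value = env.get(name, "").strip()
        if not value or value == "<unknown>":
            # "<unknown>" is the default emitted for local builds without --build-id.
            continue
        if not value[0].isdigit():
            # PEP 427 requires the build tag to start with a digit.
            continue
        print(f"build tag {value!r} taken from {name}", file=sys.stderr)
        return value
    return None


def build_option_args(tag):
    """Extra `python -m build` arguments carrying the build tag, if any."""
    return [] if tag is None else [f"-C--build-option=--build={tag}"]
```

When no usable value is found the argument list is empty, which keeps local non-CI builds unchanged.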
@mc-nv mc-nv self-assigned this Apr 29, 2026
@mc-nv mc-nv marked this pull request as ready for review April 29, 2026 20:03
…I-983)

pyproject.toml reads the wheel version from the TRITON_VERSION file via
`version = {file = "TRITON_VERSION"}`. The previous code copied the raw
source TRITON_VERSION (e.g. "2.69.0dev") into the wheel build root, so
the +nv…cu… local segment computed by _compose_version() was never
embedded in the wheel — it was set in a VERSION env var that
`python3 -m build` never reads.

Fix: compute the composed version before os.chdir and write it directly
to the wheel build root's TRITON_VERSION. The source-tree file is left
unchanged. Also remove the now-dead wenv["VERSION"] assignment.
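
A minimal sketch of that staging step, assuming the composed version string has already been computed; `stage_version_file` is a hypothetical name and the trailing newline is a stylistic choice.

```python
import os


def stage_version_file(build_root, composed_version):
    """Write the composed version into the build root's TRITON_VERSION file."""
    path = os.path.join(build_root, "TRITON_VERSION")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(composed_version + "\n")
    # Only the copy inside build_root is written; the source tree is untouched.
    return path
```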
Copilot AI left a comment

Pull request overview

This PR updates the Python packaging/build pipeline for tritonserver so produced wheels are correctly treated as platform-specific (not py3-none-any) and embed the proper Triton release version (instead of 0.0.0), with optional manylinux repair for PyPI distribution.

Changes:

  • Mark the wheel as non-pure by using a custom Distribution (has_ext_modules()=True) so setuptools emits an arch-specific platform tag.
  • Add wheel post-processing via auditwheel repair and introduce composed versioning with a +nv…cu… local version segment.
  • Switch pyproject.toml to resolve the wheel version dynamically from a TRITON_VERSION file at build time.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| python/setup.py | Forces setuptools to build a platform wheel by signaling the presence of extension modules. |
| python/build_wheel.py | Adds CUDA detection + version composition and runs auditwheel repair (with fallbacks) to produce manylinux wheels. |
| pyproject.toml | Uses setuptools dynamic versioning from TRITON_VERSION to avoid 0.0.0 wheels. |

Comment thread python/build_wheel.py
Comment thread python/build_wheel.py Outdated
Comment thread python/build_wheel.py
Comment on lines +199 to +221
arch = os.uname().machine
manylinux_tag = f"manylinux_2_28_{arch}"
print(
    f"=== Pure-Python wheel detected; falling back to wheel tags "
    f"({manylinux_tag})"
)
copied = os.path.join(dest_dir, os.path.basename(wheel_path))
shutil.copy(wheel_path, copied)
# `wheel tags --remove` replaces the linux_<arch> wheel in
# dest_dir with the correctly-tagged manylinux one.
r2 = subprocess.run(
    [
        "python3",
        "-m",
        "wheel",
        "tags",
        "--platform-tag",
        manylinux_tag,
        "--remove",
        copied,
    ]
)
fail_if(r2.returncode != 0, "wheel tags fallback failed")
interesting 🤔

Contributor Author

This fallback is meant to mitigate the current behavior, where the pure-Python tag (-any) is used for wheels that contain an extension.

# Current behaviour: 
Processing ./python/tritonserver-0.0.0-py3-none-any.whl (from tritonserver==0.0.0)
Processing ./python/tritonfrontend-2.69.0.dev0-py3-none-any.whl (from tritonfrontend==2.69.0.dev0)

It is hard to say what the right approach is here, but we fall back to a hardcoded manylinux_2_28 tag for those wheels because they contain a binary extension.

Comment thread python/build_wheel.py Outdated
mc-nv and others added 2 commits April 29, 2026 18:17
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@whoisj whoisj left a comment

LGTM, but I have a couple of questions before I approve.

Comment thread python/build_wheel.py Outdated
Comment thread python/build_wheel.py
Comment thread python/build_wheel.py
@mc-nv mc-nv requested a review from whoisj May 4, 2026 18:21
whoisj
whoisj previously approved these changes May 4, 2026
@whoisj whoisj left a comment

LGTM

@mc-nv mc-nv changed the title fix: tag tritonserver wheel with arch-specific platform tag (TRI-983) fix: tag tritonserver wheel with arch-specific platform tag May 4, 2026
yinggeh
yinggeh previously approved these changes May 5, 2026
@mc-nv mc-nv dismissed stale reviews from yinggeh and whoisj via 97e295f May 5, 2026 01:18
@mc-nv mc-nv requested a review from whoisj May 5, 2026 01:18
@mc-nv mc-nv requested a review from yinggeh May 5, 2026 01:18
@mc-nv mc-nv merged commit e6fcb0c into main May 5, 2026
2 checks passed
4 participants