fix(wheel): add platform tag, version, and PYPI_RELEASE support to tritonserver wheel #494
Closed
mc-nv wants to merge 8 commits into
Conversation
The tritonserver wheel was shipping as
"tritonserver-0.0.0-py3-none-any.whl": version fell back to the
setuptools 0.0.0 default because `dynamic = ["version"]` had no
resolver configured, and the platform tag was "none-any" despite the
wheel containing an arch-specific CPython extension
(tritonserver/_c/triton_bindings.*.so).
Changes:
- pyproject.toml: add [tool.setuptools.dynamic] version = {file =
"TRITON_VERSION"} so the declared dynamic version resolves from the
Triton release file already staged by the wheel build.
- python/setup.py: add a bdist_wheel override that sets
root_is_pure = False so setuptools tags the wheel with the current
build platform (e.g. linux_x86_64 / linux_aarch64) via its normal
auto-detection.
- python/build_wheel.py: copy TRITON_VERSION into the wheel build
directory so the dynamic-file version lookup succeeds at build time.
Refs: NVBug 6098081, JIRA DLIS-8648, Linear TRI-983
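A minimal sketch of the `python/setup.py` override described in the change list above (the class name `PlatformWheel` is illustrative, not the PR's actual code); note that a later commit in this PR replaces this hook, because modern setuptools ignores overrides registered against `wheel.bdist_wheel`:

```python
from setuptools import setup
from wheel.bdist_wheel import bdist_wheel


class PlatformWheel(bdist_wheel):
    def finalize_options(self):
        super().finalize_options()
        # Mark the wheel as non-pure so the platform tag becomes the
        # current build platform (e.g. linux_x86_64 / linux_aarch64).
        self.root_is_pure = False


setup(cmdclass={"bdist_wheel": PlatformWheel})
```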
Raw linux_x86_64 / linux_aarch64 wheels are not accepted by canonical PyPI; the platform tag must be manylinux_2_X_<arch>. Port the pattern established for tritonclient in TRI-286: after the wheel is built, run `auditwheel repair` to auto-discover the minimum manylinux tag from the glibc symbol dependencies of the embedded triton_bindings.*.so, with a `python -m wheel tags --platform-tag manylinux_2_28_<arch>` fallback for the "no ELF" pure-Python case. When auditwheel is not available on PATH (e.g. local non-container builds), keep the linux_<arch> wheel and log a warning so builds do not regress; the Poetry / pip-tools lock-file problem is already solved by the distinct filename.
Leave a NOTE in setup.py: the embedded binding .so is CPython-ABI-specific, so the wheel will need cp<XY>-cp<XY> python+abi tags once consumers are ready to gate installs on the exact interpreter version.
Refs: NVBug 6098081, JIRA DLIS-8648, Linear TRI-983, TRI-286
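A sketch of that repair flow, assuming a helper name (`_repair_wheel`) and error-string match of my own choosing; the real build_wheel.py wiring may differ:

```python
import platform
import shutil
import subprocess
import sys
from pathlib import Path


def _repair_wheel(wheel: Path, out_dir: Path) -> Path:
    """Retag a linux_<arch> wheel as manylinux via auditwheel repair."""
    if shutil.which("auditwheel") is None:
        # Local non-container build: keep the linux_<arch> wheel and warn.
        print(f"WARNING: auditwheel not on PATH; keeping {wheel.name}",
              file=sys.stderr)
        return wheel
    result = subprocess.run(
        ["auditwheel", "repair", str(wheel), "-w", str(out_dir)],
        capture_output=True, text=True)
    if result.returncode != 0:
        if "no ELF" in result.stderr:
            # Pure-Python wheel: fall back to a fixed manylinux tag.
            tag = f"manylinux_2_28_{platform.machine()}"
            subprocess.run([sys.executable, "-m", "wheel", "tags",
                            "--platform-tag", tag, str(wheel)], check=True)
            return wheel
        result.check_returncode()  # re-raise the auditwheel failure
    return next(out_dir.glob("*.whl"))
```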
Adopt PEP 427's optional build-tag slot so two wheels of the same version (e.g. successive reruns of a CI pipeline) can coexist in the same index without filename collision. The preferred source is GitLab's CI_PIPELINE_ID, with a BUILD_NUMBER fallback for other CI systems; both are guaranteed to start with a digit as required by PEP 427. The value is forwarded through `python -m build` to the setuptools backend's `bdist_wheel --build=<N>` (alias for --build-number) via the PEP 517 `-C--build-option=` config setting. This matches the build-tag slot already used by the RHEL .zip artifact naming convention in .gitlab-ci.yml.
Build-arg handoff through build.py is a separate follow-up; this change is a no-op in local non-CI builds since neither env var is set.
Refs: NVBug 6098081, JIRA DLIS-8648, Linear TRI-983
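Sketched below under the assumption that build_wheel.py invokes `python -m build` via subprocess; the working-directory name is hypothetical:

```python
import os
import subprocess
import sys

build_tag = os.environ.get("CI_PIPELINE_ID") or os.environ.get("BUILD_NUMBER")
cmd = [sys.executable, "-m", "build", "--wheel"]
if build_tag:
    # PEP 517 config setting handed to the setuptools backend, which
    # forwards the value to bdist_wheel's build-number option (the
    # commit message refers to it as --build=<N>).
    cmd.append(f"-C--build-option=--build-number={build_tag}")
# No-op locally: neither env var is set, so no build tag is appended.
subprocess.run(cmd, check=True, cwd="wheel_build_dir")  # hypothetical dir
```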
Add a _compose_version() helper in build_wheel.py that appends a PEP 440 local-version segment to the base TRITON_VERSION when the NVIDIA_UPSTREAM_VERSION and/or CUDA_VERSION env vars are set. This makes the wheel filename carry the same nv<X>.cu<Y> identifiers already used by the RHEL .zip artifact naming in .gitlab-ci.yml:
tritonserver-<TRITON_VERSION>+nv<NV>.cu<CUDA>-<CI_PIPELINE_ID>-cp<XY>-cp<XY>-manylinux_2_28_<arch>.whl
Local-version segments are informational and do not affect pip version comparison, so existing pins like `tritonserver==2.69.0` continue to match. The helper is a no-op when neither env var is set (local non-CI builds). Env-var propagation from the CI runner into the build container is handled by the paired server PR's change to build.py's docker-run invocation.
Refs: NVBug 6098081, JIRA DLIS-8648, Linear TRI-983, TRI-286
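A minimal sketch of the helper, grounded in the env-var names above; the exact string formatting is an assumption based on the commit's nv<X>.cu<Y> convention:

```python
import os


def _compose_version(base: str) -> str:
    """Append a PEP 440 local-version segment, e.g. 2.69.0+nv25.01.cu12.8."""
    parts = []
    nv = os.environ.get("NVIDIA_UPSTREAM_VERSION")
    cuda = os.environ.get("CUDA_VERSION")
    if nv:
        parts.append(f"nv{nv}")
    if cuda:
        parts.append(f"cu{cuda}")
    if not parts:  # local non-CI build: no-op
        return base
    return f"{base}+{'.'.join(parts)}"
```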
…TRI-983)
Mirror the server-side refinement for the tritonserver wheel:
1. Use NVIDIA_BUILD_ID (from --build-id) as the PEP 427 build-tag source instead of a separate CI_PIPELINE_ID / BUILD_NUMBER env var, aligning the wheel filename with the existing Triton convention already used throughout the build system.
2. Detect CUDA_VERSION from the container-local env, with a /usr/local/cuda/version.json fallback (the canonical location for the installed toolkit), since CUDA_VERSION cannot be propagated from the host; only the container has it set.
Refs: NVBug 6098081, JIRA DLIS-8648, Linear TRI-983
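A sketch of the detection order described in point 2; the version.json key path (`{"cuda": {"version": ...}}`) follows the toolkit's file layout but should be treated as an assumption:

```python
import json
import os
from pathlib import Path


def _detect_cuda_version() -> str | None:
    # Container-local env takes precedence; it cannot be forwarded
    # from the host, only the container has it set.
    if os.environ.get("CUDA_VERSION"):
        return os.environ["CUDA_VERSION"]
    version_file = Path("/usr/local/cuda/version.json")
    if version_file.is_file():
        with version_file.open() as f:
            return json.load(f)["cuda"]["version"]
    return None
```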
Pipeline 49141836 / job 302710609 failed with:
auditwheel.main_repair: Invalid binary wheel, found the following
shared library/libraries in purelib folder:
triton_bindings.cpython-312-x86_64-linux-gnu.so
The wheel has to be platlib compliant in order to be repaired by
auditwheel.
The wheel's filename was correctly tagged
(tritonserver-<ver>-<build>-cp312-cp312-linux_x86_64.whl), but the
WHEEL metadata inside still declared Root-Is-Purelib: true, so
auditwheel rejected the repair because finding a .so in the purelib
tree is an inconsistent state.
Root cause: the previous bdist_wheel override (via the `wheel`
package) set `self.root_is_pure = False`, but modern setuptools
(>=70) provides its own setuptools.command.bdist_wheel and ignores
overrides registered against wheel.bdist_wheel. The platform-tag
part still worked because it is derived from the `has_ext_modules`
check on the Distribution, not from root_is_pure — but the
Root-Is-Purelib metadata flag was never flipped.
Fix: subclass Distribution and override has_ext_modules() to return
True. This is the canonical way to tell setuptools the wheel is
binary without having to declare a dummy ext_module or trigger a
compilation step. setuptools then:
- sets WHEEL Root-Is-Purelib: false (required for auditwheel repair),
- auto-derives cp<XY>-cp<XY>-linux_<arch> tags from the current
interpreter and sysconfig.get_platform(), matching the filename
we already wanted.
Drop the now-unused wheel.bdist_wheel override block and register
the BinaryDistribution via setup(distclass=...).
Refs: NVBug 6098081, JIRA DLIS-8648, Linear TRI-983
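The fix, condensed to a sketch (the surrounding setup() arguments are omitted):

```python
from setuptools import Distribution, setup


class BinaryDistribution(Distribution):
    def has_ext_modules(self) -> bool:
        # The bundled triton_bindings.*.so makes this a binary wheel
        # even though nothing is compiled during the wheel build.
        # setuptools then sets WHEEL Root-Is-Purelib: false (required
        # by auditwheel repair) and derives cp<XY>-cp<XY>-linux_<arch>
        # tags from the interpreter and sysconfig.get_platform().
        return True


setup(distclass=BinaryDistribution)
```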
Mirror the tritonfrontend setup.py fix: `_compose_version()` now consults three env sources for the +nv<X> local-version segment (NVIDIA_UPSTREAM_VERSION, NVIDIA_TRITON_SERVER_VERSION, TRITON_CONTAINER_VERSION) and prints the resolved inputs to stderr, so any future gap in the env propagation chain is self-announcing in the wheel-build log rather than silently producing a wheel without the expected local-version suffix.
NVIDIA_TRITON_SERVER_VERSION and TRITON_CONTAINER_VERSION are set as ENV in the buildbase image itself (via the TRITON_CONTAINER_VERSION ARG -> ENV wiring), so they survive even when the docker-run `-e NVIDIA_UPSTREAM_VERSION=<value>` forwarding does not reach the container (e.g. when FLAGS.upstream_container_version evaluates to empty on the host).
Refs: Linear TRI-983
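A sketch of the three-source lookup with the stderr diagnostic, assuming a helper name of my own choosing:

```python
import os
import sys

_NV_VERSION_VARS = ("NVIDIA_UPSTREAM_VERSION",
                    "NVIDIA_TRITON_SERVER_VERSION",
                    "TRITON_CONTAINER_VERSION")


def _nv_upstream_version() -> str | None:
    values = {var: os.environ.get(var) for var in _NV_VERSION_VARS}
    # Self-announcing diagnostic: any gap in env propagation shows up
    # in the wheel-build log instead of silently dropping +nv<X>.
    print(f"_compose_version inputs: {values}", file=sys.stderr)
    for var in _NV_VERSION_VARS:
        if values[var]:
            return values[var]
    return None
```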
Mirror the tritonfrontend change in core/python/build_wheel.py: prefer CI_PIPELINE_ID over NVIDIA_BUILD_ID, falling back to BUILD_NUMBER. Filter out the "<unknown>" default that build.py emits for local builds without --build-id, and add a stderr diagnostic.
Refs: Linear TRI-983
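Sketch of the resulting preference order (helper name illustrative):

```python
import os


def _build_tag() -> str | None:
    # Preference order: CI_PIPELINE_ID, then NVIDIA_BUILD_ID, then
    # BUILD_NUMBER; drop the "<unknown>" default emitted by build.py
    # for local builds without --build-id.
    for var in ("CI_PIPELINE_ID", "NVIDIA_BUILD_ID", "BUILD_NUMBER"):
        value = os.environ.get(var)
        if value and value != "<unknown>":
            return value
    return None
```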
What does the PR do?
Fixes the `tritonserver` Python wheel so it carries the correct platform tag, version, and CUDA/container-release metadata in its filename.
Previously, `python -m build` read the version from the `TRITON_VERSION` file via `pyproject.toml`, ignoring the `VERSION` env var, so the composed `+nv<X>.cu<Y>` local-version suffix was silently dropped. This patch overwrites the copied `TRITON_VERSION` file in the wheel build root with the composed version before invoking `python -m build`.
Changes:
- Write the composed version (with the `+nv<upstream>.cu<cuda>` local-version suffix) to `TRITON_VERSION` in the wheel build dir before `python -m build`, so `pyproject.toml`'s `version = {file = "TRITON_VERSION"}` picks it up; see the sketch below.
- Add `PYPI_RELEASE` env-var support: when `true`, strip the local-version suffix and omit the build tag for PyPI-compatible wheel names.
- Use `CI_PIPELINE_ID` → `NVIDIA_BUILD_ID` → `BUILD_NUMBER` as the PEP 427 build tag.
- `_detect_cuda_version()` reads the `CUDA_VERSION` env var or `/usr/local/cuda/version.json`.
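A minimal sketch of the overwrite plus `PYPI_RELEASE` handling described above; the helper name and the exact truthiness check are assumptions:

```python
import os
from pathlib import Path


def _write_version_file(build_dir: Path, composed_version: str) -> None:
    if os.environ.get("PYPI_RELEASE", "").lower() == "true":
        # PyPI does not accept local-version segments, so strip the
        # +nv<X>.cu<Y> suffix for release builds.
        composed_version = composed_version.split("+", 1)[0]
    # Overwrite the staged file so pyproject.toml's dynamic
    # version = {file = "TRITON_VERSION"} resolves the composed value.
    (build_dir / "TRITON_VERSION").write_text(composed_version + "\n")
```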
Related PRs:
triton-inference-server/server#8752
dl/dgx/tritonserver!1745
Where should the reviewer start?
`python/build_wheel.py`: the version-composition logic and the `TRITON_VERSION` overwrite before `python -m build`.
Test plan:
- CI Pipeline ID: 49163608
- CI (internal): [#49163608](http://tritonserver.local/ci/pipelines/49163608)
Caveats:
`PYPI_RELEASE=true` produces PyPI-compatible names; wheels are not yet published to PyPI (tracked separately).
Background
`pyproject.toml` declares `version = {file = "TRITON_VERSION"}`. The `VERSION` env var set for the `setup.py` path has no effect on `python -m build`.
Related Issues: NVBug 6098081, JIRA DLIS-8648, Linear TRI-983