build(docker): reorg into docker/ and add cross-arch wheel builder#392
build(docker): reorg into docker/ and add cross-arch wheel builder#392saturley-hall wants to merge 8 commits into
Conversation
When set, setup.py returns an empty ext_modules so bdist_wheel produces a genuine py3-none-any wheel instead of a platform-tagged wheel containing no compiled code. Useful for producing a single cross-platform artifact when the VMM arena fast path is not required. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Move root Dockerfile and docker-compose.yml under docker/ so the
container assets live together. Update docker-compose context to ".."
and refresh README, CONTRIBUTING, docs/DEPLOYMENT and docs/ARCHITECTURE
for the new paths.
Add docker/Dockerfile.client-wheel: multi-arch (linux/amd64,
linux/arm64) builder on quay.io/pypa/manylinux_2_28_{x86_64,aarch64}
that produces, per target platform:
- four manylinux_2_28 wheels (cp310, cp311, cp312, cp313), each with
the modelexpress.vmm._alloc_ext .so compiled against the matching
CPython ABI and hardened with auditwheel repair;
- one py3-none-any pure-Python wheel built with MX_SKIP_EXT=1 (no
compiled extension);
- one sdist tarball.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
WalkthroughDocker configuration files move into a dedicated ChangesDocker Infrastructure Reorganization and Python Wheel Building
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docker/Dockerfile.client-wheel`:
- Around line 36-37: Add pinned packaging-tool ARGs and use them in the pip
install invocation so wheel outputs are reproducible; declare build-time args
like PIP_VERSION, BUILD_VERSION, and AUDITWHEEL_VERSION with sensible defaults
near the top of the Dockerfile, then replace the current "$PYBIN/pip" install
--upgrade pip build auditwheel call inside the client-wheel build stage to
install exact versions (e.g., pip==${PIP_VERSION}, build==${BUILD_VERSION},
auditwheel==${AUDITWHEEL_VERSION}) so the toolchain cannot drift or fail between
builds.
In `@docs/ARCHITECTURE.md`:
- Around line 67-70: Replace the ASCII repository tree block in ARCHITECTURE.md
with a Mermaid diagram: remove the 3-line ASCII snippet starting with "├──
docker/" and instead add a Mermaid code block (```mermaid```) that models the
same structure (root node "docker" with child nodes "Dockerfile",
"Dockerfile.client-wheel", and "docker-compose.yml") using a directed/top-down
graph (e.g., graph TD) and include brief labels matching the original comments;
ensure the new block follows markdown fenced code block syntax and adheres to
the repo's mermaid diagram conventions.
In `@docs/DEPLOYMENT.md`:
- Around line 234-236: The docs currently build the default (scratch) final
stage from docker/Dockerfile.client-wheel, so docker run mx-wheel-builder cp ...
fails because scratch has no cp; update the example to build the stage that
contains /dist by adding --target=export (or the appropriate stage name used in
docker/Dockerfile.client-wheel) to the docker build command (e.g., docker build
--target=export -f docker/Dockerfile.client-wheel -t mx-wheel-builder .) so the
subsequent docker run cp -r /dist/. /out/ works.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 69039901-6d0e-4d9b-9468-4623f6e5a632
📒 Files selected for processing (8)
CONTRIBUTING.mdREADME.mddocker/Dockerfiledocker/Dockerfile.client-wheeldocker/docker-compose.ymldocs/ARCHITECTURE.mddocs/DEPLOYMENT.mdmodelexpress_client/python/setup.py
| "$PYBIN/pip" install --upgrade pip build auditwheel; \ | ||
| "$PYBIN/python" -m build --wheel --outdir /tmp/raw .; \ |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify whether packaging tools are currently unpinned in the wheel builder Dockerfile.
rg -n 'pip" install --upgrade' docker/Dockerfile.client-wheel
rg -n 'pip==|build==|auditwheel==' docker/Dockerfile.client-wheelRepository: ai-dynamo/modelexpress
Length of output: 132
🏁 Script executed:
#!/bin/bash
set -euo pipefail
nl -ba docker/Dockerfile.client-wheel | sed -n '1,120p'Repository: ai-dynamo/modelexpress
Length of output: 109
🏁 Script executed:
#!/bin/bash
set -euo pipefail
cat -n docker/Dockerfile.client-wheel | sed -n '1,140p'Repository: ai-dynamo/modelexpress
Length of output: 2143
Pin packaging tool versions in docker/Dockerfile.client-wheel to keep wheel outputs reproducible.
docker/Dockerfile.client-wheel line 36 upgrades pip, build, and auditwheel with no version pins, so the toolchain can drift between builds and change wheel metadata/repair behavior. Ensure the pinned ARGs are set (or provide defaults) so the Docker build can’t fail.
Proposed fix
+ARG PIP_VERSION
+ARG BUILD_VERSION
+ARG AUDITWHEEL_VERSION
@@
- "$PYBIN/pip" install --upgrade pip build auditwheel; \
+ "$PYBIN/pip" install --upgrade \
+ "pip==${PIP_VERSION}" \
+ "build==${BUILD_VERSION}" \
+ "auditwheel==${AUDITWHEEL_VERSION}"; \🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@docker/Dockerfile.client-wheel` around lines 36 - 37, Add pinned
packaging-tool ARGs and use them in the pip install invocation so wheel outputs
are reproducible; declare build-time args like PIP_VERSION, BUILD_VERSION, and
AUDITWHEEL_VERSION with sensible defaults near the top of the Dockerfile, then
replace the current "$PYBIN/pip" install --upgrade pip build auditwheel call
inside the client-wheel build stage to install exact versions (e.g.,
pip==${PIP_VERSION}, build==${BUILD_VERSION}, auditwheel==${AUDITWHEEL_VERSION})
so the toolchain cannot drift or fail between builds.
| ├── docker/ | ||
| │ ├── Dockerfile # Multi-stage production image | ||
| │ ├── Dockerfile.client-wheel # Builds Python client wheels + sdist | ||
| │ └── docker-compose.yml # Single-service dev setup |
There was a problem hiding this comment.
Replace the repository tree ASCII block with a Mermaid diagram.
These lines extend an ASCII tree in markdown; this should be represented as Mermaid to meet repo docs standards.
As per coding guidelines, **/*.md: Use mermaid diagrams instead of ASCII art in markdown files.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@docs/ARCHITECTURE.md` around lines 67 - 70, Replace the ASCII repository tree
block in ARCHITECTURE.md with a Mermaid diagram: remove the 3-line ASCII snippet
starting with "├── docker/" and instead add a Mermaid code block (```mermaid```)
that models the same structure (root node "docker" with child nodes
"Dockerfile", "Dockerfile.client-wheel", and "docker-compose.yml") using a
directed/top-down graph (e.g., graph TD) and include brief labels matching the
original comments; ensure the new block follows markdown fenced code block
syntax and adheres to the repo's mermaid diagram conventions.
| docker build -f docker/Dockerfile.client-wheel -t mx-wheel-builder . | ||
| docker run --rm -v "$PWD/dist:/out" mx-wheel-builder cp -r /dist/. /out/ | ||
| ``` |
There was a problem hiding this comment.
The non-buildx example will fail with the default export stage image.
At Line 234 the build targets the Dockerfile default final stage (FROM scratch in docker/Dockerfile.client-wheel), so Line 235’s docker run ... cp ... has no cp binary to execute.
Proposed doc fix
-docker build -f docker/Dockerfile.client-wheel -t mx-wheel-builder .
-docker run --rm -v "$PWD/dist:/out" mx-wheel-builder cp -r /dist/. /out/
+docker build -f docker/Dockerfile.client-wheel --target builder -t mx-wheel-builder .
+docker run --rm -v "$PWD/dist:/out" mx-wheel-builder bash -lc 'cp -r /dist/. /out/'📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| docker build -f docker/Dockerfile.client-wheel -t mx-wheel-builder . | |
| docker run --rm -v "$PWD/dist:/out" mx-wheel-builder cp -r /dist/. /out/ | |
| ``` | |
| docker build -f docker/Dockerfile.client-wheel --target builder -t mx-wheel-builder . | |
| docker run --rm -v "$PWD/dist:/out" mx-wheel-builder bash -lc 'cp -r /dist/. /out/' |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@docs/DEPLOYMENT.md` around lines 234 - 236, The docs currently build the
default (scratch) final stage from docker/Dockerfile.client-wheel, so docker run
mx-wheel-builder cp ... fails because scratch has no cp; update the example to
build the stage that contains /dist by adding --target=export (or the
appropriate stage name used in docker/Dockerfile.client-wheel) to the docker
build command (e.g., docker build --target=export -f
docker/Dockerfile.client-wheel -t mx-wheel-builder .) so the subsequent docker
run cp -r /dist/. /out/ works.
After moving the root Dockerfile under docker/, four call sites still relied on the default ./Dockerfile lookup and broke: - .github/workflows/ci.yml (Docker Build job) - test_grpc_transfer_k8s.sh (Kubernetes gRPC Transfer Test CI job) - test_multinode_k8s.sh - examples/dynamo_model_cache_k8s/README.md Add an explicit -f docker/Dockerfile to each. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Swap python -m build for uv build in docker/Dockerfile.client-wheel. uv handles PEP 517 build isolation natively, so the per-Python loop no longer has to pip-install build/pip itself into each CPython; auditwheel is installed once into cp313 and reused. Custom BuildExtension in setup.py still fires under setuptools.build_meta, so MX_SKIP_EXT continues to work. Also strip uv's auto-generated /dist/.gitignore from the final scratch export so it doesn't leak into the published artifact set. Verified locally on linux/arm64: same 6 artifacts (4 manylinux wheels cp310-cp313, 1 py3-none-any wheel, 1 sdist), byte-identical sizes to the previous python -m build output. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
Add .github/workflows/build-wheels.yml: a self-contained workflow that
invokes docker/Dockerfile.client-wheel on every PR and every push to
main / release/** branches, building both archs in parallel on velonix
runners (prod-modelexpress-builder-{amd,arm}-v1) and uploading the
resulting 6 artifacts per arch (4 manylinux wheels cp310-cp313, 1
py3-none-any wheel, 1 sdist) to NV Artifactory.
Destination layout:
pull_request -> pr/<pr_id>/<sha>/<run_id>/<run_attempt>/<arch>/
push (main/release) -> post-merge/<sha>/<run_id>/<run_attempt>/<arch>/
Upload step is gated on the release_automation GitHub environment, which
holds three secrets: ARTIFACTORY_URL, ARTIFACTORY_TOKEN (JFrog identity
token; no separate user needed), and ARTIFACTORY_REPO_NAME. Concurrency
cancels superseded PR runs but never cancels post-merge runs.
Runs in parallel to (does not gate on) the existing CI jobs in ci.yml
and modelexpress-ci-tests.yml.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…EPO_NAME Match the conventional name for the PyPI-flavored Artifactory repo. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Match the existing org-wide environment name (aiperf uses the same). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Switch from pull_request to push: branches: [pull-request/*, main, release/**]. copy-pr-bot mirrors contributor PRs into the pull-request/<pr_id> branch namespace inside this repo, which lets the workflow access automated-release environment secrets that the pull_request event from a fork cannot reach. The subpath logic now reads the PR id from GITHUB_REF_NAME (stripping the pull-request/ prefix) instead of github.event.pull_request.number. Concurrency cancels superseded runs on pull-request/* branches but never on main / release/**. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Summary
docker/Dockerfile.client-wheel— new multi-arch (linux/amd64,linux/arm64) builder onquay.io/pypa/manylinux_2_28_{x86_64,aarch64}. Per target platform it produces:modelexpress.vmm._alloc_extagainst the matching CPython ABI and hardened withauditwheel repair;py3-none-anypure-Python wheel built withMX_SKIP_EXT=1;setup.py— newMX_SKIP_EXTenv knob: when set,_ext_modules()returns[]sobdist_wheelproduces a truepy3-none-anywheel instead of a platform-tagged wheel that just happens to contain no compiled code.Dockerfile→docker/Dockerfile,docker-compose.yml→docker/docker-compose.yml(context bumped to..). README, CONTRIBUTING,docs/DEPLOYMENT.md,docs/ARCHITECTURE.mdupdated to match.Motivation: the wheel currently on artifactory is tagged
linux_x86_64(no manylinux), only cp313, and the source pipeline for it lives outside this repo. This commit gives us an in-tree, reproducible cross-arch wheel builder that emits properly tagged manylinux wheels plus a pure-Python fallback and an sdist.Test plan
docker buildx build --platform linux/arm64 -f docker/Dockerfile.client-wheel --target export --output type=local,dest=./dist .→ 6 artifacts (4× cp3{10..13} manylinux_2_24+manylinux_2_28_aarch64 wheels, 1× py3-none-any, 1× sdist) verified locally.docker buildx build --platform linux/amd64 -f docker/Dockerfile.client-wheel --target export --output type=local,dest=./dist .→ same 6 artifacts for x86_64, verified locally (under QEMU on an Apple Silicon host).py3-none-anywheel contains no.so(only_alloc_ext.cppsource); manylinux wheels contain the compiled_alloc_ext.cpython-3XX-{aarch64,x86_64}-linux-gnu.so.docker compose -f docker/docker-compose.yml up --build(smoke test for the moved compose file).🤖 Generated with Claude Code
Summary by CodeRabbit
Documentation
Chores
docker/directory with updated configuration references.