[XPU] Upgrade NIXL to remove CUDA dependency#1
Conversation
|
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run You ask your reviewers to trigger select CI tests on top of Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either: Add If you have any questions, please reach out to us on Slack at https://slack.vllm.ai. 🚀 |
There was a problem hiding this comment.
Code Review
This pull request adds the installation of nixl from source to the XPU Dockerfile. The changes are mostly correct, but I've identified a high-severity issue in docker/Dockerfile.xpu where a hardcoded path is used for LD_LIBRARY_PATH. This will make the Docker image difficult to maintain and prone to breaking when the Python version is updated. My review includes a detailed comment on how to address this.
|
|
||
| # install nixl from source code | ||
| RUN python3 /workspace/vllm/tools/install_nixl_from_source_ubuntu.py | ||
| ENV LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/lib/python3.12/dist-packages/.nixl.mesonpy.libs/plugins/" |
There was a problem hiding this comment.
The hardcoded Python version (python3.12) in the LD_LIBRARY_PATH makes this Dockerfile brittle. If the Python version is updated in this file (e.g., on line 20), this path will also need to be manually updated. Forgetting to do so will cause runtime errors because the nixl plugin will not be found.
To make this more robust, I recommend determining the path dynamically. A good way to achieve this is by using a shell-based ENTRYPOINT that computes the path and sets the environment variable before executing vllm serve.
For example, you could replace this ENV line and the existing ENTRYPOINT on line 77 with a dynamic entrypoint:
# (remove the ENV line for LD_LIBRARY_PATH)
...
ENTRYPOINT ["/bin/bash", "-c", "LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$(python3 -c 'import site; print(site.getsitepackages()[0])')/.nixl.mesonpy.libs/plugins/ exec vllm serve \"$@\"", "vllm-serve"]This approach avoids hardcoding the Python version and makes the Docker image more maintainable.
Signed-off-by: zhenwei-intel <zhenwei.liu@intel.com>
fb1b789 to
edd75de
Compare
Signed-off-by: zhenwei-intel <zhenwei.liu@intel.com>
Signed-off-by: zhenwei-intel <zhenwei.liu@intel.com>
zhenwei-intel
left a comment
There was a problem hiding this comment.
I added two comments, pls take a look
| pytest -v -s v1/structured_output | ||
| pytest -v -s v1/spec_decode --ignore=v1/spec_decode/test_max_len.py --ignore=v1/spec_decode/test_tree_attention.py | ||
| pytest -v -s v1/kv_connector/unit --ignore=v1/kv_connector/unit/test_multi_connector.py --ignore=v1/kv_connector/unit/test_nixl_connector.py --ignore=v1/kv_connector/unit/test_shared_storage_connector.py | ||
| pytest -v -s v1/test_metrics |
| # install development dependencies (for testing) | ||
| RUN python3 -m pip install -e tests/vllm_test_utils | ||
|
|
||
| # install nixl from source code |
There was a problem hiding this comment.
| # install nixl from source code | |
| # Install NIXL from source code |
test2
Update Dockerfile.xpu to build and install NIXL from source code, eliminating the libcuda dependency for better compatibility with XPU environments.