
[XPU][NIXL] Support GPUDirect RDMA for KV cache transfers#3

Draft: Copilot wants to merge 3 commits into xpu_pd_2026 from copilot/sub-pr-2
Copilot AI commented Feb 25, 2026

Enables GPUDirect RDMA for Intel XPU devices in the NIXL KV transfer connector, allowing direct device-to-device KV cache transfers without staging through host memory.

Changes

  • nixl_connector.py: Add "xpu" to supported KV buffer devices so XPU device memory is used directly for NIXL transfers instead of CPU buffers.
  • vllm/platforms/xpu.py: Set UCX_MEMTYPE_CACHE=n when kv_transfer_config is enabled to prevent UCX from misdetecting XPU device memory as host memory. The setting is scoped to the KV transfer path only, so non-KV-transfer XPU workloads are unaffected.
  • tools/install_nixl_from_source_ubuntu.py: Pin UCX to commit e5d9887 and enable --with-ze at configure time. This is the first UCX revision with the Intel Level Zero GPU memory registration support that XPU GDR requires. The pinned hash is documented with guidance for future hash updates.
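
As a rough illustration of the scoping described for vllm/platforms/xpu.py, the sketch below sets UCX_MEMTYPE_CACHE=n only when a KV transfer config is present. The function name `apply_xpu_kv_transfer_env` and its parameters are hypothetical and are not taken from the PR diff; only the environment variable and its value come from the description above.

```python
import os
from typing import MutableMapping, Optional


def apply_xpu_kv_transfer_env(
    kv_transfer_config: Optional[object],
    env: Optional[MutableMapping[str, str]] = None,
) -> bool:
    """Set UCX_MEMTYPE_CACHE=n only when KV transfer is configured.

    Hypothetical helper for illustration; returns True if the variable was set.
    """
    # Default to the real process environment; a dict can be passed for testing.
    target = os.environ if env is None else env

    # No KV transfer config: leave the environment untouched so that
    # ordinary XPU workloads see no change.
    if kv_transfer_config is None:
        return False

    # Disable the UCX memory-type cache so XPU device memory is not
    # misdetected as host memory.
    target["UCX_MEMTYPE_CACHE"] = "n"
    return True


if __name__ == "__main__":
    demo_env: dict = {}
    print(apply_xpu_kv_transfer_env(None, demo_env), demo_env)
    print(apply_xpu_kv_transfer_env({"kv_connector": "NixlConnector"}, demo_env), demo_env)
```

Whether a value already exported by the user should take precedence is a design choice this sketch does not settle; it simply overwrites the variable.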

Performance

Llama3.3-70B int4, fp8 KV cache, 8×B60, ISL=1500, OSL=150 — under SLO (TTFT<5s, ITL<100ms):

| Config | Requests/sec |
|---|---|
| Non-PD (tp=2 × 3 round-robin) | 0.64 |
| 2P1D (NIXL XPU GDR) | 1.06 (1.65× improvement) |

[Figure: Throughput comparison]

Prerequisite

Requires UCX built from commit e5d9887 or later with --with-ze. The install script handles this automatically.
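
For setups that do not use the install script, a quick check along the following lines may help confirm the prerequisite. This is a minimal sketch that assumes `ucx_info -v` reports the configure options the library was built with; the helper names are hypothetical, and the check only looks for the `--with-ze` flag, not the commit hash.

```python
import shutil
import subprocess
from typing import Optional


def has_level_zero_flag(ucx_info_output: str) -> bool:
    """Return True if the text contains a --with-ze configure flag."""
    for token in ucx_info_output.split():
        # Accept both the bare flag and the --with-ze=<path> form.
        if token == "--with-ze" or token.startswith("--with-ze="):
            return True
    return False


def installed_ucx_has_level_zero() -> Optional[bool]:
    """Check the local UCX build; returns None if ucx_info is not on PATH."""
    exe = shutil.which("ucx_info")
    if exe is None:
        return None
    # Assumption: the -v output includes the configure options of the build.
    result = subprocess.run([exe, "-v"], capture_output=True, text=True, check=False)
    return has_level_zero_flag(result.stdout)


if __name__ == "__main__":
    print("UCX built with --with-ze:", installed_ucx_has_level_zero())
```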



zhenwei-intel and others added 2 commits February 24, 2026 23:15
Signed-off-by: zhenwei-intel <zhenwei.liu@intel.com>
…X commit comment

Co-authored-by: zhenwei-intel <109187816+zhenwei-intel@users.noreply.github.com>
Copilot AI changed the title [WIP] Support GPUDirect RDMA in NIXL on XPU [XPU][NIXL] Support GPUDirect RDMA for KV cache transfers Feb 25, 2026
Copilot AI requested a review from zhenwei-intel February 25, 2026 07:37