[XPU][NIXL] Add GPUDirect RDMA support for XPU by zhenwei-intel · Pull Request #2 · zhenwei-intel/vllm

zhenwei-intel · 2026-02-25T07:24:05Z

Purpose

Add GPUDirect RDMA support for XPU in NIXL connector.

Requirements:

UCX must include the fix from UCT/IB/ZE: Enable GPUDirect RDMA for Intel Xe devices openucx/ucx#11187.

Limitations:

UCX_NET_DEVICES must be set manually for better performance until UCT/ZE: Add device topology registration openucx/ucx#11180 is merged.
The current ze-ipc implementation has a known performance issue.

Test Plan

Performance data of Llama3.3-70B int4 model with fp8 kvcache on 8xB60, ISL=1500, OSL=150
2P1D vs Non-PD under SLO TTFT<5s, ITL<100ms

Serves more requests: under the SLO, 2P1D achieved a request throughput of 1.06, compared to 0.64 for Non-PD, a 1.65x improvement.
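The improvement figure follows directly from the two throughput numbers quoted above. A minimal check, using only the values stated in this description (the variable names are illustrative):

```python
# Request throughput under the SLO (TTFT < 5 s, ITL < 100 ms), as quoted above.
throughput_2p1d = 1.06
throughput_non_pd = 0.64

# Ratio of the disaggregated (2P1D) setup to the non-disaggregated baseline.
speedup = throughput_2p1d / throughput_non_pd
print(f"2P1D vs Non-PD speedup: {speedup:.3f}x")
```

The ratio comes out at roughly 1.656, which the description reports as 1.65x.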

PD commands

prefill

export UCX_TLS=ib,rc,ze_copy

export ZE_AFFINITY_MASK=2,3
export model_name=ibnzterrell/Meta-Llama-3.3-70B-Instruct-AWQ-INT4
export tp_size=2


VLLM_USE_V1=1 VLLM_NIXL_SIDE_CHANNEL_HOST=localhost VLLM_NIXL_SIDE_CHANNEL_PORT=5577 VLLM_WORKER_MULTIPROC_METHOD=spawn VLLM_ENABLE_V1_MULTIPROCESSING=1 vllm serve $model_name -tp $tp_size --host localhost --port 7101 --seed 42 --enforce-eager --dtype float16 --gpu-memory-utilization 0.9 --kv-transfer-config '{"kv_connector":"NixlConnector","kv_role":"kv_both","kv_buffer_device":"xpu"}' --max-model-len 8192 --block-size 64 --no-enable-prefix-caching --kv-cache-dtype fp8

prefill2

export UCX_TLS=ib,rc,ze_copy

export ZE_AFFINITY_MASK=4,5
export model_name=ibnzterrell/Meta-Llama-3.3-70B-Instruct-AWQ-INT4
export tp_size=2


VLLM_USE_V1=1 VLLM_NIXL_SIDE_CHANNEL_HOST=localhost VLLM_NIXL_SIDE_CHANNEL_PORT=5377 VLLM_WORKER_MULTIPROC_METHOD=spawn VLLM_ENABLE_V1_MULTIPROCESSING=1 vllm serve $model_name -tp $tp_size --host localhost --port 7102 --seed 42 --enforce-eager --dtype float16 --gpu-memory-utilization 0.9 --kv-transfer-config '{"kv_connector":"NixlConnector","kv_role":"kv_both","kv_buffer_device":"xpu"}' --max-model-len 8192 --block-size 64 --no-enable-prefix-caching --kv-cache-dtype fp8

decode

export UCX_TLS=ib,rc,ze_copy

export ZE_AFFINITY_MASK=0,1
export model_name=ibnzterrell/Meta-Llama-3.3-70B-Instruct-AWQ-INT4
export tp_size=2


VLLM_USE_V1=1 VLLM_NIXL_SIDE_CHANNEL_HOST=localhost VLLM_NIXL_SIDE_CHANNEL_PORT=5177 VLLM_WORKER_MULTIPROC_METHOD=spawn VLLM_ENABLE_V1_MULTIPROCESSING=1 vllm serve $model_name -tp $tp_size --host localhost --port 7201 --seed 42 --enforce-eager --dtype float16 --gpu-memory-utilization 0.9 --kv-transfer-config '{"kv_connector":"NixlConnector","kv_role":"kv_both","kv_buffer_device":"xpu"}' --max-model-len 8192 --block-size 64 --no-enable-prefix-caching --kv-cache-dtype fp8

proxy

python3 tests/v1/kv_connector/nixl_integration/toy_proxy_server.py --prefiller-hosts localhost  localhost --prefiller-ports 7101 7102 --decoder-host localhost --decoder-port 7201 --host localhost --port 7300
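The three server commands above differ only in the device mask, the NIXL side-channel port, and the HTTP port. As a sketch of how they could be generated from one table, the helper below rebuilds them in Python. `build_instance`, `INSTANCES`, and the optional `net_devices` parameter are illustrative names, not part of vLLM; every flag and value is copied from the commands above.

```python
import shlex

MODEL = "ibnzterrell/Meta-Llama-3.3-70B-Instruct-AWQ-INT4"
KV_TRANSFER_CONFIG = (
    '{"kv_connector":"NixlConnector","kv_role":"kv_both","kv_buffer_device":"xpu"}'
)

# role -> (ZE_AFFINITY_MASK, NIXL side-channel port, HTTP port), from the commands above.
INSTANCES = {
    "prefill": ("2,3", 5577, 7101),
    "prefill2": ("4,5", 5377, 7102),
    "decode": ("0,1", 5177, 7201),
}


def build_instance(role, tp_size=2, net_devices=None):
    """Return (env, argv) for one vLLM server in the 2P1D setup (illustrative helper)."""
    mask, side_channel_port, http_port = INSTANCES[role]
    env = {
        "UCX_TLS": "ib,rc,ze_copy",
        "ZE_AFFINITY_MASK": mask,
        "VLLM_USE_V1": "1",
        "VLLM_NIXL_SIDE_CHANNEL_HOST": "localhost",
        "VLLM_NIXL_SIDE_CHANNEL_PORT": str(side_channel_port),
        "VLLM_WORKER_MULTIPROC_METHOD": "spawn",
        "VLLM_ENABLE_V1_MULTIPROCESSING": "1",
    }
    # The description says UCX_NET_DEVICES must currently be set by hand;
    # the right value depends on the host, so it is left to the caller.
    if net_devices is not None:
        env["UCX_NET_DEVICES"] = net_devices
    argv = [
        "vllm", "serve", MODEL, "-tp", str(tp_size),
        "--host", "localhost", "--port", str(http_port),
        "--seed", "42", "--enforce-eager", "--dtype", "float16",
        "--gpu-memory-utilization", "0.9",
        "--kv-transfer-config", KV_TRANSFER_CONFIG,
        "--max-model-len", "8192", "--block-size", "64",
        "--no-enable-prefix-caching", "--kv-cache-dtype", "fp8",
    ]
    return env, argv


for role in INSTANCES:
    env, argv = build_instance(role)
    prefix = " ".join(f"{key}={shlex.quote(value)}" for key, value in env.items())
    print(f"# {role}\n{prefix} {shlex.join(argv)}\n")
```

Building the argument list in one place avoids copy-paste slips between the three commands, such as a missing space between two flags. A call like `build_instance("decode", net_devices="mlx5_0:1")` would add the NIC selection; the device name there is only a placeholder.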

Test Result

Essential Elements of an Effective PR Description Checklist

The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
The test plan, such as providing test command.
The test results, such as pasting the results comparison before and after, or e2e results
(Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
(Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

gemini-code-assist

Code Review

This pull request introduces support for GPUDirect RDMA on XPU devices by integrating NIXL and UCX. Key changes include updating the UCX build process to enable Intel's oneAPI Level Zero (ZE) API, extending NIXL connector support for XPU KV buffer types, and implementing a critical environment variable setting to prevent potential UCX memory misdetection issues on XPU. These changes are essential for enabling efficient KV cache transfers on XPU platforms.

gemini-code-assist · 2026-02-25T07:26:49Z

+        # In some cases, the internal memory type cache can misdetect GPU
+        # memory as host memory, also leading to invalid memory access.
+        # This cache can be disabled by setting UCX_MEMTYPE_CACHE=n.
+        # ref. https://openucx.readthedocs.io/en/master/faq.html
+        os.environ["UCX_MEMTYPE_CACHE"] = "n"


Setting UCX_MEMTYPE_CACHE=n is a critical change to prevent potential memory misdetection issues with UCX on XPU. This directly addresses a potential cause of invalid memory access, which could lead to crashes or data corruption. It's good that this is explicitly documented with a reference to the UCX FAQ.

Copilot

Pull request overview

This pull request adds GPUDirect RDMA support for Intel XPU devices in the NIXL KV transfer connector. The changes enable direct memory access between XPU devices for more efficient KV cache transfers during distributed inference.

Changes:

Enable XPU device memory for NIXL KV transfers (previously only CPU memory was supported)
Configure UCX to disable memory type caching to avoid GPU memory misdetection
Update UCX build configuration to enable Intel Level Zero (ZE) support for XPU devices

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File	Description
vllm/platforms/xpu.py	Adds UCX_MEMTYPE_CACHE environment variable configuration to prevent memory type misdetection
vllm/distributed/kv_transfer/kv_connector/v1/nixl_connector.py	Adds "xpu" to supported KV buffer devices for XPU platform and simplifies memory type selection logic
tools/install_nixl_from_source_ubuntu.py	Updates UCX build to use specific commit and enable ZE support for XPU compatibility

Comments suppressed due to low confidence (1)

vllm/distributed/kv_transfer/kv_connector/v1/nixl_connector.py:956

There is a logic error in this code. After line 951, nixl_memory_type will always have a value (either from current_platform.get_nixl_memory_type() or from the ternary expression), so the condition on line 952 checking if nixl_memory_type is None will never be true. This makes the error handling at lines 952-956 unreachable code.

The check on line 952 appears to be a leftover from the previous code structure and should be removed.

        if nixl_memory_type is None:
            raise RuntimeError(
                f"{self.device_type} with {self.kv_buffer_device} kv_buffer "
                "is not supported."
            )
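To illustrate the pattern this comment describes, here is a minimal sketch. It is not the connector's actual code: `select_memory_type`, `SUPPORTED_KV_BUFFER_DEVICES`, and the "VRAM"/"DRAM" fallback values are assumptions made for illustration. The point is that once a fallback expression always yields a value, a later `is None` check cannot fire, so an unsupported combination has to be rejected before the fallback runs.

```python
from typing import Optional

# Hypothetical support table: device type -> allowed kv_buffer_device values.
SUPPORTED_KV_BUFFER_DEVICES = {"xpu": {"cpu", "xpu"}}


def select_memory_type(
    device_type: str,
    kv_buffer_device: str,
    platform_memory_type: Optional[str] = None,
) -> str:
    """Pick a memory type string, rejecting unsupported combinations first."""
    allowed = SUPPORTED_KV_BUFFER_DEVICES.get(device_type, set())
    if kv_buffer_device not in allowed:
        # This check is reachable because it runs before any fallback value exists.
        raise RuntimeError(
            f"{device_type} with {kv_buffer_device} kv_buffer is not supported."
        )
    if platform_memory_type is not None:
        return platform_memory_type
    # The fallback always produces a string, so testing the result for None
    # after this point would be dead code.
    return "DRAM" if kv_buffer_device == "cpu" else "VRAM"


print(select_memory_type("xpu", "xpu"))
print(select_memory_type("xpu", "cpu"))
```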


Copilot · 2026-02-25T07:27:21Z

+        # In some cases, the internal memory type cache can misdetect GPU
+        # memory as host memory, also leading to invalid memory access.
+        # This cache can be disabled by setting UCX_MEMTYPE_CACHE=n.
+        # ref. https://openucx.readthedocs.io/en/master/faq.html
+        os.environ["UCX_MEMTYPE_CACHE"] = "n"


The UCX_MEMTYPE_CACHE environment variable is being set unconditionally for all XPU configurations, but this setting is specifically for GPUDirect RDMA support which is only relevant when KV transfer is enabled. This could have unintended side effects on configurations that don't use KV transfer.

Consider making this setting conditional on kv_transfer_config being enabled, similar to how other KV transfer related configurations are handled in this method (see line 184).
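A minimal sketch of what such a conditional setting could look like. `configure_ucx_for_kv_transfer` and its parameters are hypothetical names, not vLLM APIs, and the use of `setdefault` (so that a value already exported by the user is kept) is a design assumption rather than what the PR does.

```python
import os
from typing import MutableMapping, Optional


def configure_ucx_for_kv_transfer(
    kv_transfer_enabled: bool,
    environ: Optional[MutableMapping[str, str]] = None,
) -> bool:
    """Disable the UCX memory type cache only when KV transfer is in use.

    Returns True when KV transfer is enabled and the variable is therefore set.
    """
    env = os.environ if environ is None else environ
    if not kv_transfer_enabled:
        # Leave the environment untouched for configurations without KV transfer.
        return False
    # setdefault keeps an explicit user override instead of overwriting it.
    env.setdefault("UCX_MEMTYPE_CACHE", "n")
    return True


demo_env = {}
configure_ucx_for_kv_transfer(False, demo_env)
print(demo_env)
configure_ucx_for_kv_transfer(True, demo_env)
print(demo_env)
```

Passing a plain dict, as in the demo, makes the behavior easy to test without touching the real process environment.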

Copilot · 2026-02-25T07:27:22Z

@@ -139,7 +139,7 @@ def build_and_install_prerequisites(args):
    if not os.path.exists(UCX_DIR):
        run_command(["git", "clone", UCX_REPO_URL, UCX_DIR])
    ucx_source_path = os.path.abspath(UCX_DIR)


The change from checking out a branch (v1.19.x) to a specific commit hash (e5d9887) lacks documentation explaining why this specific commit is needed. This makes it difficult for maintainers to understand:

What version or features this commit corresponds to

Why this specific commit was chosen over the v1.19.x branch

Whether this commit includes important fixes or features for GPUDirect RDMA support

Consider adding a comment explaining what this commit represents and why it was chosen, or document this in the PR description.

Suggested change

-    ucx_source_path = os.path.abspath(UCX_DIR)
+    ucx_source_path = os.path.abspath(UCX_DIR)
+    # NOTE: We pin UCX to a specific, known-good commit instead of tracking
+    # a moving branch (e.g., v1.19.x). Commit e5d9887 corresponds to a UCX
+    # revision that has been validated with nixl for our GPUDirect RDMA
+    # use case. If you update this hash, please ensure the new commit has
+    # been tested with nixl and GPUDirect RDMA and update this comment.
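For context, pinning to a commit rather than a branch amounts to a clone followed by a checkout of the hash. The sketch below only builds the command lists. `ucx_checkout_commands`, the `ucx_source` directory name, and the repository URL are assumptions for illustration; only the `e5d9887` hash and the `git clone` call come from this review.

```python
# Assumed values for illustration; the install script defines its own constants.
UCX_REPO_URL = "https://github.com/openucx/ucx.git"
UCX_DIR = "ucx_source"
# Pinned commit mentioned in the review comment above.
UCX_PINNED_COMMIT = "e5d9887"


def ucx_checkout_commands(repo_url, dest, commit, already_cloned):
    """Return the git commands needed to place `dest` at a pinned commit."""
    commands = []
    if not already_cloned:
        commands.append(["git", "clone", repo_url, dest])
    # `git -C <dir>` runs the checkout inside the clone without changing cwd.
    commands.append(["git", "-C", dest, "checkout", commit])
    return commands


for command in ucx_checkout_commands(
    UCX_REPO_URL, UCX_DIR, UCX_PINNED_COMMIT, already_cloned=False
):
    print(" ".join(command))
```

Each returned list can be handed to a runner such as `subprocess.run(command, check=True)`. Keeping the hash in a named constant next to an explanatory comment addresses the documentation concern raised here.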

Copilot · 2026-02-25T07:27:22Z

+        # In some cases, the internal memory type cache can misdetect GPU
+        # memory as host memory, also leading to invalid memory access.
+        # This cache can be disabled by setting UCX_MEMTYPE_CACHE=n.
+        # ref. https://openucx.readthedocs.io/en/master/faq.html
+        os.environ["UCX_MEMTYPE_CACHE"] = "n"


The PR description is empty except for the template. This makes it difficult to understand:

What problem this PR is solving

How GPUDirect RDMA support was tested

What the expected performance or functional improvements are

What the changes to UCX configuration (ZE support, specific commit) are intended to accomplish

Please complete the PR description with the purpose, test plan, and test results as indicated by the template checklist.

zhenwei-intel · 2026-02-25T07:34:21Z

@copilot help review again

Copilot · 2026-02-25T07:34:28Z

@zhenwei-intel I've opened a new pull request, #3, to work on those changes. Once the pull request is ready, I'll request review from you.


zhenwei-intel requested a review from Copilot February 25, 2026 07:24

Copilot started reviewing on behalf of zhenwei-intel February 25, 2026 07:24

gemini-code-assist Bot reviewed Feb 25, 2026

Copilot AI reviewed Feb 25, 2026

zhenwei-intel force-pushed the xpu_pd_2026 branch from b361c4c to ee1bcfb Compare February 25, 2026 07:33

Copilot AI mentioned this pull request Feb 25, 2026

[XPU][NIXL] Support GPUDirect RDMA for KV cache transfers #3

Draft

zhenwei-intel changed the title ~~[XPU][NIXL] support GPUDirect RDMA~~ [XPU][NIXL] Add GPUDirect RDMA support for XPU in NIXL connector Feb 25, 2026

zhenwei-intel changed the title ~~[XPU][NIXL] Add GPUDirect RDMA support for XPU in NIXL connector~~ [XPU][NIXL] Add GPUDirect RDMA support for XPU Feb 25, 2026

LucasWilkinson and others added 20 commits February 25, 2026 18:01

[Bugfix] Fix AttributeError in SMControlContextManager (vllm-project#…

9511a3f

…35338) Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>

[BugFix] Fix fp4 quant kernel on CUDA 12.8 (vllm-project#35210)

86c3b5a

Signed-off-by: LopezCastroRoberto <rocastro@redhat.com>

[Model Runner V2] Add coding style guide (vllm-project#35325)

13025e7

Signed-off-by: Woosuk Kwon <woosuk@inferact.ai>

[CPU][Feat] Enable KleidiAI INT8_W4A8 for all input dtypes (vllm-proj…

4171ff6

…ect#34890) Signed-off-by: Fadi Arafeh <fadi.arafeh@arm.com> Co-authored-by: Tyler Michael Smith <tlrmchlsmth@gmail.com>

[BugFix] anthropic/serving_messages: fix tool call arguments streaming (

4a9c07a

vllm-project#34887) Signed-off-by: Daniele Trifirò <dtrifiro@redhat.com> Co-authored-by: Nicolò Lucchesi <nlucches@redhat.com>

[Misc][Harmony] Move Responses API only harmony utils to responses/ha…

186ea22

…rmony.py (vllm-project#35339) Signed-off-by: sfeng33 <4florafeng@gmail.com>

[Benchmark] Simplify SLA scan (vllm-project#35306)

d3a51da

Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>

[BugFix][XPU] Fix speculative decoding on Intel XPU due to bug with `…

a07c4c5

…IGC_ForceOCLSIMDWidth=16` (vllm-project#35298) Signed-off-by: Ofir Zafrir <ofir.zafrir@intel.com> Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>

[XPU][8/N] Fix kernel bugs in XPU LoRA and MOE LORA (vllm-project#34115)

9f9a675

Signed-off-by: chzhang <chaojun.zhang@intel.com> Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>

[ROCm] Add extra step in config initialization to populate custom ops…

6042e66

… before compilation config init (vllm-project#34848) Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>

[Bugfix][Hardware][AMD] Gate FP4 ops on gfx950 to prevent MI300X crash (

ade81f1

vllm-project#35250) Signed-off-by: c0de128 <kevin.mckay@outlook.com>

[Test] Add tests for n parameter in chat completions API (vllm-projec…

3827c8c

…t#35283) Signed-off-by: KrxGu <krishom70@gmail.com>

[Model] Ring 2.5 (vllm-project#35102)

ab87f85

Signed-off-by: zjy0516 <riverclouds.zhu@qq.com>

[Benchmarks] Plot benchmark timeline and requests statistics (vllm-pr…

02acd16

…oject#35220) Signed-off-by: Sophie du Couédic <sop@zurich.ibm.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>

[Hardware][Powerpc]Enable prefix caching and chunked prefill for ppc6…

e03ddcf

…4le (vllm-project#35081) Signed-off-by: Akash kaothalkar <akash.kaothalkar@ibm.com> Co-authored-by: Akash kaothalkar <akash.kaothalkar@ibm.com>

[Bugfix] [Qwen3.5]Fix Qwen3.5 FP8 quantization: tuple shard_id weight…

32693db

… loading (vllm-project#35289) Signed-off-by: daowu.hzy <daowu.hzy@alibaba-inc.com> Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>

cwazai and others added 30 commits February 28, 2026 14:50

Fix Qwen3_5MTP packed_modules_mapping for gate_up_proj (vllm-project#…

63d7972

…35581)

[Fix] Avoid sending image input to other PP ranks (vllm-project#35405)

49b9ae3

Signed-off-by: emricksini-h <emrick.birivoutin@hcompany.ai> Co-authored-by: Roger Wang <hey@rogerw.io>

[Benchmark] Avoid unnecessary video download in MMVU (vllm-project#35618

1dafb29

) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>

[Deprecation] Deprecate code in 0.17 as scheduled (vllm-project#35441)

e113a30

Signed-off-by: yewentao256 <zhyanwentao@126.com> Signed-off-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>

[Chore] Cleanup BNB utilization dead code (vllm-project#35620)

e94b263

Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>

[Bugfix] Fix Anthropic API base64 image handling in Messages endpoint (…

95a395d

…vllm-project#35557) Signed-off-by: Martin Vit <martin@voipmonitor.org>

[Model Runner V2] Add ModelStateInterface [4/N] (vllm-project#35621)

e3eb146

Signed-off-by: Woosuk Kwon <woosuk@inferact.ai>

update dockerfile

795da8e

Signed-off-by: zhenwei-intel <zhenwei.liu@intel.com>

Add TMA support to fused_moe_lora kernel (vllm-project#32195)

3ecd0bf

Signed-off-by: gnovack <gnovack@amazon.com> Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>

[Bugfix][Model] Fix Qwen3.5/Qwen3Next ignoring --dtype flag on older …

afd089f

…GPUs (vllm-project#35617)

Fix typo: implictly -> implicitly in isaac.py docstring (vllm-project…

a9ec392

…#35646)

[AMD][CI] Support Triton attention with ExampleConnector (vllm-projec…

87d319c

…t#34931) Signed-off-by: Ryan Rock <ryan.rock@amd.com>

[Model Runner V2] Minor refactoring for EncoderRunner (vllm-project#3…

da543d1

…5628) Signed-off-by: Woosuk Kwon <woosuk@inferact.ai>

[Mamba1] - Kernel Level Chunk Alignment for Prefix Caching (vllm-proj…

bbf81f9

…ect#34798) Signed-off-by: Josephasafg <ajgard7@gmail.com>

[MISC] Fixing a null reference by removing parallel_utils from mypy E…

59d7af9

…XCLUDE (vllm-project#35630) Signed-off-by: Taneem Ibrahim <taneem.ibrahim@gmail.com>

fix(mxfp4): return is_monolithic=False when LoRA is enabled for Trito…

5a43550

…n backend (vllm-project#35382) Signed-off-by: Seungho Yoon <yoonsnowdev@gmail.com>

[Model Runner V2] Use block table apis for capture inputs (vllm-proje…

72f4d16

…ct#35671) Signed-off-by: Woosuk Kwon <woosuk@inferact.ai>

[Bugfix] Fix dtype mismatch in RMSNormGated.forward_native() during t…

6290470

…orch.compile (vllm-project#35256) Signed-off-by: haosdent <haosdent@gmail.com>

[torch.compile] Undo the fast_moe_cold_start hack in torch>=2.11 (vll…

e82fbee

…m-project#35475) Signed-off-by: Richard Zou <zou3519@gmail.com>

Revert "[Bugfix] Disable TRTLLM attention with KV transfer enabled (v…

57a96e2

…llm-project#33192)" (vllm-project#34832) Signed-off-by: Zhanqiu Hu <zh338@cornell.edu>

Fix deprecated v1 config tests (vllm-project#35327)

a60985b

Signed-off-by: Jesse Cai <jessecai@fb.com>

[XPU] fix mxfp4 activation type (vllm-project#35691)

92f5d0f

Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>

[ROCm] add amd-quark package in requirements for rocm to use quantize…

f26650d

…d models (vllm-project#35658) Signed-off-by: Hongxia Yang <hongxiay.yang@amd.com> Co-authored-by: Hongxia Yang <hongxiay.yang@amd.com>

[ROCm][CI] Disable skinny GEMMs in language model standard tests to f…

c34963f

…ix non-determinism (vllm-project#35152) Signed-off-by: Andreas Karatzas <akaratza@amd.com>

[Kernel] Integrate SM100 MXFP8 blockscaled grouped MM and quant kerne…

cb21972

…ls (vllm-project#34448) Signed-off-by: EdalatiAli <aliedalati@cohere.com> Signed-off-by: Michael Goin <mgoin64@gmail.com> Co-authored-by: Michael Goin <mgoin64@gmail.com>

[Rocm][CI] Fix LM Eval Large Models (H100) test group (vllm-project#3…

3fd1d4e

…4750) Signed-off-by: charlifu <charlifu@amd.com>

[CI] Defining extended V1 e2e + engine tests (vllm-project#35580)

ec27b36

Signed-off-by: Andreas Karatzas <akaratza@amd.com>

[Misc] Bound NIXL upper bound version (vllm-project#35495)

c212202

Signed-off-by: NickLucche <nlucches@redhat.com>

Merge branch 'main' into xpu_pd_2026

6079f20

[XPU][NIXL] Add GPUDirect RDMA support for XPU #2
zhenwei-intel wants to merge 155 commits into main from xpu_pd_2026

20 participants
