[feat] Support HIXL point-to-point tensor transport for Ray RDT #57

@zrss

Description

Ray Direct Transport (RDT) supports pluggable tensor transports. Upstream Ray provides:

  • Collective transports (Gloo, NCCL) — require create_collective_group
  • Point-to-point transports (NIXL) — one-sided RDMA P2P, typically without a collective group

On Ascend, #9 / PR #21 add HCCL as the collective RDT transport (NCCL/Gloo analogue). This issue requests HiXL as a separate point-to-point RDT tensor transport (NIXL analogue).

| | HCCL (#9) | HiXL (this issue) |
| --- | --- | --- |
| Ray analogue | NCCL / Gloo | NIXL |
| Model | Collective, create_collective_group required | P2P, one-sided RDMA |
| API | @ray.method(tensor_transport="HCCL") + HCCL group | @ray.method(tensor_transport="hixl"); no group for typical P2P transfers |

HiXL references

  • Repository: CANN/hixl — Ascend one-sided communication library (Huawei Xfer Library)

Proposed scope: Implement and register a HiXL-backed HiXLTensorTransport in ray-ascend (e.g. via register_tensor_transport("hixl", ["npu"], HiXLTensorTransport)). The work covers memory registration, one-sided P2P transfer, lifecycle and cleanup, docs, and a minimal two-actor example that does not call create_collective_group. The docs should also state when to use hixl versus HCCL.

Related: #9, PR #21

Use Case

  1. Dynamic actor-to-actor tensor handoff — Prefill → Decode, pipeline stage handoffs, or ad-hoc weight shards between actors that are not in a fixed HCCL collective group.

  2. KV cache / activation transfer — Low-latency, low-copy P2P moves aligned with Ascend inference stacks that already use HiXL (e.g. KV pool / PD-disaggregation paths; see also vLLM-Ascend KV pool guide).

  3. Parity with Ray RDT on GPU — Same choice as upstream: HCCL for collective/group workloads (#9); HiXL for NIXL-style P2P when object-store serialization is too expensive.

  4. ray.get across actors — One-sided P2P fits fetch patterns where the caller is not part of an HCCL collective group (collective RDT transports are a poor fit there).
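The use cases above suggest a simple rule of thumb for the "hixl vs HCCL" documentation. The sketch below encodes it as a small helper. The Transfer fields and the choose_transport function are hypothetical and exist only for illustration; the returned strings are the transport names used in this issue.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transfer:
    """Hypothetical description of one tensor handoff."""
    # Sender and receiver already belong to the same HCCL collective group.
    peers_share_collective_group: bool
    # The receiver is outside any group, e.g. a caller fetching via ray.get.
    receiver_outside_group: bool = False


def choose_transport(transfer: Transfer) -> str:
    """Rule of thumb derived from the use cases in this issue (an assumption)."""
    if transfer.receiver_outside_group:
        # Collective transports need every participant in the group.
        return "hixl"
    if transfer.peers_share_collective_group:
        # Fixed group workloads are what the collective transport targets.
        return "HCCL"
    # Dynamic or ad-hoc actor pairs: one-sided P2P, no group setup.
    return "hixl"


prefill_to_decode = choose_transport(Transfer(peers_share_collective_group=False))
fixed_group_step = choose_transport(Transfer(peers_share_collective_group=True))
driver_fetch = choose_transport(
    Transfer(peers_share_collective_group=True, receiver_outside_group=True)
)
```

Under these assumptions, a Prefill to Decode handoff and a fetch from outside the group both resolve to "hixl", while a transfer between members of an existing group resolves to "HCCL".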

Metadata

Labels

enhancement (New feature or request)
