TL/UCP: use memcpy instead of sendrecv in agv by Sergei-Lebedev · Pull Request #1255 · openucx/ucc

Sergei-Lebedev · 2026-01-14T07:48:04Z

What

Use ucc_mc_memcpy instead of sendrecv in allgatherv ring

Why ?

UCP does not copy CUDA buffers efficiently over its self transport: the copy is staged through a host buffer.

greptile-apps · 2026-01-14T09:44:10Z

Greptile Summary

Replaced inefficient UCP self-transport sendrecv with ucc_mc_memcpy for local buffer copy in allgatherv ring algorithm. UCP doesn't efficiently copy CUDA buffers through self-transport and uses host staging buffers, while ucc_mc_memcpy provides direct GPU-to-GPU copies.
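As a rough illustration of the local copy described above, the sketch below places a rank's own contribution into its slot of the receive buffer with a single copy. It is a host-only stand-in: plain `memcpy` takes the place of `ucc_mc_memcpy`, and the helper name `local_copy_block` and its parameters are hypothetical, not taken from the PR.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of UCC: copy this rank's send buffer into
 * its own slot of the allgatherv receive buffer. Counts and displacements
 * are given in elements; dt_size is the element size in bytes.
 * Plain memcpy stands in for a memory-type-aware copy here.
 */
static void local_copy_block(void *dst, const void *src,
                             const size_t *counts, const size_t *displs,
                             int rank, size_t dt_size)
{
    size_t data_size = counts[rank] * dt_size; /* bytes owned by this rank */
    size_t offset    = displs[rank] * dt_size; /* byte offset of its slot  */

    memcpy((char *)dst + offset, src, data_size);
}
```

In the real code path the buffers may live in CUDA memory, which is why a memory-type-aware copy is used there rather than `memcpy`.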

Replaced local sendrecv pair with single ucc_mc_memcpy call in non-inplace case
Adjusted ring algorithm loop from tsize to tsize - 1 iterations to account for local copy being done upfront
Updated send_idx calculation to remove + 1 offset (now trank - send_posted)
Updated recv_idx calculation to add - 1 offset (now trank - recv_posted - 1)
Removed inplace branch that manually set counters, as loop now handles all cases uniformly
Added profiling events for start and completion
Simplified error handling by removing separate error label

The index adjustments align with the pattern used in allgather_ring.c and correctly implement the ring algorithm with upfront local copy.
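The index arithmetic summarized above can be checked with a small host-only simulation. This is a sketch under stated assumptions, not the code from the PR: the `+ tsize` and modulo wrapping, the synchronous step model, and the names `send_block`, `recv_block`, `simulate_ring` and `MAX_RANKS` are all illustrative. Each rank starts with only its own block (the upfront local copy), then runs `tsize - 1` steps, sending block `trank - step` to its right neighbor and receiving block `trank - step - 1` from its left neighbor.

```c
#include <assert.h>
#include <string.h>

#define MAX_RANKS 64 /* arbitrary cap for this sketch */

/* Block a rank forwards at a given step: trank - step, wrapped into range. */
static int send_block(int trank, int tsize, int step)
{
    return (trank - step + tsize) % tsize;
}

/* Block a rank receives at a given step: trank - step - 1, wrapped. */
static int recv_block(int trank, int tsize, int step)
{
    return (trank - step - 1 + tsize) % tsize;
}

/* Returns 1 if every rank ends up holding every block, 0 otherwise. */
static int simulate_ring(int tsize)
{
    static unsigned char have[MAX_RANKS][MAX_RANKS];
    int r, b, step;

    assert(tsize >= 1 && tsize <= MAX_RANKS);
    memset(have, 0, sizeof(have));

    /* Upfront local copy: each rank already holds its own block. */
    for (r = 0; r < tsize; r++) {
        have[r][r] = 1;
    }

    for (step = 0; step < tsize - 1; step++) {
        for (r = 0; r < tsize; r++) {
            int left = (r - 1 + tsize) % tsize;

            /* A rank may only forward a block it already holds. */
            if (!have[r][send_block(r, tsize, step)]) {
                return 0;
            }
            /* What r expects must be what its left neighbor sends. */
            if (recv_block(r, tsize, step) != send_block(left, tsize, step)) {
                return 0;
            }
        }
        /* Deliver all messages of this step at once. */
        for (r = 0; r < tsize; r++) {
            have[r][recv_block(r, tsize, step)] = 1;
        }
    }

    for (r = 0; r < tsize; r++) {
        for (b = 0; b < tsize; b++) {
            if (!have[r][b]) {
                return 0;
            }
        }
    }
    return 1;
}
```

Calling `simulate_ring(n)` for a team size `n` up to `MAX_RANKS` reports whether `tsize - 1` steps with these indices deliver every block to every rank.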

Confidence Score: 5/5

This PR is safe to merge with minimal risk
The changes follow an established pattern from allgather_ring.c, with correct mathematical adjustments to loop bounds and index calculations. The replacement of sendrecv with memcpy is a straightforward performance optimization that addresses UCP's inefficient handling of CUDA self-transport. Error handling has been appropriately simplified.
No files require special attention

Important Files Changed

Filename	Overview
src/components/tl/ucp/allgatherv/allgatherv_ring.c	Replaced inefficient UCP self-transport sendrecv with direct memcpy for CUDA buffers, adjusted ring algorithm loop bounds and indices accordingly

wfaderhold21

Looks good

wfaderhold21 · 2026-01-14T17:24:32Z

src/components/tl/ucp/allgatherv/allgatherv_ring.c

+        data_size = ucc_coll_args_get_count(
+                        args, args->dst.info_v.counts, grank) *
+                    rdt_size;
+        status = ucc_mc_memcpy(


LGTM! One question, though: could ucc_mc_memcpy cause a deadlock with CUDA buffers when NCCL and UCC are used together? Should this be a local copy instead?

janjust · 2026-01-17T03:12:01Z

/build

janjust · 2026-01-23T15:43:03Z

/build

janjust · 2026-01-23T21:50:08Z

/build

janjust · 2026-01-29T16:17:04Z

/build

Sergei-Lebedev added the Ready-for-Review label Jan 14, 2026

Sergei-Lebedev requested review from janjust, nsarka and wfaderhold21 January 14, 2026 09:42

wfaderhold21 approved these changes Jan 14, 2026

janjust approved these changes Jan 17, 2026

TL/UCP: use memcpy instead of sendrecv in agv

b5282de

janjust force-pushed the topic/allgatherv_memcpy branch from 89eb485 to b5282de Compare January 23, 2026 15:43

Sergei-Lebedev wants to merge 1 commit into openucx:master from Sergei-Lebedev:topic/allgatherv_memcpy
