Address some performance regressions in shuffle by wence- · Pull Request #1018 · rapidsai/rapidsmpf

wence- · 2026-05-08T11:04:09Z

After #927, the shuffle benchmarks lost about 10% performance when using CUDA async memory.

From my benchmarking, the following changes recover most of this:

Switch back to something closer to the "wake" scheme used before #927 (Support reuse of op_ids in shuffles). We now wake a waiter once all data is ready to be extracted and all sends have been posted (but not necessarily completed).
Rather than breaking at the first ready buffer per rank, post all receives up to the first non-ready buffer. (TODO: confirm this is safe with respect to message ordering; I think it is.)
Apply a circulant shift to the polling for metadata, so that ranks don't all look for metadata from rank 0, then rank 1, and so on.
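
As a rough illustration of the circulant shift idea (not the code in this PR), the sketch below has each rank start polling at the peer just after itself and wrap around. The function name `circulant_order` and the choice to skip the local rank are assumptions made for the example.

```python
def circulant_order(rank: int, nranks: int) -> list[int]:
    """Peers for `rank` to poll, starting just after itself and wrapping.

    Hypothetical helper: skipping the local rank is an assumption.
    """
    return [(rank + shift) % nranks for shift in range(1, nranks)]


# With 4 ranks, every rank starts at a different peer, so at each
# polling step the ranks collectively target distinct peers instead
# of all hitting rank 0 first.
for rank in range(4):
    print(rank, circulant_order(rank, 4))
```

Without the shift, every rank would walk the peers in the same order `0, 1, 2, ...`, so all of them would contend for the same peer at the same step.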

In addition, to allow benchmarking just the communication part of the shuffle, add a mode to the shuffle benchmark that discards the output without even concatenating it.

copy-pr-bot · 2026-05-08T11:04:12Z

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

This allows benchmarking just the data-movement part of the shuffle, without unspilling and concatenating the results. Additionally, remove the unnecessary stream sync: the contract is that downstream data is available in stream-ordered fashion, so rely on that.

wence- · 2026-05-11T15:25:31Z

The waking is not safe yet.


wence- force-pushed the wence/fea/shuffle-perf branch 6 times, most recently from 92e2244 to fbebd32 Compare May 8, 2026 13:21

wence- force-pushed the wence/fea/shuffle-perf branch 2 times, most recently from 8e200d0 to 8be0963 Compare May 8, 2026 14:24

Wake once we have locally received everything

9cf3784

wence- force-pushed the wence/fea/shuffle-perf branch from 8be0963 to a583ce0 Compare May 8, 2026 16:29

wence- changed the title ~~Wence/fea/shuffle perf~~ Address some performance regressions in shuffle May 8, 2026

wence- added the improvement (Improves an existing functionality) and non-breaking (Introduces a non-breaking change) labels May 8, 2026

wence- marked this pull request as ready for review May 11, 2026 07:55

wence- requested a review from a team as a code owner May 11, 2026 07:55

wence- force-pushed the wence/fea/shuffle-perf branch 3 times, most recently from 0b25269 to a3fff0b Compare May 11, 2026 13:50

wence- added 3 commits May 11, 2026 14:55

Post all receives for prefix of ready messages

b33d592
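
A minimal sketch of the "prefix of ready messages" idea, assuming each rank's pending messages are held in arrival order with an opaque readiness flag. The `(msg_id, ready)` tuples and the name `ready_prefix` are hypothetical and are not taken from rapidsmpf.

```python
from itertools import takewhile


def ready_prefix(pending: list[tuple[int, bool]]) -> list[int]:
    """Ids of the leading run of ready messages, in their original order.

    Stops at the first non-ready message so that nothing is posted
    ahead of an earlier message from the same rank.
    """
    return [msg_id for msg_id, _ in takewhile(lambda m: m[1], pending)]


# Message 3 is ready but sits behind the non-ready message 2, so only
# the prefix [0, 1] is posted in this round.
pending = [(0, True), (1, True), (2, False), (3, True)]
print(ready_prefix(pending))
```

Breaking after the first ready message would post only message 0 here. Taking the whole ready prefix posts more receives per polling round while still never skipping past a non-ready message.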

Circulant shift for metadata receive in shuffle

92e13fe

Circulant shift for mpe termination marker send

671545a

wence- force-pushed the wence/fea/shuffle-perf branch from a3fff0b to 1c763bb Compare May 11, 2026 14:00

wence- added the DO NOT MERGE (Hold off on merging; see PR for details) label May 11, 2026

wence- added 2 commits May 11, 2026 17:17

Introduce finished_polling method to MPE

dd2bb61

We can wake if the MPE has finished polling for new metadata, and don't need to wait for it to be completely idle.

Inline shuffle message exchange again

011ed99

Seems from benchmarking this won't be worth it

wence- force-pushed the wence/fea/shuffle-perf branch from 1c763bb to 011ed99 Compare May 11, 2026 16:17

wence- wants to merge 7 commits into rapidsai:main from wence-:wence/fea/shuffle-perf
