Allow shuffle extraction before protocol drain by pentschev · Pull Request #1019 · rapidsai/rapidsmpf

pentschev · 2026-05-08T12:06:23Z

#927 made `Shuffler::wait()` block until all internal protocol/send cleanup had drained before returning, which introduced a 10-20% regression in `bench_shuffle` on an 8-GPU node by delaying output extraction and unpack/concat work. This change makes `wait()` return as soon as local shuffle output is ready to extract, and adds `wait_reusable()` for callers that need the stronger full-drain guarantee before reusing an `op_id`. For the reported case of 16 input partitions and 4 output partitions, restoring the overlap brought mean per-rank elapsed time from about 207 ms back down to about 172 ms.
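The difference between the two guarantees can be sketched with a toy model. This is a minimal illustration only, not the rapidsmpf implementation: `ToyShuffler`, `mark_output_ready()`, `mark_drained()`, `drained()`, and `run_demo()` are hypothetical names, and only `wait()` and `wait_reusable()` mirror names from this PR. The simulated progress thread holds back the drain until the caller has finished extracting, to show that `wait()` can return while cleanup is still in flight.

```cpp
#include <condition_variable>
#include <future>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a shuffler; NOT the rapidsmpf API.
class ToyShuffler {
  public:
    // Signalled once local output can be extracted.
    void mark_output_ready() { set(output_ready_); }
    // Signalled once all protocol/send cleanup has finished.
    void mark_drained() { set(drained_); }

    // Weaker guarantee: returns as soon as local output is ready.
    void wait() { block_until(output_ready_); }
    // Stronger guarantee: returns only after the full drain.
    void wait_reusable() { block_until(drained_); }

    bool drained() {
        std::lock_guard<std::mutex> lock(mutex_);
        return drained_;
    }

  private:
    void set(bool& flag) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            flag = true;
        }
        cv_.notify_all();
    }
    void block_until(bool const& flag) {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [&flag] { return flag; });
    }

    std::mutex mutex_;
    std::condition_variable cv_;
    bool output_ready_{false};
    bool drained_{false};
};

struct DemoResult {
    bool drained_at_extract;  // drain state right after wait() returned
    bool drained_at_reuse;    // drain state right after wait_reusable() returned
};

DemoResult run_demo() {
    ToyShuffler shuffler;
    std::promise<void> allow_drain;
    std::future<void> gate = allow_drain.get_future();

    // Simulated progress thread: output becomes ready first, cleanup finishes later.
    std::thread progress([&] {
        shuffler.mark_output_ready();
        gate.wait();  // simulated send cleanup still in flight
        shuffler.mark_drained();
    });

    shuffler.wait();
    bool at_extract = shuffler.drained();
    // Extraction and unpack/concat work would overlap with the drain here.

    allow_drain.set_value();   // let the simulated cleanup finish
    shuffler.wait_reusable();  // only now would reusing the op_id be safe
    bool at_reuse = shuffler.drained();

    progress.join();
    return {at_extract, at_reuse};
}
```

Calling `run_demo()` from a `main()` returns a result whose `drained_at_extract` is false and whose `drained_at_reuse` is true. In this sketch both waits share one mutex and condition variable, which keeps the two readiness stages consistent with each other; a real implementation may track them quite differently.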

madsbk

Do we need to update the Python bindings as well?

What about renaming to wait_drained()? Or maybe wait_for_drain()?

pentschev self-assigned this May 8, 2026

pentschev requested a review from a team as a code owner May 8, 2026 12:06

pentschev added the bug and non-breaking labels May 8, 2026

Merge remote-tracking branch 'upstream/main' into shuffle-extraction-before-protocol-drain (06d0e93)

madsbk reviewed May 8, 2026

pentschev wants to merge 2 commits into rapidsai:main from pentschev:shuffle-extraction-before-protocol-drain
