
Get all workers from scheduler info #1514

Merged
rapids-bot[bot] merged 1 commit into rapidsai:branch-25.08 from pentschev:get-all-workers-from-scheduler-info
Jul 1, 2025

Conversation

@pentschev
Member

For no good reason other than making Jupyter pretty printing nicer, dask/distributed#9045 broke the API of `client.scheduler_info()` by changing the default number of workers it returns to `5`.

We use this information in various places to ensure that the correct number of workers have connected to the scheduler, so this change now specifies `n_workers=-1` to get all workers instead of just the first `5`. This went unnoticed in CI because we don't run multi-GPU tests, let alone tests with more than `5` workers. Running various tests from `test_dask_cuda_worker.py` on a system with 8 GPUs causes them to time out while waiting for the workers to connect.
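
For illustration, the kind of check affected by this looks roughly like the sketch below. It is not the code touched by this PR: the helper names `connected_worker_count` and `wait_for_workers`, the timeout values, and the polling loop are hypothetical. The only parts taken from the description above are the `client.scheduler_info()` call and the `n_workers=-1` argument; the sketch also assumes the returned dictionary lists workers under a `"workers"` key.

```python
import time


def connected_worker_count(client):
    """Return how many workers the scheduler reports (hypothetical helper)."""
    # n_workers=-1 requests every worker; with the new default only the
    # first 5 are returned, so a count based on the default caps at 5.
    info = client.scheduler_info(n_workers=-1)
    return len(info["workers"])


def wait_for_workers(client, expected, timeout=30.0, poll_interval=0.1):
    """Poll until `expected` workers have connected, or raise TimeoutError."""
    deadline = time.monotonic() + timeout
    while True:
        count = connected_worker_count(client)
        if count >= expected:
            return count
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"only {count} of {expected} workers connected after {timeout}s"
            )
        time.sleep(poll_interval)
```

With the default argument, a loop like this on an 8-GPU system would never see more than 5 workers and would run until the timeout, which matches the test failures described above.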

pentschev self-assigned this on Jul 1, 2025
pentschev requested a review from a team as a code owner on Jul 1, 2025, 12:10
pentschev added the `bug` (Something isn't working), `3 - Ready for Review` (Ready for review by team), and `non-breaking` (Non-breaking change) labels on Jul 1, 2025
github-actions[bot] added the `python` (python code needed) label on Jul 1, 2025
@pentschev
Member Author

Thanks @jacobtomlinson!

@pentschev
Member Author

/merge

rapids-bot[bot] merged commit 5bcd3ce into rapidsai:branch-25.08 on Jul 1, 2025
31 checks passed
pentschev deleted the get-all-workers-from-scheduler-info branch on Jul 4, 2025, 08:44