Implement our own to_thread offload for cudf-polars streaming execution#22474
Conversation
|
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here. |
|
Needs rapidsai/rapidsmpf#1023, marking as do not merge because I don't think this should go to 26.06. |
cf19f57 to
2388a7a
Compare
2388a7a to
c465b8c
Compare
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughThis PR migrates RapidsMPF's thread-offloading pattern from direct ChangesThread Pool Executor Integration in RapidsMPF
🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@python/cudf_polars/cudf_polars/dsl/ir.py`:
- Around line 156-158: Replace the assert in to_thread() that checks
self.py_executor with an explicit runtime check: if self.py_executor is None:
raise RuntimeError("Execution context must have a thread pool for offload") (or
a suitable exception type) so we never pass None into loop.run_in_executor and
silently fall back to the default executor; update the check at the start of
to_thread() (referencing self.py_executor and loop.run_in_executor) accordingly
to ensure the persistent executor is always used.
In `@python/cudf_polars/tests/experimental/test_allgather.py`:
- Around line 51-55: The ThreadPoolExecutor passed into IRExecutionContext is
never closed causing leaked threads; change the test to create the executor with
a context manager or explicitly shut it down after use—e.g., wrap
ThreadPoolExecutor(max_workers=1) in a with block and construct
IRExecutionContext inside it (or call executor.shutdown(wait=True) after
awaiting allgather.extract_concatenated) so the executor backing
IRExecutionContext is properly closed; reference IRExecutionContext and the
ThreadPoolExecutor instance used for ir_context.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: d2339823-9e70-4290-9f51-6043fae6332f
📒 Files selected for processing (11)
python/cudf_polars/cudf_polars/dsl/ir.pypython/cudf_polars/cudf_polars/experimental/rapidsmpf/collectives/allgather.pypython/cudf_polars/cudf_polars/experimental/rapidsmpf/collectives/sort.pypython/cudf_polars/cudf_polars/experimental/rapidsmpf/frontend/core.pypython/cudf_polars/cudf_polars/experimental/rapidsmpf/groupby.pypython/cudf_polars/cudf_polars/experimental/rapidsmpf/io.pypython/cudf_polars/cudf_polars/experimental/rapidsmpf/join.pypython/cudf_polars/cudf_polars/experimental/rapidsmpf/nodes.pypython/cudf_polars/cudf_polars/experimental/rapidsmpf/repartition.pypython/cudf_polars/cudf_polars/experimental/rapidsmpf/utils.pypython/cudf_polars/tests/experimental/test_allgather.py
608f2d2 to
420df99
Compare
|
OK, I did some benchmarking and I think this is positive, and should go in. Needs coordination with the rapidsmpf PR, but please review. |
mroeschke
left a comment
There was a problem hiding this comment.
optional: Would be nice to use a ruff rule disallow asyncio.to_thread like we did with gather here
cudf/python/cudf_polars/pyproject.toml
Line 208 in 1d451cf
TomAugspurger
left a comment
There was a problem hiding this comment.
Looks good, thanks.
Have you thought at all about how to protect against reintroducing this in the future? A couple of options:
- a pre-commit rule that checks for usage of
asyncio.to_threadin cudf-polars. - Some kind of pytest fixture that sets the default executor used by
asyncio.to_threadto raise an exception. But I don't know whether this is feasible since I'm not sure exactly when the event loop is created (maybe not until we callrun_actor_network?)
Good idea, added a ruff rule at Matt's suggestion |
60af723 to
bba9510
Compare
bdice
left a comment
There was a problem hiding this comment.
Approving packaging changes.
ab7188a to
48a243a
Compare
48a243a to
023399d
Compare
asyncio.to_thread always uses the default asyncio thread pool that contains a hardware-dependent number of threads. Although one can set the default executor on an event loop, when the loop exits, the executor is shut down. Since we want the executor thread pool to persist between collect calls we can't do that. Instead, hang an executor on the IRExecutionContext and use the new to_thread method to offload.
023399d to
8aa62db
Compare
|
/merge |
…on (rapidsai#22474) asyncio.to_thread always uses the default asyncio thread pool that contains a hardware-dependent number of threads. Although one can set the default executor on an event loop, when the loop exits, the executor is shut down. Since we want the executor thread pool to persist between collect calls we can't do that. Instead, hang an executor on the IRExecutionContext and use the new to_thread method to offload. Authors: - Lawrence Mitchell (https://github.com/wence-) Approvers: - Matthew Roeschke (https://github.com/mroeschke) - Tom Augspurger (https://github.com/TomAugspurger) - Bradley Dice (https://github.com/bdice) - Mads R. B. Kristensen (https://github.com/madsbk) URL: rapidsai#22474
Description
asyncio.to_thread always uses the default asyncio thread pool that contains a hardware-dependent number of threads. Although one can set the default executor on an event loop, when the loop exits, the executor is shut down. Since we want the executor thread pool to persist between collect calls we can't do that. Instead, hang an executor on the IRExecutionContext and use the new to_thread method to offload.
Checklist