Our standard approach to threading is either to use prange in Numba (a whole different issue) or to offer the option of passing a backend to Parallel, with the default being "threads", e.g.
fit = Parallel(n_jobs=self._n_jobs, prefer="threads")(
    delayed(self._fit_estimator)(
        X,
        y,
        check_random_state(rng.randint(np.iinfo(np.int32).max)),
        save_transformed_data,
    )
    for _ in range(self._n_estimators)
)
Parallel actually defaults to loky, and I think loky will be better in some use cases. Testing DrCIF on the multiverse benchmark on a 64-core machine, with the current behaviour threading makes hardly any difference compared with a single thread because it does not utilise the cores.
When the prefer option is removed, it goes all in.
It completed the EigenWorms problem in 33 mins with loky, but has been running for 14 hours with prefer="threads". For DrCIF at least, the benefit is clear. Joblib's own documentation states that "the threading backend is mainly effective when the hot part of the workload is in compiled code that releases the GIL; if the workload manipulates Python objects a lot, scaling is limited." The first condition does not hold for DrCIF.
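The effect described above can be reproduced in miniature. The following is a rough sketch, not DrCIF itself: `pure_python_work` and `run` are hypothetical names, and the workload is a toy loop chosen only because it manipulates Python objects and therefore holds the GIL. Timings will vary by machine, so none are quoted here.

```python
import time

from joblib import Parallel, delayed


def pure_python_work(n):
    """CPU-bound loop over Python objects; it holds the GIL throughout."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def run(n_tasks, n, n_jobs, prefer=None):
    """Run n_tasks copies of the workload and return (results, seconds)."""
    start = time.perf_counter()
    results = Parallel(n_jobs=n_jobs, prefer=prefer)(
        delayed(pure_python_work)(n) for _ in range(n_tasks)
    )
    return results, time.perf_counter() - start


if __name__ == "__main__":
    # prefer=None leaves the choice to joblib, which uses loky by default.
    for label, prefer in [("threads", "threads"), ("default (loky)", None)]:
        _, seconds = run(n_tasks=8, n=2_000_000, n_jobs=4, prefer=prefer)
        print(f"{label}: {seconds:.2f}s")
```

On a standard GIL-enabled CPython build, the thread-based run would be expected to take roughly as long as a serial run, while the process-based run should scale with the number of workers, minus process start-up and pickling overhead.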
This changes the underlying base class BaseIntervalForest, which impacts DrCIF, CIF, IntervalForest, RISE, STSF, TSF, and the interval-based regressors as well. They are all structurally very similar and likely to benefit equally.
There is no real rush for this PR. If desired I can test them all, but it won't happen soon. If we wait, I may tweak a couple of other things in DrCIF and also add a fast path for n_jobs = 1, which would speed up the unthreaded case, but that's more of a follow-up PR.
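As an illustration of what such a fast path might look like, here is a minimal sketch. It is not the aeon implementation: `fit_one` and `fit_all` are hypothetical stand-ins, and the optional `prefer` argument is an assumption showing how a backend preference could still be passed through.

```python
from joblib import Parallel, delayed


def fit_one(seed):
    """Hypothetical stand-in for fitting a single estimator."""
    return seed * 2


def fit_all(n_estimators, n_jobs=1, prefer=None):
    """Fit all estimators, bypassing joblib when only one job is requested."""
    seeds = range(n_estimators)
    if n_jobs == 1:
        # Fast path: a plain loop, with no joblib dispatch at all.
        return [fit_one(s) for s in seeds]
    # prefer=None lets joblib pick its default backend (loky).
    return Parallel(n_jobs=n_jobs, prefer=prefer)(
        delayed(fit_one)(s) for s in seeds
    )
```

Both branches return results in the same order, so callers would not need to know which path was taken.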
I did not find any labels to add based on the title. Please add the [ENH], [MNT], [BUG], [DOC], [REF], [DEP] and/or [GOV] tags to your pull requests titles. For now you can add the labels manually.
I would have added the following labels to this PR based on the changes made: [ classification ], however some package labels are already present.
The Checks tab will show the status of our automated tests. You can click on individual test runs in the tab or "Details" in the panel below to see more information if there is a failure.
If our pre-commit code quality check fails, any trivial fixes will automatically be pushed to your PR unless it is a draft.
Don't hesitate to ask questions on the aeon Discord channel if you have any.
TonyBagnall
changed the title
[ENNH] remove prefer="threads" from Parallel
[ENH] remove prefer="threads" from Parallel for DrCIF
Mar 24, 2026
TonyBagnall
changed the title
[ENH] remove prefer="threads" from Parallel for DrCIF
[ENH] remove prefer="threads" from Parallel for BaseIntervalForest
Mar 24, 2026
TonyBagnall
changed the title
[ENH] remove prefer="threads" from Parallel for BaseIntervalForest
[ENH] remove prefer="threads" from Parallel in BaseIntervalForest
Mar 24, 2026
see #3360 #1797 #2724