
[ENH] remove prefer="threads" from Parallel in BaseIntervalForest#3361

Open
TonyBagnall wants to merge 5 commits into main from ajb/drcif

Conversation

@TonyBagnall
Contributor

@TonyBagnall TonyBagnall commented Mar 24, 2026

see #3360 #1797 #2724

Our standard approach to parallelism is either to use prange in Numba (a whole different issue) or to offer the option of passing a backend to Parallel, with the default being "threads", e.g.

fit = Parallel(n_jobs=self._n_jobs, prefer="threads")(
    delayed(self._fit_estimator)(
        X,
        y,
        check_random_state(rng.randint(np.iinfo(np.int32).max)),
        save_transformed_data,
    )
    for _ in range(self._n_estimators)
)
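For reference, a minimal sketch of the "pass a backend to Parallel" pattern, outside any aeon class. The names `run_jobs`, `_square` and the `parallel_backend` argument are hypothetical and chosen only for illustration; they are not taken from BaseIntervalForest.

```python
from joblib import Parallel, delayed


def _square(i):
    """Stand-in for a per-estimator fit call (hypothetical workload)."""
    return i * i


def run_jobs(n_tasks, n_jobs=2, parallel_backend=None):
    """Run n_tasks jobs, optionally forcing a specific joblib backend.

    parallel_backend=None leaves the choice to joblib, which uses the
    process-based "loky" backend by default. A string such as "threading"
    or "loky" forces that backend.
    """
    parallel = Parallel(n_jobs=n_jobs, backend=parallel_backend)
    # Results come back in submission order.
    return parallel(delayed(_square)(i) for i in range(n_tasks))
```

Unlike `backend=`, the `prefer="threads"` argument used above is only a soft hint, so removing it simply hands the decision back to joblib's default.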

Parallel itself defaults to loky, and I think loky will be better in some of our use cases. When running DrCIF to complete the multiverse benchmark on a 64-core machine, threading with the current behaviour is hardly any faster than a single thread because it does not utilise the cores:
image
With the prefer option removed, all cores are used:
image
It completed the EigenWorms problem in 33 minutes with loky, but has been running for 14 hours with prefer="threads". For DrCIF at least, the benefit is clear. Joblib's own documentation states that "the threading backend is mainly effective when the hot part of the workload is in compiled code that releases the GIL; if the workload manipulates Python objects a lot, scaling is limited." That condition does not hold for DrCIF.
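A rough way to see the same effect on a toy workload is sketched below. It assumes a pure-Python loop as a stand-in for GIL-bound work; `python_heavy` and `time_parallel` are hypothetical helpers, not aeon code, and the timings will depend on the machine.

```python
import time

from joblib import Parallel, delayed


def python_heavy(n):
    """Pure-Python loop: holds the GIL for its whole duration."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def time_parallel(prefer=None, n_tasks=8, n_jobs=4, n=200_000):
    """Run n_tasks copies of python_heavy and return (seconds, results).

    prefer=None leaves the backend choice to joblib (loky by default);
    prefer="threads" is the soft hint this PR removes.
    """
    start = time.perf_counter()
    results = Parallel(n_jobs=n_jobs, prefer=prefer)(
        delayed(python_heavy)(n) for _ in range(n_tasks)
    )
    return time.perf_counter() - start, results
```

Calling `time_parallel(prefer="threads")` and `time_parallel(prefer=None)` with a large enough `n` should show whether the thread-based run scales with `n_jobs` on a given machine. On platforms that spawn worker processes, the process-based call belongs under an `if __name__ == "__main__":` guard.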

This changes the underlying base class BaseIntervalForest, which affects DrCIF, CIF, IntervalForest, RISE, STSF, TSF, and the interval-based regressors as well. They are all structurally very similar and likely to benefit equally.

There is no real rush for this PR. If desired I can test them all, but it won't happen soon. If we wait, I may tweak a couple of other things in DrCIF and add a fast path for n_jobs = 1, which would speed up the unthreaded case, but that's more of a follow-up PR.
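One possible shape for such a fast path, purely as a sketch: the helper `fit_all` and its arguments are hypothetical and do not reflect how the follow-up PR would actually be written.

```python
from joblib import Parallel, delayed


def fit_all(fit_one, seeds, n_jobs=1):
    """Call fit_one(seed) for every seed, bypassing joblib when n_jobs == 1.

    fit_one stands in for something like a per-estimator fit method
    (assumption for illustration only).
    """
    if n_jobs == 1:
        # Fast path: a plain loop, with no Parallel object or task dispatch.
        return [fit_one(seed) for seed in seeds]
    # Otherwise let joblib choose its default backend.
    return Parallel(n_jobs=n_jobs)(delayed(fit_one)(seed) for seed in seeds)
```

How much this saves would need measuring; it only avoids the dispatch overhead that Parallel adds around each task.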

@TonyBagnall TonyBagnall added the classification Classification package label Mar 24, 2026
@aeon-actions-bot
Contributor

Thank you for contributing to aeon

I did not find any labels to add based on the title. Please add the [ENH], [MNT], [BUG], [DOC], [REF], [DEP] and/or [GOV] tags to your pull requests titles. For now you can add the labels manually.
I would have added the following labels to this PR based on the changes made: [ classification ], however some package labels are already present.

The Checks tab will show the status of our automated tests. You can click on individual test runs in the tab or "Details" in the panel below to see more information if there is a failure.

If our pre-commit code quality check fails, any trivial fixes will automatically be pushed to your PR unless it is a draft.

Don't hesitate to ask questions on the aeon Discord channel if you have any.

PR CI actions

These checkboxes will add labels to enable/disable CI functionality for this PR. This may not take effect immediately, and a new commit may be required to run the new configuration.

  • Run pre-commit checks for all files
  • Run mypy typecheck tests
  • Run all pytest tests and configurations
  • Run all notebook example tests
  • Run numba-disabled codecov tests
  • Stop automatic pre-commit fixes (always disabled for drafts)
  • Disable numba cache loading
  • Regenerate expected results for testing
  • Push an empty commit to re-run CI checks

@TonyBagnall TonyBagnall changed the title [ENNH] remove prefer="threads" from Parallel [ENH] remove prefer="threads" from Parallel for DrCIF Mar 24, 2026
@TonyBagnall TonyBagnall changed the title [ENH] remove prefer="threads" from Parallel for DrCIF [ENH] remove prefer="threads" from Parallel for BaseIntervalForest Mar 24, 2026
@TonyBagnall TonyBagnall requested a review from dguijo as a code owner March 24, 2026 19:51
@TonyBagnall TonyBagnall changed the title [ENH] remove prefer="threads" from Parallel for BaseIntervalForest [ENH] remove prefer="threads" from Parallel in BaseIntervalForest Mar 24, 2026