Skip to content

ENH: Add free-threaded Python support#1580

Open
HaoZeke wants to merge 3 commits into
airspeed-velocity:mainfrom
HaoZeke:free-threaded-support
Open

ENH: Add free-threaded Python support#1580
HaoZeke wants to merge 3 commits into
airspeed-velocity:mainfrom
HaoZeke:free-threaded-support

Conversation

@HaoZeke
Copy link
Copy Markdown
Member

@HaoZeke HaoZeke commented Mar 8, 2026

Declare Py_mod_gil with Py_MOD_GIL_NOT_USED so the extension does not force-enable the GIL on free-threaded builds. The module holds no global mutable state (all data lives in per-object or per-module-state structs), so running without the GIL should be safe...

Enable free-threaded wheel builds in cibuildwheel so cp313t and cp314t wheels are published.

Closes #1515.

@HaoZeke
Copy link
Copy Markdown
Member Author

HaoZeke commented Mar 8, 2026

cc @ngoldbaum too for the assessment over in #1515 (nvm the failures they're from #1581)

@HaoZeke HaoZeke force-pushed the free-threaded-support branch from 8398e31 to 1e66268 Compare March 8, 2026 22:32
Copy link
Copy Markdown
Contributor

@mattip mattip left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@rgommers
Copy link
Copy Markdown

rgommers commented Mar 9, 2026

It'd probably be good to drop cp313t now, those wheels are no longer useful - cp314t is way better, and we're encouraging new projects to only publish for that. Also cibuildwheel defaults to adding cp314t but not cp313t.

Comment thread asv/_rangemedian.cpp Outdated
public:
Cache() {};
bool get(size_t left, size_t right, double *mu, double *dist) const {
std::lock_guard<std::mutex> lock(mtx_);
Copy link
Copy Markdown

@ngoldbaum ngoldbaum Mar 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you have an attached thread state here, since you don't detach inside RangeMedian_mu_dist, RangeMedian_mu, or RangeMedian_dist. That means this call can possibly lead to deadlocks with the interpreter on both the GIL-enabled and free-threaded build.

To avoid that, you need to detach from the interpreter before locking native mutexes.

That said, I'm not actually sure whether it makes sense to do that whole dance on the GIL-enabled, given that the GIL is held. So then you'd want to conditionally use locking on the free-threaded build, with #ifdef Py_GIL_DISABLED blocks.

I also suspect that despite the existence of the set method -- which I don't think is used at all from Python -- in practice this is a cache that is initalized once and then thereafter not modified, so using a single initialization API would probably make more sense than adding a mutex to the type, since you'd only have the possibility of blocking calls when the cache is created and not thereafter. Note that the C++ call_once API involves blocking calls, so you need to detach from the interpreter before calling any API that can block.

At a higher level - do you actually want to support sharing RangeMedian objects between threads? A simpler approach would be to not support that, replace the blocking calls to lock here with non-blocking try_lock calls. Then, if acquiring the lock fails, you can bubble up an error to Python saying that another thread is already using the RangeMedian.

Then it becomes the responsibility of the Python programmer to not share RangeMedian objects between threads. I don't think asv does that, so you'll never hit that error in practice until some future point when you want to add multithreaded parallelism using RangeMedian instances.

I'd also definitely add a multithreaded test. I bet if you wrote a multithreaded test that reads and writes to a RangeMedian from more than one thread using this current PR, you'd hit deadlocks. IMO it's always worthwhile to write tests for nontrivial code to handle multithreaded parallelism.

HaoZeke added 3 commits April 19, 2026 20:26
Declare Py_mod_gil with Py_MOD_GIL_NOT_USED so the extension does not
force-enable the GIL on free-threaded builds.

Thread safety audit and fixes:
- Add std::mutex to Cache class so concurrent get/set from multiple
  threads on the same RangeMedian object is safe (per ngoldbaum airspeed-velocity#1515)
- Fix __init__ to free old y/cache before reallocating, preventing a
  memory leak on reinit and a data race if __init__ is called
  concurrently on the same object

The remaining state is safe: y vector is read-only after init,
compute_weighted_median uses only stack locals, find_best_partition
uses stack-local vectors, and module state is set once at import.

Enable free-threaded wheel builds in cibuildwheel so cp313t and cp314t
wheels are published.

Closes airspeed-velocity#1515
@HaoZeke HaoZeke force-pushed the free-threaded-support branch from 1e66268 to 8f8b2bb Compare April 20, 2026 00:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support the free-threaded build and publish free-threaded wheels

4 participants