Plumb a bulk_open_max_parallelism option. by folded · Pull Request #7 · google-deepmind/bagz

folded · 2026-05-01T04:24:54Z

Plumbs a bulk_open_max_parallelism option for bulk shard opens through the API to filesystem
drivers.

Drivers can set their own defaults, and GCS uses this to lower the default to 32 (while keeping
read parallelism at 100 threads). Empirically this was appropriate for a within-zone bulk open
(1334 shards, 40M records).

On accesses with a longer RTT there is still a benefit to higher parallelism, but it comes at
the cost of a lot of DNS lookup churn within libcurl. On macOS, this causes higher peak fd
usage, leading to opaque curl address resolution failures, lots of retries with slow backoff,
and stochastic failures if the retry budget is exhausted.

The open phase has different concurrency economics from the read phase:

* Each file open issues a fresh metadata request that requires DNS resolution. On macOS, libcurl uses a per-call threaded resolver (CURLRES_THREADED): every getaddrinfo call spawns a fresh pthread for the lookup. Under bursts of concurrent opens at N=100 (the previous hard-coded fan-out), pthread_create() can return EAGAIN, libcurl emits "getaddrinfo() thread failed to start" and surfaces the failure to callers as CURLE_FAILED_INIT (curl error 2). The error is stochastic but reproducible at N >= ~117 shards on a default-configured macOS host.
* The opens are GCS-latency bound; past ~16-32 in flight there is no further wall-clock benefit, only resolver-thread pressure.
* The read phase reuses connections from the GCS client's pool, so it does not create resolver threads at the same rate and can safely run with the existing max_parallelism=100 default.

This commit splits the two by:

* Adding `Options::bulk_open_max_parallelism` (default 16) alongside the existing `max_parallelism` (default 100, retained for the read path).
* Plumbing the new parameter through `file::BulkOpenPRead`, `FileSystem::BulkOpenPRead`, and the GCS / POSIX overrides + mock.
* Routing `BagzReader::Open` to pass `options.bulk_open_max_parallelism` to `file::BulkOpenPRead`.
* Exposing the option in the Python binding (kwargs and attribute).

Values <= 0 mean "use the file-system default" (100), preserving the prior behaviour for any caller not passing the option.

Trace evidence of the failure mode (libcurl 8.19.0, macOS arm64, captured via DYLD_INSERT_LIBRARIES interpose to force CURLOPT_VERBOSE):

* getaddrinfo() thread failed to start
* Could not resolve host: storage.googleapis.com
* closing connection #N

15 thread-spawn failures in a single run × cascading retries until google-cloud-cpp's retry budget exhausts and one operation surfaces as the permanent error to the caller.
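To make the open-phase cap concrete, here is a minimal Python sketch of a bulk open with a bounded number of opens in flight. It is not bagz's implementation: `bulk_open`, `fake_open`, `ConcurrencyProbe`, and `OPEN_PARALLELISM` are hypothetical names, and the sleep stands in for a metadata round trip.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical cap for the open phase; illustrative only, not bagz's API.
OPEN_PARALLELISM = 32


class ConcurrencyProbe:
    """Records the peak number of tasks that ran at the same time."""

    def __init__(self):
        self._lock = threading.Lock()
        self._active = 0
        self.peak = 0

    def __enter__(self):
        with self._lock:
            self._active += 1
            self.peak = max(self.peak, self._active)
        return self

    def __exit__(self, *exc):
        with self._lock:
            self._active -= 1
        return False


def bulk_open(paths, open_one, max_parallelism):
    """Opens every path with at most `max_parallelism` opens in flight."""
    with ThreadPoolExecutor(max_workers=max_parallelism) as pool:
        # map() preserves input order, so result i corresponds to paths[i].
        return list(pool.map(open_one, paths))


probe = ConcurrencyProbe()


def fake_open(path):
    """Stand-in for a per-shard open (assumption: one round trip each)."""
    with probe:
        time.sleep(0.001)
        return f"handle:{path}"


shards = [f"shard-{i:05d}" for i in range(200)]
handles = bulk_open(shards, fake_open, OPEN_PARALLELISM)
print(len(handles), "opened; peak in flight:", probe.peak)
```

The worker count bounds how many opens (and hence, in the scenario above, how many resolver threads) can exist at once. A separate, larger pool could serve the read phase without affecting this cap.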

A parallelism sweep on a same-region GCE n2-standard-8 (1334 shards, 5 iters per setting) found:

| p | min | p50 | p95 |
| --- | --- | --- | --- |
| 16 | 2.23s | 2.25s | 2.34s |
| 32 (floor) | 1.67s | 1.69s | 1.71s |
| 64 | 2.21s | 2.23s | 2.26s |
| 100 | 2.31s | 2.33s | 4.09s |

p=32 is ~25% faster than p=16 and ~25% faster than p=64. Beyond 32 the curve regresses: connection-pool / libcurl-cache contention dominates the residual RTT savings. 16 was a conservative first guess; the data says we have headroom.

Cross-region clients (~140ms RTT macOS->australia-southeast1) still prefer higher parallelism (the latency masks the worker-overhead cost), but 32 is within ~1s of optimum on a 1334-shard open and stays clear of the macOS pthread_create EAGAIN window that fires around p=64+.
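The "~25% faster" figures can be checked against the p50 column of the sweep. The snippet below is only that arithmetic; `p50_seconds` and `time_saved` are names introduced here for illustration.

```python
# p50 wall-clock seconds per parallelism setting, copied from the sweep above.
p50_seconds = {16: 2.25, 32: 1.69, 64: 2.23, 100: 2.33}


def time_saved(baseline, candidate):
    """Fraction of wall-clock time saved by `candidate` relative to `baseline`."""
    return (p50_seconds[baseline] - p50_seconds[candidate]) / p50_seconds[baseline]


# Setting with the lowest p50.
best = min(p50_seconds, key=p50_seconds.get)

for p in sorted(p50_seconds):
    if p != best:
        print(f"p={best} vs p={p}: {time_saved(p, best):.1%} less wall-clock time")
```

Measured this way, p=32 saves about 24.9% against p=16 and about 24.2% against p=64, consistent with the rounded figure quoted above.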

The 32-thread cap is justified by GCS-specific behaviour (DNS-resolution saturation past ~32 in flight, and fd pressure under burst load), so it belongs on the GCS backend rather than as a bagz-level default that silently constrains the POSIX path too. Bagz `bulk_open_max_parallelism` now defaults to 0 ("use the file-system default"), and `GcsFileSystem` picks 32 via a new `kDefaultBulkOpenMaxParallelism` alongside the existing `kDefaultMaxParallelism = 100` for reads.
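The resolution rule described above (a non-positive option defers to the driver) can be sketched as follows. This is an illustration in Python rather than the C++ in the change. The function name is hypothetical, the constant names only echo the ones mentioned above, and the POSIX default of 100 is an assumption based on the earlier "file-system default" statement.

```python
# Constants echoing the names mentioned above; values as stated in the text.
GCS_DEFAULT_MAX_PARALLELISM = 100           # read path
GCS_DEFAULT_BULK_OPEN_MAX_PARALLELISM = 32  # bulk-open path

# Assumption: the POSIX driver keeps the earlier file-system default of 100.
POSIX_DEFAULT_BULK_OPEN_MAX_PARALLELISM = 100


def effective_bulk_open_parallelism(requested, driver_default):
    """Returns the caller's value when positive, else the driver's default."""
    return requested if requested > 0 else driver_default


# Option left at its default of 0: each driver applies its own cap.
gcs_cap = effective_bulk_open_parallelism(0, GCS_DEFAULT_BULK_OPEN_MAX_PARALLELISM)
posix_cap = effective_bulk_open_parallelism(0, POSIX_DEFAULT_BULK_OPEN_MAX_PARALLELISM)

# Explicit value from the caller: it overrides the driver default.
cross_region_cap = effective_bulk_open_parallelism(64, GCS_DEFAULT_BULK_OPEN_MAX_PARALLELISM)

print(gcs_cap, posix_cap, cross_region_cap)
```

Keeping the default on the driver means a caller who never sets the option gets the GCS cap only when reading from GCS, while a cross-region caller can still raise it explicitly.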

folded added 3 commits April 30, 2026 22:03

folded wants to merge 3 commits into google-deepmind:main from folded:fix/bulk-open-parallelism
