Combine distribution fit and bootstrap CI into a single apply_ufunc pass in metric_calc 1-in-X #756

@neilSchroeder

Context

PR #753 added bootstrap-based confidence intervals to the 1-in-X return value analysis in climakitae/new_core/processors/metric_calc.py. As currently implemented, the CI path invokes xr.apply_ufunc twice per simulation batch (and per spatial chunk in the batched path): once for the point-estimate distribution fit, and a second time for the bootstrap CIs.

This works correctly but repeats orchestration work: coordinate alignment, dtype promotion, and the full vectorize=True traversal of the grid all happen twice.

Proposal

Collapse the two passes into one by introducing a combined inner helper that returns (return_data, ci_lower, ci_upper, p_value) per 1-D block-maxima series, then call apply_ufunc once with four outputs: three along the one_in_x core dimension plus a scalar p_value.

Sketch:

```python
def _fit_and_conf_int_1d(
    self,
    block_maxima_1d,
    *,
    return_periods=UNSET,
    return_values=UNSET,
    distr="gev",
    block_size=1,
    extremes_type="max",
    get_p_value=False,
    compute_conf_int=False,
    bootstrap_runs=100,
    conf_int_lower_bound=2.5,
    conf_int_upper_bound=97.5,
):
    # Point estimate (and optional p-value) for a single 1-D series.
    return_data, p_value = self._fit_return_variable_1d(...)
    if not compute_conf_int:
        # Keep the four-output contract: fill the CI slots with NaN.
        nan_like = np.full_like(return_data, np.nan, dtype=float)
        return return_data, nan_like, nan_like, p_value
    # Bootstrap CIs for the same series, computed in the same pass.
    ci_lower, ci_upper = self._conf_int(...)
    return return_data, ci_lower, ci_upper, p_value
```
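To make the four-output contract concrete, here is a minimal standalone sketch of such a helper. It is not the climakitae implementation: `fit_and_conf_int_1d`, its `seed` argument, and the use of empirical quantiles in place of a GEV fit are all assumptions made for illustration, and the p-value is simply left as NaN.

```python
import numpy as np


def fit_and_conf_int_1d(
    block_maxima_1d,
    return_periods,
    compute_conf_int=False,
    bootstrap_runs=100,
    conf_int_lower_bound=2.5,
    conf_int_upper_bound=97.5,
    seed=0,  # hypothetical argument, only here to make the sketch reproducible
):
    """Toy stand-in for a combined fit + CI helper (illustrative only)."""
    x = np.asarray(block_maxima_1d, dtype=float)
    x = x[~np.isnan(x)]

    # Non-exceedance probability for each return period: 1 - 1/T.
    probs = 1.0 - 1.0 / np.asarray(return_periods, dtype=float)

    # Stand-in "fit": empirical quantiles instead of a GEV distribution fit.
    return_data = np.quantile(x, probs)
    p_value = np.nan  # goodness-of-fit test omitted in this sketch

    if not compute_conf_int:
        # Same output shape either way, so the caller always unpacks four values.
        nan_like = np.full_like(return_data, np.nan)
        return return_data, nan_like, nan_like.copy(), p_value

    # Bootstrap: resample with replacement and redo the same "fit" per draw.
    rng = np.random.default_rng(seed)
    draws = rng.choice(x, size=(bootstrap_runs, x.size), replace=True)
    boot = np.quantile(draws, probs, axis=1)  # shape: (n_periods, bootstrap_runs)

    ci_lower = np.percentile(boot, conf_int_lower_bound, axis=1)
    ci_upper = np.percentile(boot, conf_int_upper_bound, axis=1)
    return return_data, ci_lower, ci_upper, p_value
```

Because the point estimate and every bootstrap draw go through the same quantile step, the two cannot be computed under different settings, which is the property the combined helper is meant to guarantee.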

Then in `_fit_distributions_vectorized` and `_fit_with_early_spatial_batching`, replace the two `apply_ufunc` calls with a single call using:

```python
output_core_dims=[["one_in_x"], ["one_in_x"], ["one_in_x"], []]
```
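As a rough illustration of the single-call dispatch, the sketch below runs one `xr.apply_ufunc` pass over a small synthetic grid and unpacks four outputs. The helper `toy_fit_1d`, the synthetic data, and the dimension names `time`, `y`, and `x` are assumptions for the example; the real helper would be the combined fit and CI function proposed above.

```python
import numpy as np
import xarray as xr


def toy_fit_1d(series, probs):
    """Hypothetical stand-in returning (values, ci_lower, ci_upper, p_value)."""
    values = np.quantile(series, probs)
    spread = series.std()  # placeholder for a real bootstrap interval
    return values, values - spread, values + spread, np.float64(np.nan)


# Synthetic block maxima: 30 blocks on a 2 x 3 grid.
rng = np.random.default_rng(0)
block_maxima = xr.DataArray(
    rng.gumbel(size=(30, 2, 3)),
    dims=("time", "y", "x"),
)

return_periods = np.array([10.0, 50.0, 100.0])
probs = 1.0 - 1.0 / return_periods

# One pass over the grid: three outputs carry the new "one_in_x" core
# dimension, and the fourth (p_value) is a scalar per grid cell.
return_data, ci_lower, ci_upper, p_value = xr.apply_ufunc(
    toy_fit_1d,
    block_maxima,
    kwargs={"probs": probs},
    input_core_dims=[["time"]],
    output_core_dims=[["one_in_x"], ["one_in_x"], ["one_in_x"], []],
    vectorize=True,
)

# Label the new dimension with the return periods it represents.
return_data = return_data.assign_coords(one_in_x=return_periods)
```

With `vectorize=True`, the helper is called once per grid cell and all four results are assembled in the same traversal, so coordinate alignment and dtype handling happen only once.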

Why

  • One pass over the grid instead of two — removes duplicated orchestration cost, which is meaningful at d03 resolution × many sims × 100 bootstrap runs.
  • Point estimates and bootstrap draws share the same fit path, so the two can't diverge if the fit logic changes later.
  • `_bootstrap` and `_conf_int` remain intact as tested units; only the dispatch changes.

Scope

Out of scope for PR #753 — tracked here as follow-up performance/refactor work.
