From fe1a6c6a29ca7ba5ec51565b5536341a468744bc Mon Sep 17 00:00:00 2001
From: Aaron Spring
Date: Mon, 23 Feb 2026 16:59:41 +0100
Subject: [PATCH 1/3] Add CLAUDE.md via symlink to AGENTS.md with updated dependencies

---
 AGENTS.md | 214 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 CLAUDE.md |   1 +
 2 files changed, 215 insertions(+)
 create mode 100644 AGENTS.md
 create mode 120000 CLAUDE.md

diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 00000000..977ea091
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,214 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+## Project Overview
+
+**xskillscore** is a Python package for computing forecast verification metrics using xarray. It provides both deterministic and probabilistic forecast verification metrics designed to work with multi-dimensional labeled arrays, with support for Dask parallel computing.
+
+Originally developed to parallelize forecast metrics for multi-model-multi-ensemble forecasts in the SubX project.
+
+## Development Commands
+
+### Testing
+
+Run full test suite:
+```bash
+pytest -n 4 --cov=xskillscore --cov-report=xml --verbose
+```
+
+Run tests for a single file:
+```bash
+pytest xskillscore/tests/test_deterministic.py
+```
+
+Run a specific test:
+```bash
+pytest xskillscore/tests/test_deterministic.py::test_pearson_r -v
+```
+
+Run tests with specific markers:
+```bash
+pytest -m "not slow"  # Skip slow tests
+pytest -m "not network"  # Skip tests requiring network
+```
+
+### Doctests
+
+Run doctests on all modules:
+```bash
+python -m pytest --doctest-modules xskillscore --ignore xskillscore/tests
+```
+
+### Code Quality
+
+Run pre-commit checks:
+```bash
+pre-commit run --all-files
+```
+
+Linting and formatting (via ruff):
+```bash
+ruff check --fix .
+ruff format .
+```
+
+Type checking:
+```bash
+mypy xskillscore
+```
+
+### Documentation
+
+Build documentation:
+```bash
+cd docs
+make html
+```
+
+Test notebooks in documentation:
+```bash
+cd docs
+nbstripout source/*.ipynb
+make -j4 html
+```
+
+### Installation
+
+Install in development mode:
+```bash
+pip install -e .
+```
+
+Install with test dependencies:
+```bash
+pip install -e ".[test]"
+```
+
+Install with all dependencies:
+```bash
+pip install -e ".[complete]"
+```
+
+## Architecture
+
+### Core Module Structure
+
+The `xskillscore/core/` directory contains the main implementation:
+
+- **deterministic.py**: Deterministic forecast metrics (pearson_r, rmse, mae, mse, etc.)
+- **probabilistic.py**: Probabilistic metrics (crps_*, brier_score, rps, rank_histogram, etc.)
+- **comparative.py**: Comparative tests (sign_test, halfwidth_ci_test)
+- **stattests.py**: Statistical tests (multipletests)
+- **contingency.py**: Contingency table class and categorical metrics
+- **resampling.py**: Resampling and bootstrapping utilities
+- **accessor.py**: xarray accessor (`ds.xs.metric()`) for convenient API
+- **utils.py**: Shared utilities for preprocessing dimensions, weights, and broadcasting
+- **np_deterministic.py**: NumPy implementations of deterministic metrics
+- **np_probabilistic.py**: NumPy implementations of probabilistic metrics
+- **types.py**: Type definitions
+
+### Key Design Patterns
+
+1. **xarray.apply_ufunc Pattern**: All metrics use `xr.apply_ufunc` to:
+   - Apply NumPy implementations to xarray objects
+   - Handle broadcasting automatically
+   - Enable Dask parallelization with `dask="parallelized"`
+   - Preserve attributes with `keep_attrs` parameter
+
+2. **Dimension Preprocessing**: Metrics follow this pattern:
+   ```python
+   dim, axis = _preprocess_dims(dim, a)  # Convert dim to list and axis tuple
+   a, b = xr.broadcast(a, b, exclude=dim)  # Broadcast arrays
+   a, b, new_dim, weights = _stack_input_if_needed(a, b, dim, weights)  # Stack multi-dims
+   weights = _preprocess_weights(a, dim, new_dim, weights)  # Normalize weights
+   ```
+
+3. **Separation of xarray and NumPy logic**:
+   - High-level functions in `deterministic.py`/`probabilistic.py` handle xarray objects
+   - Low-level functions in `np_deterministic.py`/`np_probabilistic.py` contain pure NumPy logic
+   - This enables easier testing and reuse
+
+4. **Optional Weights**: Most metrics support optional `weights` parameter matching the dimensions being reduced.
+
+5. **Member Dimension Convention**: Probabilistic metrics use `member_dim="member"` by default for ensemble dimensions.
+
+### xarray Accessor
+
+Users can access metrics via the `.xs` accessor on xarray Datasets:
+```python
+ds = xr.Dataset({"a": a_dataarray, "b": b_dataarray})
+result = ds.xs.pearson_r("a", "b", dim="time")
+```
+
+The accessor handles converting string variable names to actual DataArrays.
+
+### Testing Infrastructure
+
+- **conftest.py**: Centralized pytest fixtures for test data (times, lats, lons, members, etc.)
+- Test fixtures provide consistent test data across test modules
+- Fixtures include regular data, NaN-masked data, dask-chunked data, and 1D timeseries
+- Use `np.random.seed(42)` in doctests for deterministic examples
+
+## Important Considerations
+
+### Temporal Metrics
+
+Some metrics are specifically designed for temporal dimensions:
+- `effective_sample_size()`, `pearson_r_eff_p_value()`, `spearman_r_eff_p_value()`
+- These raise warnings if applied to non-"time" dimensions
+- They account for autocorrelation and should only be used on time series
+
+### NumPy Version Compatibility
+
+The codebase supports both numpy<2.0 and numpy>=2.0. When using NumPy functions:
+- Use try/except for imports that changed between versions
+- Example: `trapezoid` (new) vs `trapz` (old)
+
+### Dimension Handling
+
+- `dim=None` means reduce over all dimensions
+- `dim` can be a string or list of strings
+- When multiple dimensions are provided, they are stacked into a single dimension internally
+- The `member` dimension in probabilistic forecasts is special and should not be included in `dim`
+
+### NaN Handling
+
+- Most metrics support `skipna` parameter (default: False)
+- Probabilistic metrics use `_keep_nans_masked()` to preserve NaN patterns from inputs
+
+### Dask Support
+
+All metrics support Dask arrays via `dask="parallelized"` in `xr.apply_ufunc`. No special handling needed when adding new metrics.
+
+## Python Support
+
+- Minimum Python version: 3.9
+- Supported versions: 3.9, 3.10, 3.11, 3.12, 3.13
+
+## Key Dependencies
+
+- xarray >= 2023.4.0 (core data structure)
+- numpy >= 1.25
+- scipy >= 1.10
+- dask[array] >= 2023.4.0 (parallel computing)
+- properscoring (probabilistic metrics)
+- xhistogram >= 0.3.2 (histogram computations)
+- statsmodels (statistical tests)
+
+Optional acceleration:
+- bottleneck (faster NaN operations)
+- numba >= 0.57 (JIT compilation)
+
+## Contributing Workflow
+
+1. Create a new branch for your feature
+2. Make changes and add tests in `xskillscore/tests/`
+3. Add docstring examples (they are tested via doctest)
+4. Run `pre-commit run --all-files` before committing
+5. Ensure tests pass: `pytest -n 4`
+6. Ensure doctests pass: `python -m pytest --doctest-modules xskillscore --ignore xskillscore/tests`
+7. Update CHANGELOG.rst if appropriate
+8. Submit PR to main branch
+
+Note: CI includes tests on multiple Python versions, doctest validation, and notebook execution in docs.
diff --git a/CLAUDE.md b/CLAUDE.md
new file mode 120000
index 00000000..47dc3e3d
--- /dev/null
+++ b/CLAUDE.md
@@ -0,0 +1 @@
+AGENTS.md
\ No newline at end of file

From 71806d5cb270c64dd773588a03d76822c2ada941 Mon Sep 17 00:00:00 2001
From: Aaron Spring
Date: Mon, 23 Feb 2026 17:03:41 +0100
Subject: [PATCH 2/3] Update to reference AI coding agents and add climpred

---
 AGENTS.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/AGENTS.md b/AGENTS.md
index 977ea091..08677894 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -1,6 +1,6 @@
 # CLAUDE.md
 
-This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+This file provides guidance to AI coding agents when working with code in this repository.
 
 ## Project Overview
 
@@ -8,6 +8,8 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
 
 Originally developed to parallelize forecast metrics for multi-model-multi-ensemble forecasts in the SubX project.
 
+**Related Projects**: [climpred](https://github.com/pangeo-data/climpred) is a key consumer of xskillscore, providing higher-level prediction skill assessment workflows.
+
 ## Development Commands
 
 ### Testing

From 0054646079421b4502a9db14e1a37712950ccb6d Mon Sep 17 00:00:00 2001
From: Aaron Spring
Date: Mon, 23 Feb 2026 17:04:55 +0100
Subject: [PATCH 3/3] Use pytest -n auto instead of fixed core count

---
 .github/workflows/xskillscore_testing.yml | 2 +-
 AGENTS.md                                 | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/.github/workflows/xskillscore_testing.yml b/.github/workflows/xskillscore_testing.yml
index 2058f915..cc1418f6 100644
--- a/.github/workflows/xskillscore_testing.yml
+++ b/.github/workflows/xskillscore_testing.yml
@@ -57,7 +57,7 @@ jobs:
           micromamba install numpy==1.24
       - name: Run tests
         run: |
-          pytest -n 4 --cov=xskillscore --cov-report=xml --verbose
+          pytest -n auto --cov=xskillscore --cov-report=xml --verbose
       - name: Upload coverage to codecov
         uses: codecov/codecov-action@v5
         with:
diff --git a/AGENTS.md b/AGENTS.md
index 08677894..a65a9ed3 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -16,7 +16,7 @@ Originally developed to parallelize forecast metrics for multi-model-multi-ensem
 
 Run full test suite:
 ```bash
-pytest -n 4 --cov=xskillscore --cov-report=xml --verbose
+pytest -n auto --cov=xskillscore --cov-report=xml --verbose
 ```
 
 Run tests for a single file:
@@ -208,7 +208,7 @@ Optional acceleration:
 2. Make changes and add tests in `xskillscore/tests/`
 3. Add docstring examples (they are tested via doctest)
 4. Run `pre-commit run --all-files` before committing
-5. Ensure tests pass: `pytest -n 4`
+5. Ensure tests pass: `pytest -n auto`
 6. Ensure doctests pass: `python -m pytest --doctest-modules xskillscore --ignore xskillscore/tests`
 7. Update CHANGELOG.rst if appropriate
 8. Submit PR to main branch