feat: add cadcat_preview catalog for sup3rcc preview datasets #757

Draft: neilSchroeder wants to merge 1 commit into main from feature/cadcat-preview-catalog

neilSchroeder (Collaborator) commented Apr 24, 2026

Summary of changes and related issue

Adds a new 'cadcat_preview' catalog alongside cadcat, renewables, and hdp for internal/pre-release datasets. The first member is sup3rcc (NREL super-resolved 4 km CONUS data; EC-Earth3-CC historical + ssp245, 13 variables).

  • New CATALOG_CADCAT_PREVIEW constant and CADCAT_PREVIEW_CATALOG_URL path
  • Bundled intake-esm collection (cae-preview-collection.json + cae-preview-zarr.csv) under climakitae/data/; resolved via _package_file_path at load time. The URL may later be switched to an s3:// URL so that AWS credentials gate access.
  • DataCatalog loads the preview catalog and fails silently (logging only at logger.debug) so that users without access to internal data see no warnings; it also adds a .preview property and includes the catalog in merge_catalogs, mirroring the HDP/renewables patterns.
  • New CadcatPreviewValidator opts out of the warming_level, bias_adjust_model_to_station, and filter_unadjusted_models processors, which are cadcat/WRF/LOCA2-specific. It uses the same required keys: activity_id/table_id/grid_label/variable_id.
  • variable_descriptions.csv: 9 sup3r-unique vars (swddif, swddni, swdnb, wd10/100/200, ws10/100/200). Shared vars (t2, prec, psfc, rh) reuse existing Dynamical rows via first-match lookup semantics.
  • 15 new validator tests + updated data_access tests for the 4th catalog.
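
For reviewers, here is a minimal sketch of two behaviours described above: the silent-failure load and the first-match lookup for shared variables. It is not the code in this PR. The helper names (load_optional_catalog, first_match), the CSV column names, and the sample rows are all hypothetical and are there only to illustrate the intent.

```python
import csv
import io
import logging

logger = logging.getLogger(__name__)


def load_optional_catalog(opener, name):
    """Return the catalog from opener(), or None if it cannot be loaded.

    Failures are logged at DEBUG level only, so users without access
    see no warning. `opener` and `name` are hypothetical stand-ins.
    """
    try:
        return opener()
    except Exception as exc:  # broad on purpose: any failure means "not available"
        logger.debug("Catalog %r unavailable: %s", name, exc)
        return None


def first_match(rows, variable_id):
    """Return the first row whose variable_id matches, or None."""
    return next((row for row in rows if row["variable_id"] == variable_id), None)


# Hypothetical excerpt of a variable description table; columns are illustrative.
SAMPLE_CSV = """variable_id,downscaling_method,display_name
t2,Dynamical,Air Temperature at 2m
ws10,Preview,Wind speed at 10m
t2,Preview,Duplicate row that is never reached
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))


def _denied():
    # Simulates a user without access to the internal data.
    raise PermissionError("no access to internal data")


preview = load_optional_catalog(_denied, "cadcat_preview")
shared = first_match(rows, "t2")
```

With first-match semantics a shared variable such as t2 resolves to the earlier Dynamical row, so no duplicate rows are needed for the preview catalog. Only the sup3r-unique variables need new rows.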

Testing

  • Unit tests written for new/modified code (goal: 80% coverage)
    • All public functions have unit tests
    • Functions that must produce specific values have unit tests
  • Verified that notebooks utilizing affected functions still work
  • Appropriate manual testing completed
  • Advanced Testing label added to PR if this PR makes major changes to the codebase

How to Test

sup3r_preview_demo.ipynb
Download the notebook above, check out this branch, and install it locally with pip install -e . Then try loading and manipulating the Sup3rCC dataset with climakitae.

Documentation

  • Complex code includes comments explaining logic
  • Function documentation created (docstrings)

Code Quality

  • Follows PEP8 naming and style conventions
  • Helper functions prefixed with underscore _
  • Linting completed and all issues resolved
  • Does not replicate existing functionality
  • Aligns with general coding standards of existing codebase
  • Code generalized for multiple uses (unless too complex/time-intensive)

Review Process

  • PR review instructions provided:
    • Type of review requested (scientific/technical/debugging)
    • Level of review detail needed

Administrative Reminders

  • Jira ticket moved to "Review" when PR created
  • Jira ticket will be moved to "Done" when complete
  • PR branch will be deleted after merge
