
Support NamedTuple-valued diagnostics #172

Open
haakon-e wants to merge 1 commit into main from he/support-namedtuple-valued

@haakon-e
Member

Purpose

Lets a compute function that returns a Field of NamedTuples be registered as a single DiagnosticVariable and fan out into one scalar NetCDF variable per component, automatically. Motivated by an upcoming CloudMicrophysics debug mode in ClimaAtmos, which produces per-process tendencies as a NamedTuple at every grid point — previously this required N hand-written compute callbacks (one per process) and N DiagnosticVariable registrations to surface in NetCDF.

Example

A compute returning a Field{@NamedTuple{cond, evap, accr}}:

compute_micro(state, cache, time) = cache.microphysics_tendencies

DiagnosticVariable(;
    short_name = "clw_tend",     # stem, not the variable name
    long_name  = "cloud liquid water tendency",
    units      = "kg kg⁻¹ s⁻¹",  # or per-component:
                                  # (; cond = ..., evap = ..., accr = ...)
    compute    = compute_micro,
)

yields three NetCDF variables in a single file — clw_tend_cond, clw_tend_evap, clw_tend_accr (with long_name "cloud liquid water tendency (cond)", etc.). Nested NamedTuples flatten recursively: @NamedTuple{warm::@NamedTuple{au, ac}, tot} with short_name = "proc" gives proc_warm_au, proc_warm_ac, proc_tot.
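The naming rule can be illustrated in plain Julia without ClimaCore. This is a minimal sketch, not the PR's code: `leaf_chains` and `variable_name` are hypothetical helpers invented here, and the sketch walks a `NamedTuple` type rather than a `Field`.

```julia
# Hypothetical helper: collect the chain of field names leading to every leaf
# of a (possibly nested) NamedTuple type. Anything that is not a NamedTuple
# is treated as a leaf.
leaf_chains(::Type{T}, prefix = ()) where {T} = [prefix]
function leaf_chains(::Type{T}, prefix = ()) where {T <: NamedTuple}
    chains = Tuple[]
    for (name, S) in zip(fieldnames(T), fieldtypes(T))
        append!(chains, leaf_chains(S, (prefix..., name)))
    end
    return chains
end

# Hypothetical helper: join the stem and the chain with underscores,
# following the `<short_name>_<chain...>` pattern described above.
variable_name(short_name, chain) = join((short_name, chain...), "_")

T = typeof((; warm = (; au = 0.0, ac = 0.0), tot = 0.0))
names = [variable_name("proc", chain) for chain in leaf_chains(T)]
# names == ["proc_warm_au", "proc_warm_ac", "proc_tot"]
```

The leaf order follows the field order of the `NamedTuple`, which is why `proc_warm_au` comes before `proc_tot`.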

Scalar diagnostics, reductions (incl. time averaging), and the HDF5Writer / DictWriter paths are unchanged. See the new NamedTuple diagnostics docs page for the full API.

Content

  • Per-component fan-out at the NetCDFWriter boundary (scalar remapper reused, one shared time index per write)
  • Per-component units accepting String | NamedTuple | Function
  • New docs page + NEWS entry, bump to v0.3.6
  • Tests: NetCDF fan-out (flat / nested / scalar), per-component units, PointSpace, HDF5/Dict pass-through, end-to-end NamedTuple + reduction through DiagnosticsHandler, units resolvers

  • I have read and checked the items on the review checklist.

A diagnostic whose `compute`/`compute!` returns a `Field` of `NamedTuple`s
is now written as one scalar NetCDF variable per (possibly nested) leaf,
named `<short_name>_<chain...>`. A single compute function (e.g.
CloudMicrophysics debug-mode process tendencies) fans out into per-process
diagnostics automatically.

The split happens only at the writer boundary: storage, accumulators and
reductions already broadcast per-leaf through ClimaCore's `RecursiveApply`.
The remapper is scalar-only, so the NetCDF writer iterates
`Fields.property_chains` and feeds zero-copy `single_field` views through
the existing remap path. Triggered by `eltype(field) <: NamedTuple`, so
scalar and `Geometry.AxisVector` diagnostics are unaffected. `HDF5Writer`
and `DictWriter` store the field whole.
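As a rough picture of the split at the writer boundary, the sketch below uses a plain `Vector` of `NamedTuple`s in place of a ClimaCore `Field`. The helpers `get_leaf` and `split_components` are hypothetical, and unlike the zero-copy `single_field` views mentioned above, this version copies each component into a new array.

```julia
# Follow a chain of field names into a (possibly nested) NamedTuple value.
get_leaf(x, chain) = foldl(getproperty, chain; init = x)

# Hypothetical stand-in for the per-leaf loop: one scalar array per chain.
# Note: `map` allocates a copy here; it is not a view into `data`.
function split_components(data, chains)
    return Dict(chain => map(x -> get_leaf(x, chain), data) for chain in chains)
end

# Stand-in for a field with eltype <: NamedTuple.
data = [(; cond = 1.0 * i, warm = (; au = 10.0 * i)) for i in 1:3]
chains = [(:cond,), (:warm, :au)]

parts = split_components(data, chains)
# parts[(:cond,)]     == [1.0, 2.0, 3.0]
# parts[(:warm, :au)] == [10.0, 20.0, 30.0]
```

Each resulting scalar array could then be handed to a scalar-only consumer one at a time, which is the role the existing remap path plays in the description above.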

`DiagnosticVariable(; units = ...)` now also accepts a `NamedTuple`
(per-component) or a `Function(property_chain)`; a plain `String` is
applied to every component (unchanged). Misspecified components emit a
one-time warning at handler construction; an explicit `""` is silent.
`component_units` guards on `key isa Symbol` so an `Int` chain element
(a `Tuple`/`NTuple` index) cannot positional-index the units `NamedTuple`.
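The three accepted forms of `units` can be sketched with multiple dispatch. This is an illustration only: `resolve_units` is a hypothetical stand-in and not the PR's `component_units`, and returning `nothing` for a missing component is an assumption made for this sketch.

```julia
# A plain string applies to every component.
resolve_units(units::AbstractString, chain) = units

# A function receives the property chain and returns the units.
resolve_units(units::Function, chain) = units(chain)

# A NamedTuple is walked key by key along the chain.
function resolve_units(units::NamedTuple, chain)
    current = units
    for key in chain
        # Only Symbol keys may index the NamedTuple; an Int key would
        # otherwise index it by position and pick an unrelated entry.
        (key isa Symbol && current isa NamedTuple && haskey(current, key)) ||
            return nothing
        current = current[key]
    end
    # Assumption: a chain that stops at an inner NamedTuple has no units.
    return current isa AbstractString ? current : nothing
end

units = (; cond = "kg kg⁻¹ s⁻¹", warm = (; au = "s⁻¹"))
resolve_units(units, (:cond,))      # "kg kg⁻¹ s⁻¹"
resolve_units(units, (:warm, :au))  # "s⁻¹"
resolve_units(units, (:evap,))      # nothing (component not specified)
resolve_units(units, (1,))          # nothing (Int key is rejected)
```

A caller could map `nothing` to an empty string before writing the attribute, and warn once for components that were left out.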

Copilot AI left a comment

Pull request overview

This PR adds first-class support for diagnostics whose compute function returns a Field of NamedTuples, enabling automatic fan-out into one scalar NetCDF variable per component while keeping other writer paths unchanged.

Changes:

  • Implement NetCDFWriter fan-out for NamedTuple-valued fields (including nested NamedTuples) with shared time-index handling across components.
  • Extend DiagnosticVariable.units to support per-component resolution via NamedTuple or property_chain -> units function, and propagate this to writers (NetCDF per-leaf; HDF5 flattened string).
  • Add docs + NEWS + version bump and comprehensive tests for NetCDF/HDF5/Dict behavior and end-to-end reductions.

Reviewed changes

Copilot reviewed 11 out of 12 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
test/writers.jl Adds integration-style tests for NetCDF fan-out (flat/nested), PointSpace behavior, and HDF5/Dict pass-through.
test/diagnostics.jl Adds end-to-end handler-driven reduction test for NamedTuple-eltype diagnostics.
test/diagnostic_variable.jl Adds unit-resolution tests for per-component units helpers and non-string units acceptance.
src/Writers.jl Imports unit-resolution helpers for use by writer implementations.
src/netcdf_writer.jl Implements NamedTuple component splitting, per-component metadata, and shared-time-index write logic.
src/hdf5_writer.jl Writes variable_units via new flattened per-component units rendering.
src/DiagnosticVariables.jl Broadens units type and introduces component_units, units_attribute, and units_warnings.
src/clima_diagnostics.jl Emits one-time missing-units warnings for NamedTuple diagnostics; uses new PointSpace prealloc helper.
Project.toml Bumps version to v0.3.6.
NEWS.md Documents the new NamedTuple diagnostics feature.
docs/src/namedtuple_diagnostics.md New user docs page explaining fan-out semantics, naming, and per-component units.
docs/make.jl Adds the new docs page to the documentation navigation.

component_units(units::AbstractString, _) = units
component_units(units, property_chain) = units(property_chain)
function component_units(units::NamedTuple, property_chain)
    isempty(property_chain) && return units
Comment on lines +285 to +290
"""
units_attribute(units::AbstractString, _) = units
function units_attribute(units, property_chains)
    labeled = map(property_chains) do chain
        u = something(component_units(units, chain), "")
        "$(join(string.(chain), '.')): $u"
Comment thread NEWS.md
Comment on lines +24 to +27
`units` (and `long_name`) may now be specified per component, either as a
`NamedTuple` mapping each component to its units, or as a function
`property_chain -> units`. A plain `String` is still applied to every
component.
Comment thread src/netcdf_writer.jl
diagnostic,
var.short_name,
output_long_name(diagnostic),
string(var.units),
@ph-kev
Member

ph-kev commented May 20, 2026

I have only skimmed through this PR, but I do not think this functionality should be supported. It introduces extra complexity for future development of ClimaDiagnostics, since developers would need to account for this case too, and it adds maintenance cost. I am also not sure that the proposed solution is the best one for your problem.

Also, this is going to break ClimaAnalysis.SimDir since it expects only one variable per NetCDF file.

If the problem is hand-writing N compute callbacks and N DiagnosticVariable registrations, I would write a macro or use metaprogramming with @eval for that. For example, ClimaLand defines this macro for simple compute functions. I am not sure how complicated the compute functions are for CloudMicrophysics, but if they return Fields of NamedTuples, I would imagine this would work well.
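A minimal sketch of what the `@eval` approach might look like, assuming the component names are known up front. This is not the ClimaLand macro; the `compute_clw_tend_*` function names, the `cache.microphysics_tendencies` layout, and the use of a plain `Vector` in place of a `Field` are assumptions for illustration.

```julia
# Generate one compute function per component at top level.
for name in (:cond, :evap, :accr)
    fname = Symbol(:compute_clw_tend_, name)
    @eval function $fname(state, cache, time)
        # Extract a single component from every element of the
        # NamedTuple-valued data (copies into a new array).
        return map(
            nt -> getproperty(nt, $(QuoteNode(name))),
            cache.microphysics_tendencies,
        )
    end
end

# Stand-in cache: a Vector of NamedTuples instead of a ClimaCore Field.
cache = (;
    microphysics_tendencies = [
        (; cond = 1.0, evap = -2.0, accr = 0.5),
        (; cond = 3.0, evap = -4.0, accr = 1.5),
    ],
)

compute_clw_tend_evap(nothing, cache, 0.0)  # [-2.0, -4.0]
```

Each generated function could then be passed as the `compute` of its own scalar `DiagnosticVariable`, keeping one variable per NetCDF file at the cost of one registration per component.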


3 participants