Add timestep_hours aggregator config option#883

Open
mcgibbon wants to merge 5 commits into main from feature/configure_5_day
Conversation

@mcgibbon
Contributor

@mcgibbon mcgibbon commented Feb 27, 2026

This PR allows setting timestep_hours on the InferenceEvaluatorAggregatorConfig, which, if set to a value other than 6, modifies the step selected for "step_20" metrics.

This does mean "step 20" is a misnomer in these cases. While not ideal, renaming the metric is also problematic because in coupled modelling we rely on reporting step-20 ocean metrics. Breaking that would be worse than having to remember and communicate, for our daily-step runs (which are still preliminary), that "step 20" really means "day 5" (which we generally understand, internally). We should rectify this in a later PR that adds more configurability for selecting the desired lead time.

Changes:

  • Added timestep_hours to InferenceEvaluatorAggregatorConfig, which, if set to a value other than 6, modifies the step selected for "step_20" metrics.

  • Tests added
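The step arithmetic at play can be sketched as follows. This is a minimal illustration, not the actual fme implementation; the property name `weather_eval_step` is hypothetical, but the "5 days of evolution" target follows from the PR description (5 days × 24 h ÷ 6 h per step = step 20):

```python
import dataclasses


@dataclasses.dataclass
class InferenceEvaluatorAggregatorConfig:
    """Illustrative sketch of the proposed option, not the real class."""

    timestep_hours: int = 6

    @property
    def weather_eval_step(self) -> int:
        # "step_20" metrics target 5 days of evolution:
        # 5 days * 24 h / timestep_hours hours per step.
        return (5 * 24) // self.timestep_hours


default = InferenceEvaluatorAggregatorConfig()  # 6-hourly data
assert default.weather_eval_step == 20

daily = InferenceEvaluatorAggregatorConfig(timestep_hours=24)
assert daily.weather_eval_step == 5  # "step 20" now really means step 5
```

This makes concrete why "step 20" becomes a misnomer for daily-step runs: the selected step is 5, while the metric name stays the same.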

@mcgibbon mcgibbon marked this pull request as ready for review February 27, 2026 20:28
Collaborator

@oliverwm1 oliverwm1 left a comment

Instead of having the user configure the dataset timestep—which is a property of the dataset—how about having the user configure the number of steps at which the RMSE is computed (which would default to 20, maintaining backwards compatibility)?

Comment on lines +147 to +148
timestep_hours: Timestep of the data in hours, used for determining
which timestep corresponds to 5 days of evolution.
Collaborator

Should we be requiring the user to configure this, when it is really just a property of the dataset? I don't think there is anywhere else in the codebase where this is a user configurable option.

@mcgibbon
Contributor Author

mcgibbon commented Mar 2, 2026

Instead of having the user configure the dataset timestep—which is a property of the dataset—how about having the user configure the number of steps at which the RMSE is computed (which would default to 20, maintaining backwards compatibility)?

Hmmm yeah that's a good idea. I didn't like it originally because configuring integer step counts like this is a lot more finicky/error-prone than configuring timestep lengths. But it would resolve the issue of aligning this change with ocean inference without lying. The only thing I'm still not sure about is the name, as I do want to compare 5-step inference against 20-step inference on their respective runs, and we have places in our code where we explicitly look for the "mean_step_20" name.

@mcgibbon
Contributor Author

mcgibbon commented Mar 2, 2026

How about adding a weather_eval_step: int = None configuration option which, if not given, keeps the mean_step_20 metrics, or, if given, replaces them with a weather_step set of metrics using the indicated forward step? This keeps the current default behavior, but lets me put new daily-step and 6h-step runs under the same metric.
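The naming behavior proposed above can be sketched like this. All names here (`AggregatorConfig`, `metric_name`, the metric key format) are illustrative assumptions, not the actual fme API:

```python
import dataclasses
from typing import Optional


@dataclasses.dataclass
class AggregatorConfig:
    """Hypothetical sketch of the weather_eval_step proposal."""

    weather_eval_step: Optional[int] = None

    def metric_name(self, var: str) -> str:
        if self.weather_eval_step is None:
            # Default: keep the existing mean_step_20 metric names,
            # preserving backwards compatibility.
            return f"mean_step_20/{var}"
        # Explicit step: report under a step-agnostic "weather_step" key,
        # so daily-step and 6h-step runs land on the same metric name.
        return f"weather_step/{var}"


assert AggregatorConfig().metric_name("rmse") == "mean_step_20/rmse"
assert AggregatorConfig(weather_eval_step=5).metric_name("rmse") == "weather_step/rmse"
```

The step-agnostic key is what lets runs with different timesteps share one wandb panel without the metric name lying about the step count.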

@jpdunc23
Member

jpdunc23 commented Mar 2, 2026

How about adding a weather_eval_step: int = None configuration option which, if not given, keeps the mean_step_20 metrics, or, if given, replaces them with a weather_step set of metrics using the indicated forward step? This keeps the current default behavior, but lets me put new daily-step and 6h-step runs under the same metric.

If not too expensive, I think it would be nice to keep the mean_step_20 metrics even when weather_eval_step is provided.

@oliverwm1
Collaborator

as I do want to compare 5-step inference against 20-step inference on their respective runs

It is easy enough to make wandb panels comparing two metrics with different names, so I don't think this should be a blocker

@mcgibbon
Contributor Author

mcgibbon commented Mar 2, 2026

as I do want to compare 5-step inference against 20-step inference on their respective runs

It is easy enough to make wandb panels comparing two metrics with different names, so I don't think this should be a blocker

Creating ~50 panels, one for each output variable, is onerous enough that in practice I won't look at this comparison, except by flipping back and forth for a few key variables.

@mcgibbon
Contributor Author

mcgibbon commented Mar 2, 2026

If not too expensive, I think it would be nice to keep the mean_step_20 metrics even when weather_eval_step is provided.

What's the use case for this? Aside from the expense, I'd like to avoid doubling the number of wandb keys I need to flip through related to these metrics. Depending on the use case, it also may make more sense to allow weather_eval_steps: list[int] instead.
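The weather_eval_steps: list[int] alternative mentioned above might look like the following sketch. The class and method names are hypothetical, and the metric key format is assumed to follow the existing "mean_step_20" convention:

```python
import dataclasses


@dataclasses.dataclass
class MultiStepAggregatorConfig:
    """Illustrative sketch of the weather_eval_steps: list[int] alternative."""

    # Defaulting to [20] keeps the existing mean_step_20 behavior.
    weather_eval_steps: list[int] = dataclasses.field(default_factory=lambda: [20])

    def metric_names(self, var: str) -> list[str]:
        # One metric key per requested forward step.
        return [f"mean_step_{step}/{var}" for step in self.weather_eval_steps]


cfg = MultiStepAggregatorConfig(weather_eval_steps=[5, 20])
assert cfg.metric_names("rmse") == ["mean_step_5/rmse", "mean_step_20/rmse"]
assert MultiStepAggregatorConfig().metric_names("rmse") == ["mean_step_20/rmse"]
```

This variant would cover jpdunc23's request (keep mean_step_20 alongside a new step) by passing both steps explicitly, at the cost of computing metrics at every listed step.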
