Conversation
oliverwm1
left a comment
There was a problem hiding this comment.
Instead of having the user configure the dataset timestep—which is a property of the dataset—how about having the user configure the number of steps at which the RMSE is computed (which would default to 20, maintaining backwards compatibility)?
| timestep_hours: Timestep of the data in hours, used for determining | ||
| which timestep corresponds to 5 days of evolution. |
There was a problem hiding this comment.
Should we be requiring the user to configure this, when it is really just a property of the dataset? I don't think there is anywhere else in the codebase where this is a user configurable option.
Hmmm yeah that's a good idea, I didn't like it originally because it's a lot more finnicky/error-prone to be configuring integer step counts like this than timestep lengths. But it would resolve the issue of aligning this change with ocean inference without lying. The only thing I'm still not sure about is the name, as I do want to compare 5-step inference against 20-step inference on their respective runs, and we have places where we explicitly look for the "mean_step_20" name in our code. |
|
How about adding a |
If not too expensive, I think it would be nice to keep the |
It is easy enough to make wandb panels comparing two metrics with different names, so I don't think this should be a blocker |
Creating ~50 panels, one for each output, is a bit onerous and means I won’t look at this comparison in practice, except by flipping back and forth for a few key variables. |
What's the use case for this? Aside from the expense, I'd like to avoid doubling the number of wandb keys I need to flip through related to these metrics. Depending on the use case, it also may make more sense to allow |
This PR allows setting timestep_hours on the InferenceEvaluatorAggregatorConfig, which if set to a value other than 6, will modify the step selected for "step_20" metrics.
This does mean "step 20" is a lie in these cases. While not ideal, renaming this metric is also problematic because in coupled modelling we rely on reporting step 20 ocean metrics. This is more problematic than in our daily-step runs (which are still preliminary) having to remember and communicate that "step 20" really means "day 5" (which we generally understand, internally). We should rectify this in a later PR adding more configurability to select the desired lead time.
Changes:
Added timestep_hours to InferenceEvaluatorAggregatorConfig, which if set to a value other than 6, will modify the step selected for "step_20" metrics.
Tests added