Add smaller fixes to make running from files more robust#15
Add smaller fixes to make running from files more robust#15JoachimKoenigslieb wants to merge 3 commits into
Conversation
…nt skip out on forecasts on the borders of the domain near sunset/sunrise
|
Thanks a lot @JoachimKoenigslieb, this is awesome! |
|
In general, I think that support for reading in data from GRIB files would also be nice. Is it correctly understood, that this would require only |
| {month}: Two-digit month (e.g. 03) | ||
| {day}: Two-digit day (e.g. 09) | ||
| {hour}: Two-digit hour (e.g. 12) | ||
| {minute}: Two-digit minute (e.g. 30) |
There was a problem hiding this comment.
Perhaps, we could add second as well here and below, to have full flexibility?
There was a problem hiding this comment.
Sounds good. Will add this! (probably next week tough :)
Regarding if all that is needed is cfgrib to read grib files: Yes, but this dependency is also quite hefty. I think it brings in eccodes itself as far as I understand.
For pvlib, this was the only way (from not much looking admittedly) of getting a field of clearsky values. Other methods in pvlib wanted some position tuples which would probably be quite slow I think.
There was a problem hiding this comment.
Good point with cfgrib - no need to make the suite unnecessarily big!
And fair with the clearsky methods - I was mainly curious
| for lon in (lon_min, lon_max) | ||
| ) | ||
| logger.info(f"Maximum corner solar elevation: {solar_elevation:.2f} degrees") | ||
| if solar_elevation < 1: |
There was a problem hiding this comment.
I wonder, if the value of 1 degree should be increased a bit now that we verify the corner with the maximum elevation. For instance, the MSG-CPP dataset is only available for solar elevation greater than 6 degrees. Is 1 degree useful for your applications @JoachimKoenigslieb ?
There was a problem hiding this comment.
Below 6 degrees for maximum corner solar elevation, running with MSG-CPP input fails with:
Nowcast failed for 2026-06-29T02:00:00Z: precip contains only non-finite values
So if lower values than that are needed, it would need to be modifiable (which may be the most elegant solution) by either an argument or a environment variable.
|
|
||
| from .geospatial import get_coordinates | ||
|
|
||
| PVLIB_CLEARSKY_VARIABLE_NAME = "pvlib_clearsky_ghi" |
There was a problem hiding this comment.
I suggest moving this to config.py and importing from that, both here and in main.py. I just tested that and it seems to work without an issue.
|
The command: |
|
|
||
| fields = [] | ||
| for time in times: | ||
| repeated_times = pd.DatetimeIndex([time] * len(flat_latitudes)) |
There was a problem hiding this comment.
| repeated_times = pd.DatetimeIndex([time] * len(flat_latitudes)) | |
| repeated_times = [time] * len(flat_latitudes) |
There was a problem hiding this comment.
This seems to yield identical results and does not require pandas import into this file. But as pandas is already installed, either option is fine
| times: list[datetime], | ||
| latitudes: np.ndarray, | ||
| longitudes: np.ndarray, | ||
| ) -> xr.Dataset: |
There was a problem hiding this comment.
| ) -> xr.Dataset: | |
| ) -> xr.Dataset: | |
| """Compute a pvlib simplified-Solis clear-sky GHI dataset on a lat/lon grid. | |
| For each timestep in `times`, solar position is computed at every grid | |
| point by repeating the timestamp across all (lat, lon) pairs. The | |
| simplified Solis clear-sky model is then applied to the apparent solar | |
| elevation to produce global horizontal irradiance (GHI) in W m⁻². | |
| Longitudes are normalised to the range [-180, 180] before the solar | |
| position calculation. | |
| Args: | |
| times: Ordered list of timezone-aware datetimes for which to | |
| compute clear-sky irradiance. | |
| latitudes: 1-D array of latitude values in degrees North. | |
| longitudes: 1-D array of longitude values in degrees East | |
| (values > 180 are wrapped to [-180, 180]). | |
| Returns: | |
| xr.Dataset with a single variable keyed by | |
| PVLIB_CLEARSKY_VARIABLE_NAME of shape (time, latitude, longitude) | |
| containing GHI in W m⁻². The time coordinate is stored without | |
| timezone information. | |
| """ |
There was a problem hiding this comment.
A suggestion for a docstring
| ## [Unreleased] | ||
| * Made minutes available for formatting file names, @JoachimKoenigslieb | ||
| - use `.expand_dims` instead of `.assign_coords` to make sure we have both time dimension and time coordinates when loading from files, @JoachimKoenigslieb | ||
| - Added a helper `make_pvlib_clearsky_dataset` which creates clearsky data by calling `pvlib`. Also added a new config variable `clearsky_source` which defaults to `file` and switches behavior when set to `pvlib`, @JoachimKoenigslieb | ||
| - `check_solar_elevation` now does not assume location is in Copenhagen by default, @JoachimKoenigslieb | ||
|
|
||
|
|
||
| ### Added | ||
|
|
||
| - Added a check for the number of ensemble members, as the code currently supports only one [!13](https://github.com/dmidk/sunflow/pull/13), @KristianHMoller | ||
| - Subsetting to bounding box is now also done in the `s3` and `files` code paths [!11](https://github.com/dmidk/sunflow/pull/11), @JoachimKoenigslieb | ||
| - Subsetting to bounding box now correctly handles both ascending and descending lat/lon coordinates [!11](https://github.com/dmidk/sunflow/pull/11), @JoachimKoenigslieb | ||
|
|
||
| ### Changed | ||
|
|
||
| - Pass down number of ensembles from configurations to `ProbabilisticAdvection`. This gives a roughly 3x speedup in the no ensembles case [!12](https://github.com/dmidk/sunflow/pull/12), @JoachimKoenigslieb |
There was a problem hiding this comment.
| ## [Unreleased] | |
| ### Added | |
| - Added a helper `make_pvlib_clearsky_dataset` which creates clearsky data by calling `pvlib`. Also added a new config variable `clearsky_source` which defaults to `file` and switches behavior when set to `pvlib` [!15](https://github.com/dmidk/sunflow/pull/15), @JoachimKoenigslieb | |
| - Made minutes available for formatting file names [!15](https://github.com/dmidk/sunflow/pull/15), @JoachimKoenigslieb | |
| - Added a check for the number of ensemble members, as the code currently supports only one [!13](https://github.com/dmidk/sunflow/pull/13), @KristianHMoller | |
| - Subsetting to bounding box is now also done in the `s3` and `files` code paths [!11](https://github.com/dmidk/sunflow/pull/11), @JoachimKoenigslieb | |
| - Subsetting to bounding box now correctly handles both ascending and descending lat/lon coordinates [!11](https://github.com/dmidk/sunflow/pull/11), @JoachimKoenigslieb | |
| ### Changed | |
| - `check_solar_elevation` now does not assume location is in Copenhagen by default [!15](https://github.com/dmidk/sunflow/pull/15), @JoachimKoenigslieb | |
| - Use `.expand_dims` instead of `.assign_coords` to make sure we have both time dimension and time coordinates when loading from files [!15](https://github.com/dmidk/sunflow/pull/15), @JoachimKoenigslieb | |
| - Pass down number of ensembles from configurations to `ProbabilisticAdvection`. This gives a roughly 3x speedup in the no ensembles case [!12](https://github.com/dmidk/sunflow/pull/12), @JoachimKoenigslieb |
There was a problem hiding this comment.
I am not sure if the "suggestion" works as intended here.
|
All implemented functionality seems to work as intended! This is great! |
AdamRJensen
left a comment
There was a problem hiding this comment.
I left a few comments, but my main concern that I want to raise is that climatological clear sky models as used in this PR are not ideal for forecasting applications given that water vapor varies significantly from day to day. But if this is for testing purposes, then it's fine to add.
Using CAMS clear sky data based on forecasting of aerosols is the way forward in my opinion.
| solar_position = pvlib.solarposition.get_solarposition( | ||
| repeated_times, | ||
| flat_latitudes, | ||
| flat_longitudes, |
There was a problem hiding this comment.
| flat_longitudes, | |
| flat_longitudes, | |
| method='nrel_numpy', |
I suggest explicitly denoting which solar position model is used.
On a different note, the 'nrel_numpy' model is the most accurate but unnecessarily slow. Based on my recent research, solar position models such as Michalsky or USNO are plenty fine: https://solposx.readthedocs.io/. How fast they are is very implementation dependent, and most of these implementations are not developed for a grid of latitudes/longitudes.
| - Added a helper `make_pvlib_clearsky_dataset` which creates clearsky data by calling `pvlib`. Also added a new config variable `clearsky_source` which defaults to `file` and switches behavior when set to `pvlib`, @JoachimKoenigslieb | ||
| - `check_solar_elevation` now does not assume location is in Copenhagen by default, @JoachimKoenigslieb |
There was a problem hiding this comment.
pvlib supports a lot of clear sky models, even remotely retrieving clear sky data from CAMS. I recommend being explicit here about which method is being used.
I wanted to use
sunflowto do nowcasting over the US with data from GOES.I ultimately get the data from this public AWS source here https://registry.opendata.aws/noaa-goes/.
Unfortunately my files are stored converted to grib2, so I'm getting into some quite off-label usage here. Maybe this is not something you guys are interested in supporting up stream here? The specific changes needed to support it are quite general and feels like reasonable changes tough.
Turns out that xarray is happy enough to read my files out of the box if I install
cfgribas a peer-dependency ofsunflow. One big issue remains: I'm not storing the clearsky in these grib2 files! Instead I made a small config switch that allows for fetching the clearksy via the already-installedpvlib.A list of fixes implemented are:
.expand_dimsinstead of.assign_coordsto make sure we have both time dimension AND time coordinates when loadingmake_pvlib_clearsky_datasetwhich makes clearsky by callingpvlib. Also added a new config variableclearsky_sourcewhich defaults tofileand switches behavior when set topvlib.check_solar_elevationnow does not take Copenhagen as a default as that can easily lead to silent buggy behavior.With these changes I'm able to do nowcasts over CONUS:
