Skip to content

[FEAT] New Sanchez-Franks 2021 Reader#137

Merged
isaschmitz merged 7 commits into
AMOCcommunity:mainfrom
isaschmitz:issue128-isa-SanchezFranks
May 21, 2026
Merged

[FEAT] New Sanchez-Franks 2021 Reader#137
isaschmitz merged 7 commits into
AMOCcommunity:mainfrom
isaschmitz:issue128-isa-SanchezFranks

Conversation

@isaschmitz
Copy link
Copy Markdown
Collaborator

@isaschmitz isaschmitz commented May 16, 2026

Description:
The feature includes a reader for the Sanchez-Franks satellite reconstruction from 2021.

New files were added for that which enable reading the data and mapping metadata: sf2021.py and sf2021.yml and the new data file called altimetry_moc_transport_1993_2020_18mos_smoothed.nc was also added.
The reader sf2021.py also handles a special time format case which was present in the dataset. As this was the only data file which had this issue yet, this was handled as a local issue within the reader, with the possibility to generalize in the future when this problim arises again.

Some files were adjusted to account for the new reader like:

  • __init__.py (added sf2021 link)
  • read.py (added sf2021 link)
  • reader_utils.py (added sf2021 link)
  • readers.py (added sf2021 link)
  • standardise.py (added snippet for sf2021)

Also tests were added. A new file was created called test_sf2021.py, which tests the basic reader function much like test_noac47n.py or test_fbc.py and also especially looks at the time decoding done in sf2021.py.

Related Issues / Pull Requests:

The new feature includes a reader for the Sanchez-Franks satellite reconstruction from 2021.

New files were added for that which enable reading the data and mapping metadata: `sf2021.py` and `sf2021.yml` and the new data file called `altimetry_moc_transport_1993_2020_18mos_smoothed.nc`.

Some files were adjusted to account for the new reader like:
- `__init__.py`
- `read.py`
- `reader_utils.py`
- `readers.py`
- `standardise.py`

Also tests were added to adjust for it, one is located in `test_readers.py` and one new file was created called `test_sf2021.py`.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds a new data reader for the Sanchez-Franks et al. 2021 satellite-based reconstruction of the AMOC at 26°N. A new reader module and its YAML metadata are added, the dataset is wired into read/readers/reader_utils/data_sources registries, the generic standardize_time_coordinate helper is broadened to handle SF2021's numeric time (no units attribute), tests are added (test_sf2021.py plus one in test_readers.py), and the demo notebook gets an SF2021 section.

Changes:

  • New sf2021 reader module + YAML metadata, with reader registered in data_sources/__init__.py, read.py, readers.py, and reader_utils.py.
  • standardise.standardize_time_coordinate gains a value-based heuristic to interpret missing-units numeric times as days since 0000-01-01, and a new deprecated standardise_sf2021 wrapper is added.
  • Tests and a demo notebook section for the new reader.

Reviewed changes

Copilot reviewed 10 out of 11 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
amocatlas/data_sources/sf2021.py New reader module for the SF2021 dataset.
amocatlas/data_sources/init.py Exposes read_sf2021.
amocatlas/metadata/sf2021.yml YAML metadata, variable mapping, and file source URL.
amocatlas/read.py Adds sf2021 to public read API and known arrays.
amocatlas/readers.py Adds sf2021 to reader dispatch and load_dataset docs.
amocatlas/reader_utils.py Adds friendly name mapping for sf2021 (and nac).
amocatlas/standardise.py Broadens time-coordinate decoding with a magnitude-based heuristic; adds deprecated standardise_sf2021.
tests/test_sf2021.py New unit tests for the SF2021 reader.
tests/test_readers.py Integration test for SF2021 time decoding via read.sf2021().
notebooks/demo.ipynb Demo cells loading and plotting the SF2021 timeseries.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread amocatlas/standardise.py Outdated
Comment thread amocatlas/standardise.py Outdated
Comment thread amocatlas/data_sources/sf2021.py Outdated
Comment thread amocatlas/metadata/sf2021.yml Outdated
Comment thread amocatlas/standardise.py
Update small things in `sf2021.py` regarding formatting and inside `sf2021.yml` to provide the correct date coverage and changed the function for time standardisation inside `standardise.py` to have a more robust fallback and just handle sf2021 as an exception. Also added tests for the new logic inside `test_standardise.py`.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 11 out of 12 changed files in this pull request and generated 5 comments.

Comments suppressed due to low confidence (2)

amocatlas/standardise.py:705

  • The except clause no longer catches ValueError, which was previously handled. pd.to_datetime(..., errors="coerce") can still raise ValueError for malformed unit strings (e.g. the units parsing path), and np.datetime64("0000-01-01", "D") may raise ValueError in some numpy versions. Dropping ValueError from the caught exceptions is a behavior change that can turn previously-handled cases into unhandled crashes. Consider keeping ValueError in the tuple.
            except (TypeError, OverflowError, pd.errors.OutOfBoundsDatetime) as e:

amocatlas/standardise.py:694

  • The SF2021 day-count branch uses np.datetime64("0000-01-01", "D") as the base. Numpy's datetime64[ns] representable range is approximately 1678–2262, and intermediate values near year 0 (e.g. [1, 2, 3] days as in the new test) cannot be safely cast to datetime64[ns] — the cast either overflows or produces incorrect dates. The test test_standardize_time_coordinate_allows_sf2021_day_counts only asserts the resulting value does not start with "1970-01-01", which would also be true for an overflowed/garbage datetime, so it does not actually validate correctness. Please either document/validate the expected epoch (MATLAB datenum vs. days since 0000-01-01 vs. days since 0001-01-01 — these differ by 1–2 days) and add an assertion on a known concrete date (e.g. that day 727930 maps to 1993-01-17 per the YAML's time_coverage_start), or use a base epoch within the datetime64[ns] range plus an explicit offset.
                elif source_file == "altimetry_moc_transport_1993_2020_18mos_smoothed.nc":
                    # SF2021 uses unlabeled day counts in the raw file.
                    base_dt = np.datetime64("0000-01-01", "D")
                    sf2021_time = base_dt + time_coord.values.astype("timedelta64[D]")
                    time_datetime = sf2021_time.astype("datetime64[ns]")

Comment thread amocatlas/data_sources/sf2021.py
Comment thread amocatlas/standardise.py Outdated
Comment thread amocatlas/read.py Outdated
Comment thread tests/test_readers.py Outdated
Comment thread amocatlas/standardise.py
Updated the time handling to not be part of the `standardise.py` module as it is very specific but rather be handled inside the reader `sf2021.py` itself.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 11 out of 12 changed files in this pull request and generated 4 comments.

Comments suppressed due to low confidence (2)

amocatlas/data_sources/sf2021.py:86

  • The TIME normalization is likely incorrect for fractional-day values: converting via timedelta64[D] will truncate fractions (e.g., 727974.5 becomes 727974), and np.datetime64('0000-01-01','ns') is outside the representable range for datetime64[ns] on many NumPy builds. Consider converting days since 0000-01-01 by first shifting to a safe origin (e.g., compute an integer day offset between 0000-01-01 and 1970-01-01 using datetime64[D]), then use pandas/xarray to add a Timedelta in days (preserving fractions) to a 1970 origin.
        time_values = ds[time_var].values
        time_delta = np.array(time_values, dtype="timedelta64[D]").astype("timedelta64[ns]")
        base_dt = np.datetime64("0000-01-01", "ns")
        time_datetime = (base_dt + time_delta).astype("datetime64[ns]")
        
        # Use assign_coords to properly set dimension coordinate
        ds = ds.assign_coords({time_var: time_datetime})
        ds[time_var].attrs = _TIME_METADATA

amocatlas/data_sources/sf2021.py:102

  • read_sf2021 is annotated/documented as returning only list[xr.Dataset], but when track_added_attrs=True it returns (datasets, added_attrs_per_dataset). Please update the return type annotation and docstring to reflect the dual return shape (matching the pattern used in other readers like read_nac).
) -> list[xr.Dataset]:

Comment thread amocatlas/data_sources/sf2021.py Outdated
Comment thread tests/test_readers.py Outdated
Comment thread docs/source/reports/noac47n_report.rst Outdated
Comment thread tests/test_sf2021.py Outdated
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 10 out of 11 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (2)

amocatlas/data_sources/sf2021.py:82

  • This time conversion is very likely broken: np.datetime64('0000-01-01', 'ns') is out of the representable range for datetime64[ns] (it will raise / produce NaT), and casting float days to timedelta64[D] drops fractional days (e.g. 0.5). Consider converting relative to a representable epoch (e.g. compute days since 1970-01-01 by subtracting the 0000→1970 day offset) and preserve sub-day fractions, or use cftime.num2date and then convert to datetime64[ns] for in-range dates. Also ensure the attrs you set match the fact the coordinate is datetime-like (avoid 'seconds since 1970...' when values are datetime64).
        time_values = ds[time_var].values
        time_delta = np.array(time_values, dtype="timedelta64[D]").astype("timedelta64[ns]")
        base_dt = np.datetime64("0000-01-01", "ns")
        time_datetime = (base_dt + time_delta).astype("datetime64[ns]")
        

amocatlas/data_sources/sf2021.py:184

  • SF2021 time decoding depends on having the raw numeric "days since 0000-01-01" values available. ReaderUtils.safe_load_dataset() uses xr.open_dataset() with default decode_times=True, which can coerce this into cftime/object or otherwise make downstream normalization skip. Consider loading this dataset with decode_times=False (as MOVE does) or explicitly handling cftime/object TIME coordinates in _normalize_sf2021_time_coordinate.
            # Use ReaderUtils for consistent dataset loading

            ds = ReaderUtils.safe_load_dataset(file_path)
            

Comment thread amocatlas/data_sources/sf2021.py
Comment thread tests/test_sf2021.py
update test to capture the change in months in the first two timesteps
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 10 out of 11 changed files in this pull request and generated 3 comments.

Comment thread amocatlas/data_sources/sf2021.py Outdated
Comment thread amocatlas/data_sources/sf2021.py
Comment thread tests/test_sf2021.py Outdated
changed the time handling function to something more robust
@isaschmitz isaschmitz requested a review from Copilot May 20, 2026 13:44
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 10 out of 11 changed files in this pull request and generated 8 comments.

Comment thread amocatlas/data_sources/sf2021.py
Comment thread amocatlas/data_sources/sf2021.py
Comment thread amocatlas/data_sources/sf2021.py
Comment thread amocatlas/data_sources/sf2021.py
Comment thread amocatlas/data_sources/sf2021.py
Comment thread amocatlas/standardise.py
Comment thread tests/test_sf2021.py
Comment thread tests/test_sf2021.py
@isaschmitz isaschmitz merged commit dea9b60 into AMOCcommunity:main May 21, 2026
7 of 8 checks passed
@isaschmitz isaschmitz deleted the issue128-isa-SanchezFranks branch May 21, 2026 08:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add SanchezFranks 2021 reconstruction

2 participants