Note
This is a scripting solution for a proof of concept; an operationally ready
approach will follow. To add code for a specific dataset, please put your
script in the scripts/<yourname> folder.
```
git clone git@github.com:freva-org/grid-doctor.git
cd grid-doctor
python -m pip install -e .
```

For GPU support use:

```
python -m pip install -e .[gpu]
```

For remapping of large grids you should install ESMF through conda-forge:

```
mamba install -c conda-forge -y "esmf=*=mpi_openmpi_*" esmpy
```

```python
from getpass import getuser

import grid_doctor as gd

ds = gd.cached_open_dataset(["path/to/*.nc"])
max_level = gd.resolution_to_healpix_level(gd.get_latlon_resolution(ds))
weights_dir = "/scratch/{user[0]}/{user}/grid-doctor/weights".format(user=getuser())
gd.cached_weights(
    ds,
    level=max_level,
    prefer_offline=True,
    cache_path=weights_dir,
)
pyramid = gd.create_healpix_pyramid(
    ds,
    weights_path=weights_dir,
    max_level=max_level,
)
gd.save_pyramid_to_s3(
    pyramid,
    "s3://my-bucket/dataset.zarr",
    s3_options=gd.get_s3_options(
        "https://s3.eu-dkrz-3.dkrz.cloud",
        "~/.s3-credentials.json",
    ),
)
```

How are our patients doing? Every dataset starts broken and leaves HEALed. If your dataset is still 😢, it needs a doctor — that could be you. Claim a patient, write a script, and turn that frown into 😎.
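An aside on the quickstart above: `resolution_to_healpix_level` presumably picks the finest HEALPix level whose pixel spacing still matches the source grid. A minimal self-contained sketch of that mapping follows; the function name matches grid-doctor's API, but the body and the exact spacing criterion are assumptions, not the library's actual code:

```python
import math

def resolution_to_healpix_level(res_deg: float, max_level: int = 15) -> int:
    """Smallest HEALPix level whose mean pixel spacing is <= res_deg.

    A HEALPix grid at level L has nside = 2**L and 12 * nside**2 equal-area
    pixels, so the mean pixel spacing is roughly sqrt(4*pi / npix) radians.
    """
    for level in range(max_level + 1):
        npix = 12 * (2 ** level) ** 2
        spacing_deg = math.degrees(math.sqrt(4 * math.pi / npix))
        if spacing_deg <= res_deg:
            return level
    return max_level

# A ~0.25 degree grid (ERA5-like) already needs a fairly deep pyramid.
level = resolution_to_healpix_level(0.25)
```

Because each level doubles `nside`, the pixel spacing halves per level, so halving the source resolution costs exactly one extra level.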
| Status | Meaning |
|---|---|
| 😢 | Not started |
| 🩹 | In treatment |
| 😎 | HEALed |
| Dataset | Uploaded to S3 | Script Submitted |
|---|---|---|
| ICON-DREAM | 😎 | 😎 |
| EERIE | 😎 | 😎 |
| ERA5 | 😎 | 😎 |
| CMIP6 | 🩹 | 😎 |
| NextGEMS | 😎 | 😎 |
| ICDC | 😎 | 🩹 |
| ORCHESTRA | 😎 | 😎 |
| PalMod | 😢 | 😢 |
| Dyamond | 😎 | 😎 |
Tip
To claim a dataset, open a PR adding your script to scripts/<dataset>/
and update this table. See Getting Started
for the template.
Create a folder under scripts/ and add your script:
```
mkdir -p scripts/<yourname>
```

A minimal script using the built-in CLI helpers:
```python
import grid_doctor as gd
import grid_doctor.cli as gd_cli
from data_portal_worker.rechunker import ChunkOptimizer  # optional: rechunk large outputs

parser = gd_cli.get_parser("my-dataset", "Convert my-dataset to HEALPix.")
parser.add_argument("--variables", nargs="*", default=["t_2m"])
args = parser.parse_args()
gd_cli.setup_logging_from_args(args)

ds = gd.cached_open_dataset(["path/to/*.nc"])
pyramid = gd.create_healpix_pyramid(ds)
gd.save_pyramid_to_s3(
    pyramid,
    f"s3://{args.s3_bucket}/my-dataset.zarr",
    s3_options=gd.get_s3_options(args.s3_endpoint, args.s3_credentials_file),
)
```

Run with verbosity:
```
python scripts/my-dataset/convert.py my-bucket -vv
```

Important
Please add a descriptive README explaining what your script is trying to achieve, and document any problems you ran into.
Caution
DO NOT commit S3 keys or secrets to this repository. Use environment variables or a credentials file.
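One way to honor this rule is to resolve credentials at runtime, preferring environment variables and falling back to a JSON file outside the repository. The sketch below reuses the `get_s3_options(endpoint, credentials_file)` signature from the examples above, but the lookup logic, environment variable names, and JSON keys are illustrative assumptions, not grid-doctor's actual implementation:

```python
import json
import os
from pathlib import Path

def get_s3_options(endpoint_url, credentials_file=None):
    """Build an fsspec/s3fs-style options dict without hard-coding secrets.

    Environment variables win; otherwise keys are read from a JSON file such
    as {"access_key": "...", "secret_key": "..."} kept outside the repo.
    """
    key = os.environ.get("S3_ACCESS_KEY")
    secret = os.environ.get("S3_SECRET_KEY")
    if (key is None or secret is None) and credentials_file:
        creds = json.loads(Path(credentials_file).expanduser().read_text())
        key = key or creds.get("access_key")
        secret = secret or creds.get("secret_key")
    return {"key": key, "secret": secret,
            "client_kwargs": {"endpoint_url": endpoint_url}}
```

Keeping secrets out of the returned dict's construction site means nothing sensitive ever appears in committed source.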
```
pip install tox
tox -e docs        # build to site/
tox -e docs-serve  # live preview at http://127.0.0.1:8000
```

Run the type checker with:

```
tox -e type-check
```

As this is still very much a work in progress, it is very likely that you will
run into problems. Please note any problems in the README.md file
in your dataset folder. Feel free to submit PRs if there are any issues
with the DatasetAggregator or ChunkOptimizer classes. If you don't feel
comfortable submitting PRs, you can file an issue report
here.
