|
6 | 6 |
|
| 7 | + |
7 | 8 | GenTS (Generate Time Series) is an open-source Python package designed to simplify the post-processing of history files into time series files. The package includes streamlined functions that require minimal input and a documented API for custom workflows. |
8 | 9 |
|
9 | 10 | ## Installation |
10 | 11 |
|
11 | | -GenTS can be installed using `pip`: |
| 12 | +GenTS can be installed in a Python environment using `pip`. This requires either a Conda or Python virtual environment for installing GenTS dependencies (namely `numpy`, `netCDF4`, and `cftime`). |
| 13 | + |
| 14 | +For maximum portability and to avoid environment issues, use the containerized version of GenTS. |
| 15 | + |
| 16 | +### PyPI |
12 | 17 |
|
13 | 18 | ``` |
14 | | -pip install gents['parallel'] |
| 19 | +pip install gents |
15 | 20 | ``` |
16 | 21 |
|
17 | | -Although it is reccomended to use the Dask implementation, you may wish to implement your own parallel solution for large datasets. If you then don't want to include Dask in your installation, you may omit `['parallel']` from the command (this limits GenTS to only run in serial). |
18 | | - |
19 | 22 | To install from source, please view the [ReadTheDocs Documentation](https://gents.readthedocs.io/en/latest/). |
20 | 23 |
|
21 | | -## Example |
| 24 | +### Container |
22 | 25 |
|
23 | | -Barebones starting example: |
| 26 | +Apptainer and Singularity are typically used instead of Docker in HPC environments. These platforms (and most others) can run Docker images directly, but the exact command varies across institutions and systems:
24 | 27 |
|
| 28 | +**For Derecho and Casper (NCAR)**: |
| 29 | +``` |
| 30 | +module load apptainer |
| 31 | +apptainer run --bind /glade/derecho --cleanenv docker://agentoxygen/gents:latest run_gents --help |
| 32 | +``` |
| 33 | + |
| 34 | +**For TACC Systems**: |
| 35 | +``` |
| 36 | +module load apptainer |
| 37 | +apptainer run docker://agentoxygen/gents:latest run_gents --help |
| 38 | +``` |
| 39 | + |
| 40 | +**For Perlmutter (NERSC)**: |
| 41 | +``` |
| 42 | +shifterimg -v pull docker:agentoxygen/gents:latest |
| 43 | +shifter --image=docker:agentoxygen/gents:latest run_gents --help |
25 | 44 | ``` |
26 | | -from gents.hfcollection import HFCollection |
27 | | -from gents.timeseries import TSCollection |
28 | | -from dask.distributed import LocalCluster, Client |
29 | 45 |
|
30 | | -cluster = LocalCluster(n_workers=30, threads_per_worker=1, memory_limit="2GB") |
31 | | -client = cluster.get_client() |
| 46 | +## Running GenTS |
32 | 47 |
|
33 | | -input_head_dir = "... case directory with model output ..." |
34 | | -output_head_dir = "... scratch directory to output time series to ..." |
| 48 | +GenTS comes with a pre-configured CLI that can be run on most CESM model output and E3SM (atm-only) model output by calling `run_gents`. The CLI is built on a robust API which can also be configured in a Python script or Jupyter Notebook for custom cases/workflows. |
35 | 49 |
|
36 | | -hf_collection = HFCollection(input_head_dir) |
37 | | -hf_collection = hf_collection.include_patterns(["*/atm/*", "*/ocn/*", "*.h4.*"]) |
38 | | -hf_collection.pull_metadata() |
| 50 | +### CLI |
39 | 51 |
|
40 | | -ts_collection = TSCollection(hf_collection.include_years(0, 5), output_head_dir) |
41 | | -ts_collection = ts_collection.apply_overwrite("*") |
42 | | -ts_collection.execute() |
| 52 | +To view options for running in the command line: |
| 53 | + |
| 54 | +``` |
| 55 | +run_gents --help |
43 | 56 | ``` |
44 | 57 |
|
45 | | -The serial equivalent (without Dask) is the same, just without the Dask `Client` or `LocalCluster`: |
| 58 | +### API Example |
| 59 | + |
| 60 | +Example `run.py`: |
46 | 61 |
|
47 | 62 | ``` |
48 | 63 | from gents.hfcollection import HFCollection |
49 | 64 | from gents.timeseries import TSCollection |
50 | 65 |
|
51 | | -input_head_dir = "... case directory with model output ..." |
52 | | -output_head_dir = "... scratch directory to output time series to ..." |
53 | 66 |
|
54 | | -hf_collection = HFCollection(input_head_dir) |
55 | | -hf_collection = hf_collection.include_patterns(["*/atm/*", "*/ocn/*", "*.h4.*"]) |
56 | | -hf_collection.pull_metadata() |
| 67 | +if __name__ == "__main__": |
| 68 | + input_head_dir = "... case directory with model output ..." |
| 69 | + output_head_dir = "... scratch directory to output time series to ..." |
| 70 | +
|
| 71 | + hf_collection = HFCollection(input_head_dir, num_processes=64) |
| 72 | + hf_collection = hf_collection.include(["*/atm/*", "*/ocn/*", "*.h4.*"]) |
| 73 | +
|
| 74 | + ts_collection = TSCollection(hf_collection.include_years(0, 5), output_head_dir, num_processes=32) |
| 75 | + ts_collection = ts_collection.apply_overwrite("*") |
| 76 | + ts_collection.execute() |
| 77 | +``` |
| 78 | + |
| 79 | +Then execute the script in a Conda or Python virtual environment with `gents` installed: |
| 80 | + |
| 81 | +``` |
| 82 | +python run.py |
| 83 | +``` |
| 84 | + |
| 85 | +Or run from the container: |
57 | 86 |
|
58 | | -ts_collection = TSCollection(hf_collection.include_years(0, 5), output_head_dir) |
59 | | -ts_collection = ts_collection.apply_overwrite("*") |
60 | | -ts_collection.execute() |
| 87 | +``` |
| 88 | +apptainer run docker://agentoxygen/gents:latest run.py |
61 | 89 | ``` |
62 | 90 |
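The patterns passed to `include` in the API example are shell-style globs. As a rough illustration of how such patterns select history files, the sketch below uses Python's standard `fnmatch` module. This is not GenTS's implementation: the helper names, the example paths, and the assumption that a path is kept when it matches any one of the patterns are all hypothetical.

```python
from fnmatch import fnmatchcase


def matches_any(path, patterns):
    """Return True if the path matches at least one glob pattern."""
    # fnmatchcase is case-sensitive on every platform, and "*" also
    # matches "/" characters, so "*/atm/*" matches at any depth.
    return any(fnmatchcase(path, pattern) for pattern in patterns)


def filter_paths(paths, patterns):
    """Keep only the paths that match one or more patterns, in order."""
    return [path for path in paths if matches_any(path, patterns)]


# Same patterns as in the API example.
patterns = ["*/atm/*", "*/ocn/*", "*.h4.*"]

# Hypothetical history file paths, for illustration only.
example_paths = [
    "case/atm/hist/case.cam.h0.0001-01.nc",
    "case/ocn/hist/case.pop.h.0001-01.nc",
    "case/lnd/hist/case.clm2.h4.0001-01.nc",
    "case/lnd/hist/case.clm2.h0.0001-01.nc",
]

selected = filter_paths(example_paths, patterns)
for path in selected:
    print(path)
```

Under these assumptions, the first three paths are selected (atmosphere, ocean, and the land `h4` stream) and the land `h0` file is skipped. Check the GenTS documentation for the matching rules the library actually applies.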
|
63 | 91 | ## Contributor/Bug Reporting Guidelines |
|