deep4production

Description

deep4production is a command-line framework designed to streamline the end-to-end workflow for deep learning–based climate downscaling in production environments. It provides tools to transform raw climate datasets into AI-ready formats, train deep learning models, and perform inference in a reproducible and scalable way.

The framework relies on the deep4downscaling library, which provides the deep learning architectures and loss functions used during model training.

Workflow Overview

The typical workflow consists of four main steps:

Create AI-ready datasets (d4p-create)

Raw climate datasets (e.g., NetCDF .nc files) are converted into AI-ready datasets stored in the Zarr format. This step standardizes data structures, preprocessing, and chunking to enable efficient deep learning training.
```
d4p-create config.yaml
```
Inspect AI-ready datasets (d4p-inspect)

The generated Zarr datasets can be inspected to verify structure, metadata, variables, and chunking before training.
```
d4p-inspect dataset.zarr
```
Train a deep learning model (d4p-train)

Models from the deep4downscaling framework can be trained using the prepared datasets. Training configurations are defined through YAML configuration files.
```
d4p-train train_config.yaml
```
During training, experiments can be tracked with MLflow, allowing users to store:
- model artifacts
- training metrics
- logs
- configuration files
This makes it possible to compare experiments and select the best-performing model.

Run inference (d4p-predict)

Once a model is selected, predictions can be generated using the trained model and new input data.
```
d4p-predict predict_config.yaml
```

Example YAML configuration files are available in the recipes directory and can be used as templates for your own configurations.

Model Selection and Experiment Tracking

deep4production integrates with MLflow for experiment tracking and model management. By synchronizing training artifacts and logs with MLflow, users can:

- compare different training runs
- inspect metrics and performance
- select the best-performing model
- manage model versions

This enables a reproducible and production-ready machine learning workflow.
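
The selection step can be illustrated without any tracking backend. The following is a minimal sketch in plain Python, independent of the MLflow API: the run records, the `select_best_run` helper, and the metric name `val_loss` are all hypothetical and stand in for whatever metrics your training runs actually log.

```python
def select_best_run(runs, metric="val_loss", lower_is_better=True):
    """Return the run record with the best value of `metric`.

    Runs that did not log the metric are ignored.
    """
    scored = [r for r in runs if metric in r.get("metrics", {})]
    if not scored:
        raise ValueError(f"no run logged the metric {metric!r}")
    pick = min if lower_is_better else max
    return pick(scored, key=lambda r: r["metrics"][metric])


# Hypothetical run records, for illustration only.
runs = [
    {"run_id": "run-a", "metrics": {"val_loss": 0.42}},
    {"run_id": "run-b", "metrics": {"val_loss": 0.37}},
    {"run_id": "run-c", "metrics": {}},  # e.g. a run that stopped before validation
]

best = select_best_run(runs)
print(best["run_id"])  # run-b
```

In practice the records would come from the tracking server rather than a hard-coded list, but the comparison logic stays the same.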

Supported Data Formats

- Input data: NetCDF (.nc)
- AI-ready datasets: Zarr (.zarr)

Zarr stores arrays in chunks and supports efficient parallel I/O, which makes it well suited to large climate datasets and distributed training.
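
To get a feel for what chunked storage means in practice, the sketch below computes how many chunks an array is split into and how large one uncompressed chunk is. It uses only the Python standard library; the array shape, chunk shape, and helper names are hypothetical and not taken from deep4production.

```python
import math


def chunk_grid(shape, chunks):
    """Number of chunks along each dimension (the last chunk may be partial)."""
    if len(shape) != len(chunks):
        raise ValueError("shape and chunks must have the same number of dimensions")
    return tuple(math.ceil(n / c) for n, c in zip(shape, chunks))


def chunk_nbytes(chunks, itemsize):
    """Uncompressed size in bytes of one full chunk."""
    return math.prod(chunks) * itemsize


# Hypothetical example: 3650 daily time steps on a 128 x 128 grid,
# stored as float32 (4 bytes) with 365 time steps per chunk.
shape = (3650, 128, 128)   # (time, lat, lon)
chunks = (365, 128, 128)

grid = chunk_grid(shape, chunks)
print(grid)                      # (10, 1, 1)
print(math.prod(grid))           # 10 chunks in total
print(chunk_nbytes(chunks, 4))   # 23920640 bytes, about 22.8 MiB
```

Each chunk can be read independently, so a data loader that requests a time window only touches the chunks overlapping that window.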

Command-Line Interface

The framework provides four main console commands:

| Command | Purpose |
| --- | --- |
| `d4p-create` | Convert raw NetCDF datasets to AI-ready Zarr format |
| `d4p-inspect` | Inspect generated AI-ready datasets |
| `d4p-train` | Train deep learning models |
| `d4p-predict` | Generate predictions using trained models |

Together, these commands implement a complete pipeline for preparing data, training models, and deploying predictions in operational workflows.
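
The four commands can also be chained from a small driver script. The sketch below is one possible way to do this with the Python standard library; the `run_pipeline` helper is hypothetical and not part of deep4production, and the file names are the placeholder names used in the workflow overview.

```python
import shlex
import subprocess

# Command names and file names follow the workflow overview;
# replace the file names with your own configurations.
PIPELINE = [
    ["d4p-create", "config.yaml"],
    ["d4p-inspect", "dataset.zarr"],
    ["d4p-train", "train_config.yaml"],
    ["d4p-predict", "predict_config.yaml"],
]


def run_pipeline(steps, dry_run=True):
    """Run each step in order and stop at the first failure.

    With dry_run=True the commands are only printed, not executed.
    Returns the list of command lines that were printed or run.
    """
    executed = []
    for cmd in steps:
        line = shlex.join(cmd)
        print(f"$ {line}")
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code,
            # so later steps never run after a failed one.
            subprocess.run(cmd, check=True)
        executed.append(line)
    return executed


lines = run_pipeline(PIPELINE, dry_run=True)
```

Calling `run_pipeline(PIPELINE, dry_run=False)` would execute the commands for real, which requires deep4production to be installed in the active environment.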

Installation

1. Clone the Repository

```
git clone https://github.com/SantanderMetGroup/deep4production/
cd deep4production
```

2. Create a Python environment

It is recommended to install the library inside a clean environment (e.g., conda or venv).

Example using conda:

```
conda create -n deep4production python=3.13
conda activate deep4production
```

3. Install dependencies (Optional but recommended)

A stable set of library versions is provided in requirements.txt. To reproduce the tested environment, install them with:

```
pip install -r requirements.txt
```

4. Install the library

Install the package from the repository:

```
pip install .
```

For development purposes, install the library in editable mode so that changes in the source code are immediately reflected:

```
pip install -e .
```
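
After installation, it can be useful to confirm that the console commands are reachable from the active environment. The following is a minimal sketch using only the standard library; it assumes that installing the package places the four `d4p-*` commands on the `PATH`, and the `missing_commands` helper is hypothetical.

```python
import shutil

# Console commands listed in the Command-Line Interface section.
D4P_COMMANDS = ["d4p-create", "d4p-inspect", "d4p-train", "d4p-predict"]


def missing_commands(commands):
    """Return the commands that cannot be found on the current PATH."""
    return [name for name in commands if shutil.which(name) is None]


missing = missing_commands(D4P_COMMANDS)
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("All deep4production commands are available.")
```

If any command is reported as missing, check that the environment created in step 2 is activated.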

5. Install `deep4downscaling` (Not implemented yet: Skip this step)

deep4production relies on the deep4downscaling framework, which provides code for established deep learning downscaling models and loss functions.

```
pip install git+https://github.com/SantanderMetGroup/deep4downscaling.git
```

6. Enable GPU support (Optional)

If you plan to run deep learning models on GPUs, install the CUDA-enabled version of PyTorch. For example, for CUDA 11.8:

```
pip install torch==2.7.1+cu118 torchvision==0.22.1+cu118 torchaudio==2.7.1+cu118 --index-url https://download.pytorch.org/whl/cu118
```

You can verify that PyTorch detects the GPU with:

```python
import torch
print(torch.cuda.is_available())
```
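
A slightly more informative variant of this check is sketched below. It reports the names of the visible CUDA devices and degrades gracefully when PyTorch is not installed; the `describe_device` helper is hypothetical and only uses standard PyTorch calls (`torch.cuda.is_available`, `torch.cuda.device_count`, `torch.cuda.get_device_name`).

```python
import importlib.util


def describe_device():
    """Return a short, human-readable summary of the available compute device."""
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed"
    import torch  # imported lazily so the check also works without PyTorch

    if not torch.cuda.is_available():
        return "PyTorch is installed, but no CUDA device is visible (CPU only)"
    names = [torch.cuda.get_device_name(i) for i in range(torch.cuda.device_count())]
    return "CUDA devices: " + ", ".join(names)


print(describe_device())
```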

Usage

We provide a set of Jupyter notebooks in the notebooks directory that demonstrate the basic functionality of the deep4production library. These notebooks cover topics such as:

- Production of AI-ready datasets in Zarr format
- Model training and integration with MLflow
- Model inference
- Data visualization

As new features are developed and added to deep4production, additional example notebooks will be included to help you stay up-to-date with the latest capabilities.

Documentation

While deep4production does not currently offer a formal documentation website, all library functions include comprehensive docstrings describing their purpose, parameters, and return values. This ensures that the code is self-explanatory for developers who want to use or extend the library.

For further guidance on how to use deep4production, please refer to:

- The notebooks in the notebooks directory, which provide example workflows.
- The docstrings in the source code, which offer detailed explanations of functions and classes.

Should you have any questions or need clarifications, feel free to open an issue or contribute to improving the documentation.

Contributing

We use two main branches:

- main: stable, release-ready code (default branch when cloning).
- devel: active development / integration branch.

All pull requests must target devel. Direct pushes to main and devel are restricted to the maintainers. Maintainers periodically merge devel into main and create new tagged releases from main.

For collaborators (with write access)

Clone the repository:

```
git clone https://github.com/SantanderMetGroup/deep4production.git
cd deep4production
```

Create a feature/fix branch from devel:

```
git checkout devel
git pull origin devel
git checkout -b feature/my-change
```

Work and keep in sync with devel (optional but recommended):

```
git checkout devel
git pull origin devel
git checkout feature/my-change
git merge devel
```

Push and open a pull request into devel:
```
git push -u origin feature/my-change
```

On GitHub, open a PR with base branch devel (not main).

For external contributors (no write access)

Fork this repository on GitHub to your own account.

Clone your fork:

```
git clone https://github.com/<your-username>/deep4production.git
cd deep4production
```

Add the original repo as upstream and fetch:

```
git remote add upstream https://github.com/SantanderMetGroup/deep4production.git
git fetch upstream
```

Create a feature/fix branch from upstream/devel:

```
git checkout -b feature/my-change upstream/devel
```

Commit your changes and push to your fork:

```
git add ...
git commit -m "Describe your change"
git push -u origin feature/my-change
```

Open a pull request to this repository:
- base repository: SantanderMetGroup/deep4production
- base branch: devel
- head repository: <your-username>/deep4production
- compare branch: feature/my-change
