Skip to content

Update Delft3D-FLOW submit script to slurm (AL8)#1007

Open
hrajagers wants to merge 9 commits into
mainfrom
d3d4/task/DELFT3D-37935_update_submit_script
Open

Update Delft3D-FLOW submit script to slurm (AL8)#1007
hrajagers wants to merge 9 commits into
mainfrom
d3d4/task/DELFT3D-37935_update_submit_script

Conversation

@hrajagers

@hrajagers hrajagers commented Jun 25, 2026

Copy link
Copy Markdown
Member

What was done

Update Delft3D-FLOW submit script to slurm consistent with our Alma 8 cluster.
Replaced the content of submit_dflow2d3d.sh and rd2d3d.sh scripts by that of the _AL8 versions that some internal DIMRsets included. Replaced instead of including additional files since the previous H6 system is no longer available anyway.

Evidence of the work done

  • Video/figures
    <add video/figures if applicable>
  • Clear from the issue description
  • Not applicable

Tests

  • Tests updated
    <add testcase numbers if applicable, Issue number>
  • Not applicable

Documentation

  • Documentation updated
    <add description of changes if applicable, Issue number>
  • Not applicable

Issue link

Closes DELFT3D-37935

@hrajagers hrajagers self-assigned this Jun 25, 2026

@willem4 willem4 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update of the scripts. I have not tested the scripts, as I trust you have. I have some minor suggestions.

#SBATCH --job-name=delft3d4 # Specify a name for the job allocation.
#SBATCH --time=00:15:00 # Set a limit on the total run time of the job allocation.
#SBATCH --partition=4vcpu # Request a specific partition for the resource allocation.
# See: https://publicwiki.deltares.nl/display/Deltareken/Compute+nodes.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to include this in the run script? Or is it accessible for externals?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldnt this (and the other files) receive the _AL8 as described in the text? Then it is clear that it is tailored for Alma8/Slurm

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Due to the number of different running script, can you add a comment line about when to use this sh-file file?
One line such as: "This is the master calling script to run ....."

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the reference to the publicwiki page. It's not accessible outside Deltares. The AL8 slurm cloud isn't accessible either. Both scripts should work fine on other slurm systems ... except when using C-SUMO within MATLAB as that still includes a hardcoded path for the MATLAB installation.

I don't want to keep the _AL8 postfix. There is nothing in the script doesn't contain anything Alma Linux 8 specific. The primary dependencies are (1) module intelmpi/2021.11.0, (2) MATLAB in case of C-SUMO, obviously (3) Delft3D installation, and (4) slurm resource management.

A purpose statement added to the start of both scripts.

debuglevel=-1
runscript_extraopts=
NNODES=1
wavefile=runwithoutwaveonlinebydefault

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
wavefile=runwithoutwaveonlinebydefault
# wavefile is set to "runwithoutwaveonlinebydefault". When adding the option -w, --wavefile <wname> in the run script a different file can be specified and this activates wave online.
wavefile=runwithoutwaveonlinebydefault

fi

if [ $NSLOTS -eq 1 ]; then

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the condition correct here. A single node, multi core also uses mpi right?

numnode=1
queue=normal-e3-c7
runscript_extraopts=
wavefile=runwithoutwaveonlinebydefault

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider above suggestion

echo "-p, --PARTITION <PARTITION>"
echo " PARTITION, default $PARTITION"
echo " see also: https://publicwiki.deltares.nl/display/Deltareken/Compute+nodes"
echo "-t, --TIME_LIMIT <TIME_LIMIT>"

@willem4 willem4 Jun 26, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it useful to add the formatting of the time string?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added.

echo " number of nodes, default $NODES"
echo "-p, --PARTITION <PARTITION>"
echo " PARTITION, default $PARTITION"
echo " see also: https://publicwiki.deltares.nl/display/Deltareken/Compute+nodes"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be included in external scripts?

# Usage:
#
# This script runs Delft3D-FLOW in parallel on Linux
# This script runs Delft3D-FLOW in parallel on Linux H7

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is H7 necessary?

@marcio-albernaz marcio-albernaz left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Bert,
Nice that you worked on it! ;)
I havent tested the scripts myself. But look good. Mainly comments to improve clarity

#SBATCH --job-name=delft3d4 # Specify a name for the job allocation.
#SBATCH --time=00:15:00 # Set a limit on the total run time of the job allocation.
#SBATCH --partition=4vcpu # Request a specific partition for the resource allocation.
# See: https://publicwiki.deltares.nl/display/Deltareken/Compute+nodes.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldnt this (and the other files) receive the _AL8 as described in the text? Then it is clear that it is tailored for Alma8/Slurm

echo "Usage: ${0##*/} [OPTION]..."
echo "This script is called by submit_dflow2d3d.sh."
echo "Usage: sbatch [SLURM OPTIONS]... ${0##*/} [OPTION]..."
echo "Run delft3d4 on H7."

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

H7 or Hal8?

corespernodedefault=1
corespernode=$corespernodedefault
D3D_HOME=
NSLOTS=$SLURM_TASKS_PER_NODE

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where is SLURM_TASKS_PER_NODE defined? Like Willem, I havent test it myself. But I don't see this variable being defined here

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SLURM_TASKS_PER_NODE is set by the slurm process. It should be equal to the TASKS_PER_NODE value in the submit script and passed to slurm by means of the -c (--corespernode) option.

Your comment made me realize that NSLOTS was set equal to SLURM_TASKS_PER_NODE. I have adjusted this to SLURM_NTASKS (or 1 if not set).

export I_MPI_FABRICS=ofi
export FI_PROVIDER=tcp
export I_MPI_OFI_PROVIDER=tcp
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a recent discussion with Adri about OMP_NUM_THREADS and OMP_NUM_THREADS_SWAN. Nowadays, for FM, Swan is controlled by the latter. Which one is correct for D3D-4? Good to double check it and perhaps make it consistent with OMP_NUM_THREADS_SWAN

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, OMP_NUM_THREADS for SWAN is set to OMP_NUM_THREADS_SWAN inside the swan.sh file (which is used to start the swan executable by both Delft3D-WAVE and D-Waves). I've now also set OMP_NUM_THREADS_SWAN ... the OMP_NUM_THREADS variable could be used by Delft3D-FLOW, but isn't used now (afaik).

#SBATCH --job-name=delft3d4 # Specify a name for the job allocation.
#SBATCH --time=00:15:00 # Set a limit on the total run time of the job allocation.
#SBATCH --partition=4vcpu # Request a specific partition for the resource allocation.
# See: https://publicwiki.deltares.nl/display/Deltareken/Compute+nodes.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Due to the number of different running script, can you add a comment line about when to use this sh-file file?
One line such as: "This is the master calling script to run ....."

@hrajagers

Copy link
Copy Markdown
Member Author

Thank you @marcio-albernaz and @willem4 for your comments. This pull request started as quick action for PK to get the submit_dflow2d3d_AL8.sh script for running on "AL8" systematically included in the DIMRset folder (it's now only included in 2.29.14 and 2.30.12). It made sense to remove the old submit_dflow2d3d.sh script in the process since the H6 is no longer available. However, you're asking all kinds of questions that I don't have an immediate answer for. So, let's try to find some time next week to discuss the best way forward. These scripts are primarily intended for internal use (e.g. the name change from H7 to Alma8 suggests that the script can be used on all Alma8 systems, but I expect that it depends more on the slurm scheduler than on the OS version itself) and I know that there is a discussion whether they should be here or somewhere else.

@marcio-albernaz

Copy link
Copy Markdown
Contributor

Hi @hrajagers , I am fine to skip the phylosophical and cosmetic reviews. But I had at least one about the definition of a variable that may actually cause problems. I am fine if you restrict to check the technical points only. And maybe clarify if you actually tested it, which I am almost sure you did :)

@hrajagers

Copy link
Copy Markdown
Member Author

@marcio-albernaz Sorry to disappoint you. I didn't test it ... I just copied the files that have been unchanged for the past two years. So, either nobody is using them, or other users have tested them. Let's discuss next week in person (PK is running some tests now ... using a manual copy of the files).

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If these files are renamed, then we also need to adjust
src\engines_gpl\flow2d3d\packages\flow2d3d\CMakeLists.txt
which refers to these files by full name for copying as part of the "install" step.

hrajagers added 5 commits July 2, 2026 15:19
… the submit script doesn't have to know anything about the C-SUMO keywords ...

updated the bash conditions
updated the C-SUMO checking inside rd2d3d.sh ... the previous [ -n $csumoscript ] check only convers the compiled C-SUMO case
… to copy the dedicated submit scripts anymore.

Skip processing of RTC and WAVE flags in the submit script ... they are processed by the next rd2d3d script.

@marcio-albernaz marcio-albernaz left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @bert, before approving it - can you say if these scripts have been tested?
The previous comments from Willem and I are also still open. You can close if they are not relevant anymore.

@hrajagers hrajagers left a comment

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments processed. I did some very basic testing a while back, but I run a few more simulations using the latest scripts. Furthermore, I want to have a green test bench before approval. Since these scripts are not used in the testbench, all failures should be unrelated to these changes ... but a green test bench also doesn't give any guarantee that these scripts are correct.

#SBATCH --job-name=delft3d4 # Specify a name for the job allocation.
#SBATCH --time=00:15:00 # Set a limit on the total run time of the job allocation.
#SBATCH --partition=4vcpu # Request a specific partition for the resource allocation.
# See: https://publicwiki.deltares.nl/display/Deltareken/Compute+nodes.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the reference to the publicwiki page. It's not accessible outside Deltares. The AL8 slurm cloud isn't accessible either. Both scripts should work fine on other slurm systems ... except when using C-SUMO within MATLAB as that still includes a hardcoded path for the MATLAB installation.

I don't want to keep the _AL8 postfix. There is nothing in the script doesn't contain anything Alma Linux 8 specific. The primary dependencies are (1) module intelmpi/2021.11.0, (2) MATLAB in case of C-SUMO, obviously (3) Delft3D installation, and (4) slurm resource management.

A purpose statement added to the start of both scripts.

corespernodedefault=1
corespernode=$corespernodedefault
D3D_HOME=
NSLOTS=$SLURM_TASKS_PER_NODE

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SLURM_TASKS_PER_NODE is set by the slurm process. It should be equal to the TASKS_PER_NODE value in the submit script and passed to slurm by means of the -c (--corespernode) option.

Your comment made me realize that NSLOTS was set equal to SLURM_TASKS_PER_NODE. I have adjusted this to SLURM_NTASKS (or 1 if not set).

export I_MPI_FABRICS=ofi
export FI_PROVIDER=tcp
export I_MPI_OFI_PROVIDER=tcp
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, OMP_NUM_THREADS for SWAN is set to OMP_NUM_THREADS_SWAN inside the swan.sh file (which is used to start the swan executable by both Delft3D-WAVE and D-Waves). I've now also set OMP_NUM_THREADS_SWAN ... the OMP_NUM_THREADS variable could be used by Delft3D-FLOW, but isn't used now (afaik).

echo "-p, --PARTITION <PARTITION>"
echo " PARTITION, default $PARTITION"
echo " see also: https://publicwiki.deltares.nl/display/Deltareken/Compute+nodes"
echo "-t, --TIME_LIMIT <TIME_LIMIT>"

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added.

@hrajagers hrajagers changed the title Update Delft3D-FLOW submit script to AL8/H7 Update Delft3D-FLOW submit script to slurm (AL8) Jul 3, 2026

@marcio-albernaz marcio-albernaz left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, here testing the scripts is important because it won't be tested in the tesbench (as far as I know).
Happy to approve it and see what happens :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants