Tutorial 16
If you want to run more than a hundred or so calculations, you enter the realm of high-throughput computing. Up to that number the queue can handle your submissions directly, but once you reach 200, 300, or even thousands of calculations you have to be more considerate of the system and of other users.
Here is an example of a submitter script. Assuming you have prepared all the necessary files for the calculations in separate folders beneath your current directory (see tutorial 14), you can save the following code to a file and run it in a tmux session (see tutorial 9).
#!/bin/bash
for f in */; do
    [ -d "$f" ] || continue   # Skip if not a directory
    cd "$f" || continue       # Change to directory, skip if it fails
    # VASP prints "Voluntary context switches" at the end of OUTCAR when it exits normally
    if grep -sq "Voluntary" OUTCAR; then
        echo "$f completed successfully"
    else
        echo "$f not completed. Checking queue"
        # Count only your own jobs; -h drops the squeue header line
        if (( $(squeue -h -u "$USER" | wc -l) < 20 )); then
            echo "There are fewer than 20 jobs in the queue. Submitting..."
            sbatch job.script
            sleep 60
        else
            echo "Queue is full, waiting to submit"
            sleep 180
            echo "Slept, retrying submission..."
            sbatch job.script
        fi
    fi
    cd - > /dev/null   # Suppress output of 'cd -'
done
The code checks whether the VASP calculation in each folder has finished; if so, it reports this and moves on. If not, it counts your jobs in the queue. With fewer than 20 jobs queued, it submits the calculation and sleeps for 60 seconds. With 20 or more, it sleeps for 180 seconds first and then submits the current job anyway before moving on to the next folder. This is a fairly basic routine in which the code speeds up or slows down submission based on how busy the queue is. Depending on how long your average calculation takes, you may want to increase or decrease the sleep durations, but the principle is the same.
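The script above submits the job after the 180-second sleep even if the queue is still full. If you would rather hold back until there is room, one option is a small polling helper. The following is a minimal sketch, not part of the original script: the names MAX_JOBS, POLL_SECONDS, count_my_jobs, and wait_for_slot are made up for illustration, and it assumes SLURM's squeue is on your PATH.

```shell
#!/bin/bash
# Sketch: block until you have fewer than MAX_JOBS jobs queued.
# MAX_JOBS, POLL_SECONDS and both function names are hypothetical.

MAX_JOBS=${MAX_JOBS:-20}          # submit only while below this many jobs
POLL_SECONDS=${POLL_SECONDS:-180} # pause between queue checks

# Count your own jobs: -h drops the header line, -u restricts to one user.
count_my_jobs() {
    squeue -h -u "$(id -un)" | wc -l
}

# Re-check the queue until a slot is free, then return.
wait_for_slot() {
    while (( $(count_my_jobs) >= MAX_JOBS )); do
        echo "Queue is full, waiting ${POLL_SECONDS}s"
        sleep "$POLL_SECONDS"
    done
}
```

Calling wait_for_slot right before sbatch job.script in the loop would replace the two fixed sleeps with a single check that only proceeds once a slot is free.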
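Here is an example of a script that checks whether calculations are done and, if so, moves them into a done folder. Calculations that stopped because they reached the NSW step limit are resubmitted from CONTCAR if their energy changes are still shrinking.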
#!/bin/bash
mkdir -p done                # Ensure "done" directory exists
JOB_SCRIPT="job_script.sh"   # File name inside each folder; adjust if yours differs
window_size=3                # Moving average window size; adjust as needed

# Print the moving averages of the values in the named array.
# Uses a nameref (local -n), which needs bash 4.3 or newer.
compute_moving_avg() {
    local -n arr=$1
    local win_size=$2
    local num_values=${#arr[@]}
    local moving_avg=() i j sum avg
    for ((i = 0; i <= num_values - win_size; i++)); do
        sum=0
        for ((j = 0; j < win_size; j++)); do
            sum=$(echo "$sum + ${arr[i+j]}" | bc -l)
        done
        avg=$(echo "$sum / $win_size" | bc -l)
        moving_avg+=("$avg")
    done
    echo "${moving_avg[@]}"
}

for f in */; do
    [ -d "$f" ] || continue             # Ensure it's a directory
    [[ "$f" == "done/" ]] && continue   # Skip the destination folder itself
    OUTCAR="${f}OUTCAR"
    OSZICAR="${f}OSZICAR"
    INCAR="${f}INCAR"
    if [[ ! -f "$OUTCAR" || ! -f "$OSZICAR" || ! -f "$INCAR" ]]; then
        echo "Skipping $f: Missing OUTCAR, OSZICAR, or INCAR"
        continue
    fi
    # Extract NSW from INCAR; "+ 0" strips whitespace and any trailing comment
    NSW=$(grep -m1 -E '^\s*NSW\s*=' "$INCAR" | awk -F '=' '{print $2 + 0}')
    if [[ -z "$NSW" ]]; then
        echo "Skipping $f: Could not determine NSW from INCAR"
        continue
    fi
    if grep -q "Voluntary" "$OUTCAR"; then
        # No ionic step numbered NSW in OSZICAR: the run stopped before the step limit
        if ! grep -qE "^\s*${NSW}\s+F=" "$OSZICAR"; then
            echo "$f completed successfully"
            mv "$f" done/
        else
            # Extract |d E| for each ionic step as plain decimals.
            # bc cannot parse E notation, and the sign is dropped so that a
            # shrinking energy change counts as converging.
            dE_values=($(awk '/^ *[0-9]+ +F=/ {sub(/.*d E *= */, ""); v = $1 + 0; if (v < 0) v = -v; printf "%.10f\n", v}' "$OSZICAR"))
            if [[ ${#dE_values[@]} -eq 0 ]]; then
                echo "Skipping $f: No valid d E values found in OSZICAR"
                continue
            fi
            # Compute moving averages
            moving_avg_values=($(compute_moving_avg dE_values $window_size))
            # Check that the moving average never increases
            converging=true
            for ((i = 1; i < ${#moving_avg_values[@]}; i++)); do
                if (( $(echo "${moving_avg_values[i]} > ${moving_avg_values[i-1]}" | bc) )); then
                    converging=false
                    break
                fi
            done
            if $converging; then
                echo "$f reached NSW but is converging (based on moving average) - restarting job"
                # Move into the directory, restart the job, then return to original location
                if cd "$f"; then
                    if [[ -f "CONTCAR" ]]; then
                        cp CONTCAR POSCAR
                        echo "Copied CONTCAR to POSCAR"
                    else
                        echo "Warning: CONTCAR not found in $f"
                    fi
                    if command -v sbatch &>/dev/null; then
                        if [[ -f "$JOB_SCRIPT" ]]; then
                            sbatch "$JOB_SCRIPT"
                            echo "Resubmitted job in $f"
                        else
                            echo "Warning: Job script not found in $f, skipping sbatch"
                        fi
                    else
                        echo "Error: SLURM is not available. Job not submitted."
                    fi
                    cd - > /dev/null   # Return to the original directory
                else
                    echo "Failed to enter $f"
                    continue
                fi
            else
                echo "$f reached NSW and is NOT converging"
            fi
        fi
    fi
done
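The convergence test in this script can also be done in a single awk pass, which avoids bc and reads the scientific notation in OSZICAR directly. The following is a minimal sketch under stated assumptions, not a drop-in replacement: it assumes each ionic step line in OSZICAR has the form "N F= ... d E =value", and the function name is_converging and its arguments are hypothetical. It works on the absolute value of d E and returns 0 if the moving average never increases, 1 if it does, and 2 if there are fewer steps than the window size.

```shell
#!/bin/bash
# Sketch: test whether the moving average of |d E| is non-increasing.
# Usage (hypothetical): is_converging path/to/OSZICAR [window]
is_converging() {
    local file=$1 window=${2:-3}
    awk -v w="$window" '
        /^ *[0-9]+ +F=/ {
            sub(/.*d E *= */, "")   # keep only what follows "d E ="
            v = $1 + 0              # awk parses values such as -.123E+02
            if (v < 0) v = -v       # compare magnitudes, not signed values
            x[++n] = v
        }
        END {
            if (n < w) exit 2       # not enough ionic steps for one window
            for (i = w; i <= n; i++) {
                s = 0
                for (j = i - w + 1; j <= i; j++) s += x[j]
                avg = s / w
                if (i > w && avg > prev) exit 1   # average went up
                prev = avg
            }
            exit 0
        }' "$file"
}
```

For example, is_converging calc1/OSZICAR 3 && echo "still converging" would print the message only when the three-step moving average of |d E| keeps shrinking; calc1 is a placeholder folder name.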