Tomas Chor edited this page Sep 27, 2022 · 4 revisions

Welcome to the Zaratan wiki!

The starting point for the Zaratan docs is https://hpcc.umd.edu/. The upper red panel there has many useful links. This page will be better organized later, but for now, here are some useful bits and pieces of knowledge.

Storage space and where to run stuff

  • The home directory is for basic login files, etc., and isn't meant for running code. It has a soft quota of 10 GB and a hard limit of 20 GB, which isn't much.
  • ~/scratch is a symbolic link to /scratch/zt1/project/wenegrat-prj/user/youruser/, which is on the high-performance file system. This is where you should run jobs from. The quota for scratch is shared among all users of the /scratch/zt1/project/wenegrat-prj/ project, and we can use scratch_quota to check the usage. (The result I'm getting is around 100 GB, which doesn't seem like a lot of space...)
  • ~/SHELL is a staging or medium-term storage area with a 1 TB quota. Apparently it isn't available on compute nodes, so we cannot use it for job I/O.
  • We can check the allocation for the group using sbalance. At the time of writing it shows we have 50 kSU (but I'm not sure how much 1 kSU is...)
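
The checks described above can be collected into one small script. This is only a minimal sketch, assuming bash on a login node: scratch_quota and sbalance are the commands named above, while the run_if_available helper is hypothetical and exists only so the script still runs on machines where those commands are not installed.

```shell
#!/bin/bash
# Hypothetical helper: run a command if it exists, otherwise say so.
run_if_available() {
    if command -v "$1" >/dev/null 2>&1; then
        "$@"
    else
        echo "$1: not available on this machine"
    fi
}

# Show where the ~/scratch symbolic link actually points.
if [ -L "$HOME/scratch" ]; then
    readlink -f "$HOME/scratch"
fi

# Scratch usage for the project, then the group's allocation.
run_if_available scratch_quota
run_if_available sbalance
```

Running it once before a large job is a cheap way to notice that the shared scratch quota is nearly full.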

Submitting jobs

Zaratan uses the Slurm scheduler for jobs (which is different from the NCAR HPC, which uses PBS; a script to convert from one to the other can be found here). Some examples of scripts can be found here.

For example, to request a single A100 GPU for 15 minutes you can use

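#!/bin/tcsh
# Wall-time limit of 15 minutes (MM:SS)
#SBATCH -t 15:00
# One node
#SBATCH -N 1
# Run in the gpu partition
#SBATCH -p gpu
# Request one GPU
#SBATCH --gres=gpu

# Placeholder path: change to your own run directory (e.g. under ~/scratch)
cd /lustre/payerle
./run_my_gpu_code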

You can learn more about requesting specific resources here and about GPUs here.
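
To make the submit-and-check cycle concrete, here is a hedged sketch in bash. It writes a bash version of the GPU job script above and submits it with sbatch if Slurm is present. The write_gpu_job function, the gpu_job.sh file name, and the RUN_DIR variable are hypothetical names chosen for illustration; ./run_my_gpu_code remains a placeholder for your own executable. It is meant to be run from your directory under ~/scratch.

```shell
#!/bin/bash
# Hypothetical helper: write the GPU job script into the given directory
# and print its path. ./run_my_gpu_code is a placeholder executable.
write_gpu_job() {
    local dir="$1"
    cat > "$dir/gpu_job.sh" <<'EOF'
#!/bin/bash
#SBATCH -t 15:00
#SBATCH -N 1
#SBATCH -p gpu
#SBATCH --gres=gpu

cd "$SLURM_SUBMIT_DIR"
./run_my_gpu_code
EOF
    echo "$dir/gpu_job.sh"
}

# RUN_DIR is an assumed variable; it defaults to the current directory.
job_script=$(write_gpu_job "${RUN_DIR:-$PWD}")

if command -v sbatch >/dev/null 2>&1; then
    # --parsable makes sbatch print the job ID instead of a full sentence
    job_id=$(sbatch --parsable "$job_script")
    squeue -j "$job_id"      # check the job's state in the queue
    # scancel "$job_id"      # uncomment to cancel the job
else
    echo "sbatch not found; wrote $job_script but did not submit it"
fi
```

SLURM_SUBMIT_DIR is set by Slurm to the directory sbatch was called from, so the job runs wherever the script was submitted.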

Modules

You can search for available modules with module avail. For example, module avail julia produces:

login-1:scratch$ module avail julia
---------------------------------------------------------------------------- /software/modulefiles/modules/rhel8 -----------------------------------------------------------------------------
julia/1.6.3/gcc/(default)        julia/1.6.3/gcc/9.4.0/zen2(default)  julia/gcc/9.4.0/(default)       julia/gcc/9.4.0/zen2/1.6.3  
julia/1.6.3/gcc/9.4.0/(default)  julia/gcc/(default)                  julia/gcc/9.4.0/zen2/(default)  

Modules are loaded with module load. It is usually recommended to run module purge beforehand to get rid of potentially undesired modules that were previously loaded. So to load Julia and CUDA one can issue:

login-1:scratch$ module purge
login-1:scratch$ module load julia cuda
Loading julia/gcc/9.4.0/zen2/1.6.3
  Loading requirement: gcc/9.4.0
login-1:scratch$ module list
Currently Loaded Modulefiles:
 1) gcc/9.4.0(default)   2) julia/gcc/9.4.0/zen2/1.6.3   3) cuda/gcc/9.4.0/zen2/11.6.2
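
For reproducible runs it can help to load the exact builds rather than the defaults. The following is a minimal sketch that pins the full module names shown in the module list output above. The MODULES array is a hypothetical name, and the sketch assumes the module command is defined in the shell that runs the script, which may not be the case in every non-interactive shell.

```shell
#!/bin/bash
# Full module names copied from the `module list` output above.
MODULES=(
    "julia/gcc/9.4.0/zen2/1.6.3"
    "cuda/gcc/9.4.0/zen2/11.6.2"
)

if command -v module >/dev/null 2>&1; then
    module purge                 # start from a clean environment
    module load "${MODULES[@]}"  # load the pinned builds
    module list                  # confirm what ended up loaded
else
    echo "module command not found; would load: ${MODULES[*]}"
fi
```

The same lines can go near the top of a job script so the job runs with the same environment every time, even if the cluster defaults change later.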