
Commit 23f4f08

AgentKumara and Vicky Fan authored

Vicky_fixes (#32)

* edit partitions

* update partitions

---------

Co-authored-by: Vicky Fan <vicky.fan@nesi.org.nz>
1 parent d372c20 commit 23f4f08

1 file changed: 18 additions & 40 deletions


docs/Using_eRI/Running_Jobs/SLURM_Partitions.md

@@ -9,18 +9,6 @@ zendesk_article_id: 360000204076
 zendesk_section_id: 360000030876
 ---
 
-## General Limits
-
-- No individual job can request more than 20,000 CPU hours. This has
-the consequence that a job can request more CPUs if it is shorter
-(short-and-wide vs long-and-skinny).
-- No user can have more than 1,000 jobs in the queue at a time.
-
-These limits are defaults and can be altered on a per-account basis if
-there is a good reason. For example we will increase the limit on queued
-jobs for those who need to submit large numbers of jobs, provided that
-they undertake to do so with job arrays.
-
 ## Partitions
 
 A partition can be specified via the appropriate [sbatch option](../../Getting_Started/Cheat_Sheets/Slurm-Reference_Sheet.md),
@@ -40,7 +28,7 @@ partition then you may receive a warning, please do not ignore this.
 E.g.:
 
 ```out
-sbatch: `hugemem` is not the most appropriate partition for this job, which would otherwise default to `large`. If you believe this is incorrect then contact support and quote the Job ID number.
+sbatch: `hugemem` is not the most appropriate partition for this job, which would otherwise default to `compute`. If you believe this is incorrect then contact support and quote the Job ID number.
 ```
 
 <table><tbody>
@@ -50,7 +38,6 @@ sbatch: `hugemem` is not the most appropriate partition for this job, which woul
 <th>Nodes</th>
 <th>CPUs/Node</th>
 <th>Available Mem/CPU</th>
-<th>Available Mem/Node</th>
 <th>Max CPUs/job</th>
 <th>Description</th>
 </tr>
@@ -59,51 +46,44 @@ sbatch: `hugemem` is not the most appropriate partition for this job, which woul
 <td>14 days</td>
 <td>6</td>
 <td>256</td>
-<td>? MB</td>
+<td>3.7 GB</td>
 <td>950 GB</td>
-<td>?</td>
 <td>Default partition.</td>
 </tr>
 <tr>
 <td>gpu</td>
 <td>14 days</td>
 <td>1</td>
 <td>96</td>
-<td>? MB</td>
+<td>4.8 GB</td>
 <td>470 GB</td>
-<td>?</td>
-<td></td>
+<td>A100.</td>
 </tr>
 <tr>
 <td>hugemem</td>
 <td>14 days</td>
 <td>2</td>
 <td>256</td>
-<td>-</td>
+<td>14.9 GB</td>
 <td>3800 GB</td>
-<td>-</td>
 <td>Very large amounts of memory.</td>
 </tr>
 <tr>
 <td>interactive</td>
 <td>60 days</td>
 <td>3<br/></td>
 <td>8</td>
-
-<td>-</td>
-
-<td>14 GB</td>
-<td>?</td>
-<td></td>
+<td>1.8 GB</td>
+<td>14.8 GB</td>
+<td>Partition for interactive jobs.</td>
 </tr>
 <tr>
 <td>vgpu</td>
 <td>60 days</td>
 <td>4</td>
 <td>32</td>
-<td>-</td>
+<td>13 GB</td>
 <td>418 GB</td>
-<td>-</td>
 <td>Virtual GPUs.</td>
 </tr>
 </tbody>
@@ -118,30 +98,28 @@ its project. There are other QoSs which you can select with the
 
 ### Interactive
 
-Specifying `--qos=interactive` will give the job very high priority, but
-is subject to some limits: up to 4 jobs, 16 hours duration, 4 CPUs, 128
-GB, and 1 GPU.
+Specifying `--qos=interactive` will give a very high priority interactive job.
 
 ## Requesting GPUs
 
 | | |
 |------------------------|------------------------------------------------------------------------------------------------------------------------------------------------|
 | **GPU code** | **GPU type** |
-
 | A100 (`gpu` partition) | NVIDIA Tesla A100 PCIe 40GB cards |
 | vgpu | NVIDIA A10 GPGPU, PCIe 24GB cards |
 
-The default GPU type is P100, of which you can request 1 or 2 per node. The vgpu partition contains four virtualised compute nodes, each with a single NVIDIA A10 GPGPU, PCIe 24GB cards.
+The default GPU type is A100. The vgpu partition contains four virtualised compute nodes, each with a single NVIDIA A10 GPGPU, PCIe 24GB cards.
+
+To request for the A100 GPU:
 
 ``` sl
-#SBATCH --gpus-per-node=1 # or equivalently, P100:1
+#SBATCH --partition gpu
+#SBATCH --gpus-per-node 1 # GPU resources required per node
 ```
 
-To request A100 GPUs, use instead:
+To request for vGPUs, use instead:
 
 ``` sl
-#SBATCH --gpus-per-node=A100:1
+#SBATCH --partition vgpu
+#SBATCH --gpus-per-node 1
 ```
-
-See [GPU use on NeSI](../../Scientific_Computing/Running_Jobs_on_Maui_and_Mahuika/GPU_use_on_NeSI.md)
-for more details about Slurm and CUDA settings.

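As context for the partition documentation changed above, a complete job script that selects a partition via the sbatch option might look like the minimal sketch below. The job name, executable, and resource values are illustrative placeholders, not part of the documented configuration; `compute` is the default partition listed in the updated table.

``` sl
#!/bin/bash -e
#SBATCH --job-name      partition-demo   # hypothetical job name
#SBATCH --partition     compute          # default partition from the table above
#SBATCH --time          01:00:00         # adjust to the expected run time
#SBATCH --cpus-per-task 4
#SBATCH --mem-per-cpu   3G               # stays under the 3.7 GB/CPU listed for compute

srun my_program                          # placeholder executable
```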
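Similarly, the `--qos=interactive` flag described in the updated QoS section can be paired with the `interactive` partition. One possible invocation is sketched below; the session length and CPU count are arbitrary examples.

``` sl
# Start an interactive shell on the interactive partition with elevated priority.
srun --partition=interactive --qos=interactive --cpus-per-task=2 --time=02:00:00 --pty bash
```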