WSI directory
Batch process
Failed job: CUDA OOM
Failed job: CUDA OOM
Apptainer Image: sayatmimar_compreps_multic-hpg_1.sif
Model
This is a batch process to perform MC segmentation on the 40 virtual slides in the directory. After starting the batch process, I noticed 7 jobs running, even though each is a GPU job submitted with the arguments "--partition=gpu --gres=gres:gpu:a100:1 --cpus-per-task=8". Two jobs failed immediately with CUDA OOM errors; the rest appear to be running. However, on closer inspection, only 3 of the 7 "running" jobs are actually performing segmentation (i.e., using a GPU); the other 4 are waiting for GPU availability. This makes sense, since the pinaki.sarder-dsa group has 3 GPUs available. Also, single-GPU jobs seem faster (~5 min per slide) than what I used to observe in Pubcontainers, though this needs better quantification.
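The per-slide submission described above could be sketched as follows. This is a hypothetical reconstruction, not the actual batch script: the slide directory, the `mcseg_` job-name prefix, and the `apptainer run` invocation of the image are all assumptions, and the standard SLURM form `--gres=gpu:a100:1` is used rather than the `--gres=gres:gpu:a100:1` string observed in the job arguments.

```python
from pathlib import Path

# Resource requests matching the observed job arguments (standard SLURM syntax).
SLURM_ARGS = ["--partition=gpu", "--gres=gpu:a100:1", "--cpus-per-task=8"]
IMAGE = "sayatmimar_compreps_multic-hpg_1.sif"

def build_sbatch_command(slide: Path) -> str:
    """Compose an sbatch command that runs MC segmentation on one WSI
    inside the Apptainer image (--nv exposes the host GPU)."""
    wrap = f"apptainer run --nv {IMAGE} {slide}"
    return " ".join(
        ["sbatch", *SLURM_ARGS, f"--job-name=mcseg_{slide.stem}", f'--wrap="{wrap}"']
    )

# One job per virtual slide in the WSI directory (path is an assumption).
wsi_dir = Path("/data/wsi")
commands = [build_sbatch_command(s) for s in sorted(wsi_dir.glob("*.svs"))]
for cmd in commands:
    print(cmd)  # dry run: print instead of submitting
```

Printing the commands first (a dry run) makes it easy to verify the resource flags before submitting 40 jobs; SLURM then queues the jobs beyond the 3 available GPUs, which matches the 4 jobs observed waiting for GPU availability.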