Hello,
I am in a bit of a rabbit hole:
DCGM_FI_DEV_GPU_UTIL is not supported for MIG devices, see: DCGM_FI_DEV_GPU_UTIL with MIG devices DCGM#80 (comment)
DCGM_FI_PROF_SM_OCCUPANCY could be a substitute, but it is disabled by default in /etc/dcgm-exporter/dcp-metrics-included.csv, as shown by: kubectl exec -it nvidia-dcgm-exporter-rh46x -- cat /etc/dcgm-exporter/dcp-metrics-included.csv | less
To enable DCGM_FI_PROF_*, I found this issue, but the piece of documentation it refers to is gone: GPU-operator doesn't allow to specify a volume to mount metrics file for nvidia-dcgm-exporter #275 (comment)
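To make the "disabled by default" check repeatable, here is a minimal Python sketch that reports whether a field is active in a metrics CSV. It assumes that each metric line has the form `field name, Prometheus type, help text` and that disabled metrics are lines commented out with `#`. The `metric_status` helper and the `SAMPLE` text are hypothetical illustrations, not the real file from the exporter image.

```python
def metric_status(csv_text: str, field: str) -> str:
    """Return 'enabled', 'commented out', or 'absent' for a field name.

    Assumption: one metric per line, first CSV column is the field name,
    and a leading '#' means the line is disabled.
    """
    status = "absent"
    for raw in csv_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        commented = line.startswith("#")
        # Drop leading '#' characters, then take the first CSV column.
        name = line.lstrip("#").strip().split(",")[0].strip()
        if name != field:
            continue
        if not commented:
            return "enabled"
        status = "commented out"
    return status


# Hypothetical sample content, NOT the real dcp-metrics-included.csv.
SAMPLE = """\
# Format: field name, Prometheus type, help text
DCGM_FI_DEV_GPU_UTIL, gauge, GPU utilization (in %).
# DCGM_FI_PROF_SM_OCCUPANCY, gauge, SM occupancy.
"""

if __name__ == "__main__":
    for field in ("DCGM_FI_DEV_GPU_UTIL", "DCGM_FI_PROF_SM_OCCUPANCY"):
        print(field, "->", metric_status(SAMPLE, field))
```

To use it on a real pod, the output of the `kubectl exec ... cat` command above could be saved to a local file and its contents passed in as `csv_text`.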
Has anybody managed to monitor memory utilization of MIG devices?
Has anybody managed to configure custom metrics for dcgm-exporter?
Thank you.