VEP 254: Guest GPU Metrics via VSOCK #254

@machadovilaca

Primary contact (assignee):

/assign @machadovilaca

Current Feature Stage: New

Feature Gate: GPUMetrics

Responsible SIGs:

Primary SIG:
/sig observability

Additional SIGs (optional):
/sig compute

Enhancement link:

Timeline:

Additional context:

Collect GPU metrics (utilization, memory, temperature, power, ECC errors) from inside guest VMs via a virtio-serial channel and expose them as kubevirt_vmi_gpu_* Prometheus metrics from virt-handler. A lightweight guest agent (gpu-metrics-agent) uses NVML to collect metrics and communicates with the host over a newline-delimited JSON protocol. Supports both Linux and Windows guests. See the VEP for full design details.
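As a rough illustration of the host side of a newline-delimited JSON protocol, the sketch below decodes one JSON object per line and renders each sample as Prometheus-style text lines. It is not the VEP's implementation: the message field names (`uuid`, `utilization_percent`, `memory_used_bytes`, `temperature_celsius`), the label names, and the metric suffixes after the `kubevirt_vmi_gpu_` prefix are all assumptions made for the example, and the transport is reduced to a plain string.

```python
import json
from dataclasses import dataclass


@dataclass
class GpuSample:
    # Hypothetical field names; the VEP defines the real message schema.
    uuid: str
    utilization_percent: float
    memory_used_bytes: int
    temperature_celsius: float


def parse_ndjson(stream: str) -> list[GpuSample]:
    """Decode one JSON object per line, skipping blank or malformed lines."""
    samples = []
    for line in stream.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            msg = json.loads(line)
            samples.append(GpuSample(
                uuid=str(msg["uuid"]),
                utilization_percent=float(msg["utilization_percent"]),
                memory_used_bytes=int(msg["memory_used_bytes"]),
                temperature_celsius=float(msg["temperature_celsius"]),
            ))
        except (ValueError, KeyError, TypeError):
            # A bad line from the guest should not stop the whole scrape.
            continue
    return samples


def to_prometheus(samples: list[GpuSample], vmi_name: str) -> list[str]:
    """Render samples as text lines; metric suffixes here are assumed."""
    lines = []
    for s in samples:
        labels = f'name="{vmi_name}",gpu_uuid="{s.uuid}"'
        lines.append(
            f"kubevirt_vmi_gpu_utilization_percent{{{labels}}} {s.utilization_percent}"
        )
        lines.append(
            f"kubevirt_vmi_gpu_memory_used_bytes{{{labels}}} {s.memory_used_bytes}"
        )
        lines.append(
            f"kubevirt_vmi_gpu_temperature_celsius{{{labels}}} {s.temperature_celsius}"
        )
    return lines
```

Skipping malformed lines rather than raising is one possible design choice: the guest is untrusted input from the host's point of view, so a single corrupt message should not discard the remaining samples.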

Important

Please keep this description up to date.

Projects

Status

Removed from Milestone
