Add vsock support for serving metrics to hypervisors #649

Closed
machadovilaca wants to merge 1 commit into NVIDIA:main from machadovilaca:add-vsock-support-for-serving-metrics-to-hypervisor

Conversation

@machadovilaca

Add an optional AF_VSOCK listener that serves the same metrics endpoint over virtio-vsock, enabling guest-to-host metrics collection without network dependencies.

In GPU passthrough and vGPU use cases on KubeVirt (and other hypervisors), the guest VM is often not configured with a network path to the host, yet the host still needs GPU telemetry from the dcgm-exporter running inside the guest.

Signed-off-by: machadovilaca <machadovilaca@gmail.com>
@machadovilaca
Author

machadovilaca commented Apr 1, 2026

/cc @glowkey @nccurry can you assign reviewers 🙏

@nccurry
Collaborator

nccurry commented Apr 7, 2026

Can you rebase this on the latest main?

Comment thread pkg/cmd/app.go
}

if c.Uint(CLIVSOCKPort) > math.MaxUint32 {
return nil, fmt.Errorf("vsock port must be in range 1-65535")
Collaborator

Is this the right error message for Uint32?
Maybe something like return nil, fmt.Errorf("vsock port must fit in 32 bits")

@nccurry
Collaborator

nccurry commented Apr 8, 2026

Can you look into using a socat sidecar, or one of the other methods for proxying to vsock, for this?
e.g.

  containers:
    - name: dcgm-exporter
      image: nvcr.io/nvidia/k8s/dcgm-exporter:...
      ports:
        - containerPort: 9400
    - name: vsock-proxy
      image: alpine/socat
      command: ["socat", "VSOCK-LISTEN:9400,reuseaddr,fork", "TCP:localhost:9400"]

DCGM Exporter is really intended to channel DCGM input into Prometheus output.

Adding an entire VSOCK endpoint is not an insignificant change to the API for something that could be done externally.

2 participants