Add csi-volume-device-exporter: Prometheus exporter for CSI volume-to-device mapping#1039

Closed
sradco wants to merge 1 commit into csi-addons:main from sradco:feat/csi-volume-device-exporter

sradco commented May 14, 2026

Summary

  • Add a new csi-volume-device-exporter binary that discovers CSI volume to block device mappings on each node and exposes them as Prometheus metrics (csi_volume_node_device_info)
  • This enables correlating storage path health metrics (DM-multipath state, NVMe-oF subsystem health) with Kubernetes PersistentVolumes for proactive alerting on storage path degradation
  • Ships with 5 Prometheus alert rules (multipath degraded/lost, NVMe-oF degraded/lost, exporter down), complete with runbooks and promtool-based unit tests
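
To make the correlation idea concrete, here is a minimal sketch of what one sample of an info-style metric such as csi_volume_node_device_info could look like in the Prometheus text exposition format. The label names `persistentvolume` and `device` are assumptions chosen for illustration; the PR description does not list the exporter's actual labels. The sketch uses only the Go standard library rather than the Prometheus client library.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// escaper applies the label-value escaping used by the Prometheus text
// format: backslash, double quote, and newline.
var escaper = strings.NewReplacer(`\`, `\\`, `"`, `\"`, "\n", `\n`)

// infoLine renders one sample of an info-style metric (constant value 1)
// with its labels sorted by name.
func infoLine(name string, labels map[string]string) string {
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	pairs := make([]string, 0, len(keys))
	for _, k := range keys {
		pairs = append(pairs, fmt.Sprintf(`%s="%s"`, k, escaper.Replace(labels[k])))
	}
	return fmt.Sprintf("%s{%s} 1", name, strings.Join(pairs, ","))
}

func main() {
	// Both label names below are hypothetical.
	fmt.Println(infoLine("csi_volume_node_device_info", map[string]string{
		"persistentvolume": "pvc-1234",
		"device":           "dm-2",
	}))
}
```

Because the value is always 1, a path-health series that carries the same device label could be multiplied by this series in PromQL to attach the PersistentVolume name to it, which is the usual way info metrics are consumed.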

Components added

| Directory | Purpose |
| --- | --- |
| cmd/csi-volume-device-exporter/ | Main entry point (standalone binary, DaemonSet) |
| internal/exporter/discovery/ | Volume discovery engines: kubelet (universal), Trident, HPE |
| internal/exporter/monitoring/ | Prometheus metrics and typed alert rule definitions |
| deploy/exporter/ | DaemonSet, PodMonitor, OpenShift SCC manifests |
| docs/csi-volume-device-exporter/ | Feature docs, 5 alert runbooks |
| hack/prom-rule-ci/ | promtool lint + unit test scripts |
| build/Containerfile.exporter | Container image build |
| test/e2e/exporter/ | End-to-end tests |
| tools/generate-exporter-rules/ | Alert YAML codegen |

Discovery engines

  1. Kubelet — Parses vol_data.json + /proc/1/mountinfo + /sys/dev/block/ to map CSI volumes to block devices; handles filesystem, block, and publish-mount volumes
  2. Trident — Reads NetApp Trident tracking JSON files
  3. HPE — Reads HPE CSI deviceInfo.json files
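
The mountinfo step of the kubelet engine can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: the function names are hypothetical, and the sample line uses a made-up pod UID and PV name. It relies only on the documented mountinfo layout, where the third field is the `major:minor` device number and the fifth field is the mount point.

```go
package main

import (
	"fmt"
	"strings"
)

// mountDevices parses mountinfo-formatted text and returns a map from
// mount point to the "major:minor" number of the mounted device.
// Lines with fewer than five fields are skipped. Octal escapes such as
// \040 in mount points are left undecoded in this sketch.
func mountDevices(mountinfo string) map[string]string {
	out := make(map[string]string)
	for _, line := range strings.Split(mountinfo, "\n") {
		f := strings.Fields(line)
		if len(f) < 5 {
			continue
		}
		out[f[4]] = f[2] // mount point -> major:minor
	}
	return out
}

// sysfsBlockPath returns the /sys/dev/block entry for a device number.
// On a real node this entry is a symlink to the device's sysfs directory.
func sysfsBlockPath(majMin string) string {
	return "/sys/dev/block/" + majMin
}

func main() {
	// Hypothetical sample line; a real exporter would read /proc/1/mountinfo.
	const sample = "2001 30 253:4 / /var/lib/kubelet/pods/uid-1/volumes/kubernetes.io~csi/pvc-1/mount rw,relatime shared:1 - ext4 /dev/mapper/mpatha rw\n"
	for mountPoint, dev := range mountDevices(sample) {
		fmt.Println(mountPoint, "->", sysfsBlockPath(dev))
	}
}
```

Matching the resulting mount points against the directories that contain vol_data.json would then tie each device number to a CSI volume.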

All discoverers resolve LUKS-over-multipath stacks, returning the underlying multipath device.
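
One way such a LUKS-over-multipath resolution could work is to walk the device-mapper stack through sysfs, as in the sketch below. It assumes the usual device-mapper conventions: `/sys/block/<dev>/dm/uuid` starts with `CRYPT-` for dm-crypt devices and `mpath-` for multipath devices, and `/sys/block/<dev>/slaves/` lists the devices underneath. The type and function names are hypothetical and are not taken from this PR.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// dmNode holds the two sysfs attributes the resolver needs for one device.
type dmNode struct {
	UUID   string   // contents of <dev>/dm/uuid, empty for non-dm devices
	Slaves []string // entries of <dev>/slaves
}

// loadNode reads a device's dm UUID and slave list below sysBlock
// (normally "/sys/block"). Missing files yield empty fields.
func loadNode(sysBlock, dev string) dmNode {
	var n dmNode
	if b, err := os.ReadFile(filepath.Join(sysBlock, dev, "dm", "uuid")); err == nil {
		n.UUID = strings.TrimSpace(string(b))
	}
	if entries, err := os.ReadDir(filepath.Join(sysBlock, dev, "slaves")); err == nil {
		for _, e := range entries {
			n.Slaves = append(n.Slaves, e.Name())
		}
	}
	return n
}

// resolveMultipath descends from dev through dm-crypt layers and returns
// the first multipath device found. If the stack contains no multipath
// device, dev is returned unchanged.
func resolveMultipath(lookup func(string) dmNode, dev string) string {
	cur := dev
	for depth := 0; depth < 8; depth++ { // bound the walk defensively
		n := lookup(cur)
		if strings.HasPrefix(n.UUID, "mpath-") {
			return cur
		}
		if !strings.HasPrefix(n.UUID, "CRYPT-") || len(n.Slaves) != 1 {
			break
		}
		cur = n.Slaves[0]
	}
	return dev
}

func main() {
	// On a real node: lookup reads live sysfs state.
	lookup := func(dev string) dmNode { return loadNode("/sys/block", dev) }
	fmt.Println(resolveMultipath(lookup, "dm-5"))
}
```

Passing the lookup as a function keeps the walk itself independent of the filesystem, so it can be exercised against an in-memory topology in unit tests.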

Based on: openshift-virtualization/csi-volume-device-exporter#1

Test plan

  • Unit tests pass: go test -race ./internal/exporter/...
  • Build succeeds: go build ./cmd/csi-volume-device-exporter/...
  • go vet clean
  • CI passes
  • E2E test on a node with CSI volumes: go test -tags=e2e ./test/e2e/exporter/
  • promtool alert rule tests: hack/prom-rule-ci/verify-rules.sh
  • Container image builds: make docker-build-exporter

Made with Cursor

…-device mapping

Add a new csi-volume-device-exporter binary that discovers CSI volume
to block device mappings on each node and exposes them as Prometheus
metrics. This enables correlating storage path health metrics (multipath
state, NVMe-oF subsystem health) with Kubernetes PersistentVolumes.

Components:
- cmd/csi-volume-device-exporter: main entry point
- internal/exporter/discovery: volume discovery engines (kubelet,
  trident, HPE) with sysfs/mountinfo resolution
- internal/exporter/monitoring: Prometheus metrics and alert rules
- deploy/exporter: DaemonSet, PodMonitor, and OpenShift SCC manifests
- docs/csi-volume-device-exporter: feature docs and alert runbooks
- hack/prom-rule-ci: promtool-based alert rule tests
- build/Containerfile.exporter: container image build
- test/e2e/exporter: end-to-end tests

Alert rules:
- CSIVolumeMultipathDegraded/Lost: multipath path health
- CSIVolumeNVMeSubsystemDegraded/Lost: NVMe-oF subsystem health
- CSIVolumeDeviceExporterDown: exporter availability

Based on: openshift-virtualization/csi-volume-device-exporter#1

Co-authored-by: Cursor <cursoragent@cursor.com>

mergify bot added the vendor label (Pull requests that update vendored dependencies) May 14, 2026

sradco commented May 14, 2026

Closing to rework the content before resubmitting.

sradco closed this May 14, 2026
