l4proxy: add Prometheus metrics #431

Open
tannevaled wants to merge 2 commits into mholt:master from tannevaled:feat/proxy-metrics
@tannevaled

What

Adds Prometheus metrics to the layer4 proxy, which previously exposed none:

  • caddy_layer4_proxy_connections_total{upstream} — counter of proxied connections
  • caddy_layer4_proxy_active_connections{upstream} — gauge of in-flight proxied connections
  • caddy_layer4_proxy_upstream_healthy{upstream} — gauge (1/0) reflecting active health-check state

Why

Observability was the main gap between the L4 proxy and more mature TCP load balancers: there was no way to see connection counts or upstream health from metrics. These three metrics cover the basics (load and health) per upstream.

Notes (re: de-duplication)

This deliberately does not copy the HTTP server's metrics code. It uses the same mechanism (ctx.GetMetricsRegistry() + promauto.With(registry)), so collectors live on the per-instance registry, reset cleanly across reloads, and surface on the existing admin /metrics endpoint. Updates are nil-safe: a handler that was never provisioned does not panic. The only label is upstream, which bounds cardinality; happy to adjust names/labels to your conventions.
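To illustrate the nil-safety and the open/close lifecycle described above, here is a minimal sketch. It is not the code in this PR: the type proxyMetrics and its methods are hypothetical names, and plain maps guarded by a mutex stand in for the Prometheus counter and gauge vectors so the example needs only the standard library. It relies on the fact that Go allows calling a pointer-receiver method on a nil pointer.

```go
package main

import (
	"fmt"
	"sync"
)

// proxyMetrics is a hypothetical stand-in for the PR's collector set.
// Plain maps replace the Prometheus counter and gauge vectors (an
// assumption made to keep the sketch standard-library only).
type proxyMetrics struct {
	mu     sync.Mutex
	total  map[string]int64 // connections ever proxied, per upstream
	active map[string]int64 // connections currently open, per upstream
}

func newProxyMetrics() *proxyMetrics {
	return &proxyMetrics{
		total:  map[string]int64{},
		active: map[string]int64{},
	}
}

// connOpened records a new proxied connection. A nil receiver is a no-op,
// so a handler that was never provisioned can call it without panicking.
func (m *proxyMetrics) connOpened(upstream string) {
	if m == nil {
		return
	}
	m.mu.Lock()
	defer m.mu.Unlock()
	m.total[upstream]++
	m.active[upstream]++
}

// connClosed records the end of a proxied connection. Only the active
// count drops; the total is monotonic, like a counter.
func (m *proxyMetrics) connClosed(upstream string) {
	if m == nil {
		return
	}
	m.mu.Lock()
	defer m.mu.Unlock()
	m.active[upstream]--
}

// snapshot returns the total and active counts for one upstream.
// A nil receiver reports zeros.
func (m *proxyMetrics) snapshot(upstream string) (total, active int64) {
	if m == nil {
		return 0, 0
	}
	m.mu.Lock()
	defer m.mu.Unlock()
	return m.total[upstream], m.active[upstream]
}

func main() {
	// A nil *proxyMetrics models a handler that was never provisioned.
	var unprovisioned *proxyMetrics
	unprovisioned.connOpened("10.0.0.1:5432") // no-op, no panic

	m := newProxyMetrics()
	m.connOpened("10.0.0.1:5432")
	m.connOpened("10.0.0.1:5432")
	m.connClosed("10.0.0.1:5432")

	total, active := m.snapshot("10.0.0.1:5432")
	fmt.Println(total, active) // prints: 2 1
}
```

In this sketch the nil check lives inside each method, so call sites in the proxy path would not need their own guard.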

Tests

metrics_test.go covers counter/gauge increments and the open/close lifecycle, the health gauge, nil-safety, and that an active health check sets the health gauge to 0 for a dead peer and 1 for a live listener. go test ./modules/l4proxy/ passes; gofmt, go vet, and golangci-lint are clean. The go.mod change only promotes prometheus/client_golang from an indirect to a direct dependency.

The layer4 proxy exposed no metrics. Add per-upstream collectors:

  - caddy_layer4_proxy_connections_total (counter)
  - caddy_layer4_proxy_active_connections (gauge)
  - caddy_layer4_proxy_upstream_healthy (gauge, from active health checks)

Collectors are registered on the instance metrics registry obtained from
the Caddy context (ctx.GetMetricsRegistry), the same mechanism the HTTP
server uses, so they reset cleanly across config reloads and surface on the
existing admin /metrics endpoint. Metric updates are nil-safe so a handler
that was never provisioned (e.g. in tests) does not panic.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

mholt commented Jun 2, 2026

Owner

Thank you for all your contributions lately. I imagine you're using an LLM to assist?

We will look at them soon!

tannevaled commented Jun 2, 2026

Author

Hi @mholt — thanks for the kind note! Yes, co-authored with Claude Opus 4.8. I design, test, and review every change myself before submitting. These PRs implement features I'm missing downstream, so the work is purposeful and scoped. Happy to slow down if the review load is a concern — no rush on your side!

The metric helpers are already at 100% coverage; this adds a Metrics
section to docs/handlers/proxy.md listing the exposed collectors.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>