l4proxy: add rise/fall thresholds to active health checks #427
Open
tannevaled wants to merge 2 commits into
The active health checker flipped a peer's health on every single check result, which makes it flap on a transient blip.

Add rise/fall thresholds (HAProxy-style): a peer is marked unhealthy only after `fall` consecutive failed checks and healthy again only after `rise` consecutive successful checks.

Caddyfile: `health_fall <int>`, `health_rise <int>`. Both default to 1, which preserves the existing flip-on-first-result behavior.

Streaks are tracked per peer under a mutex and capped at the threshold.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Document the rise/fall active-health-check thresholds in `docs/handlers/proxy.md` and add a `caddyfile_adapt` integration test. The `recordActiveCheck` streak logic is already unit-tested at 100%.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
What

Adds HAProxy-style `rise`/`fall` thresholds to the active health checker:

- `fall` (Caddyfile `health_fall <int>`): number of consecutive failed checks before a peer is marked unhealthy.
- `rise` (Caddyfile `health_rise <int>`): number of consecutive successful checks before an unhealthy peer is marked healthy again.

Both default to `1`, which is exactly the current behavior (flip on the first result), so this is fully backward compatible.

Why

Today the checker calls `setHealthy` on every individual check, so a single transient failure (or success) immediately flips a peer's state. That makes health flap on a brief blip, which is undesirable, e.g. for database failover where a momentary timeout shouldn't trigger a switchover. `fall 3` / `rise 2` smooths that out.

Implementation

Per-peer consecutive-success / consecutive-failure streaks are tracked under a mutex (`recordActiveCheck`) and capped at the threshold so the counters stay bounded. A failure streak reaching `fall` marks the peer unhealthy; a success streak reaching `rise` marks it healthy.

Tests

`risefall_test.go`: streak logic (incl. reset on opposite result and the default-to-1 behavior), an end-to-end fall threshold via `doActiveHealthCheck`, and Caddyfile parsing (happy + duplicate/invalid errors). `go test ./modules/l4proxy/` passes; `gofmt` / `go vet` / `golangci-lint` clean.
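
For reviewers who want the streak behavior from the Implementation section in one place, here is a minimal standalone sketch. It is not the code in this PR: the `peerHealth` type, its fields, and the `record` method are hypothetical names chosen for illustration, and the real `recordActiveCheck` may differ in signature and in how it reaches `setHealthy`. The sketch assumes thresholds below 1 are treated as 1, matching the stated defaults.

```go
package main

import (
	"fmt"
	"sync"
)

// peerHealth is a hypothetical stand-in for a proxied peer's health state.
type peerHealth struct {
	mu        sync.Mutex
	healthy   bool
	successes int // current consecutive-success streak, capped at rise
	failures  int // current consecutive-failure streak, capped at fall
}

// record applies one active check result and returns the resulting state.
// A result of the opposite kind resets the other streak to zero.
func (p *peerHealth) record(ok bool, rise, fall int) bool {
	// Assumption: non-positive thresholds fall back to the default of 1.
	if rise < 1 {
		rise = 1
	}
	if fall < 1 {
		fall = 1
	}

	p.mu.Lock()
	defer p.mu.Unlock()

	if ok {
		p.failures = 0
		if p.successes < rise {
			p.successes++ // cap the streak so the counter stays bounded
		}
		if p.successes >= rise {
			p.healthy = true
		}
	} else {
		p.successes = 0
		if p.failures < fall {
			p.failures++
		}
		if p.failures >= fall {
			p.healthy = false
		}
	}
	return p.healthy
}

func main() {
	// rise=2, fall=3: two failures are absorbed, the third marks unhealthy,
	// and two consecutive successes are needed to recover.
	p := &peerHealth{healthy: true}
	results := []bool{false, false, true, false, false, false, true, true}
	for _, ok := range results {
		fmt.Printf("check ok=%-5v healthy=%v\n", ok, p.record(ok, 2, 3))
	}
}
```

With `rise` and `fall` both set to 1, every call flips the state to match the latest result, which is the flip-on-first-result behavior the PR keeps as the default.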