fix: Adjust reconcile duration histogram buckets for millisecond-level precision #56

Merged: pando85 merged 3 commits into master from fix/issue-54-adjust-metrics-buckets on Mar 5, 2026

forkline-bot bot commented Mar 5, 2026

Summary

  • Adjusted histogram bucket boundaries to capture millisecond-level reconcile durations
  • Previous buckets (0.1, 0.5, 1.0, 5.0, 10.0 seconds) were too coarse for actual reconcile times averaging ~0.9ms
  • New buckets provide better granularity: 0.001s, 0.005s, 0.01s, 0.025s, 0.05s, 0.1s, 0.25s, 0.5s, 1.0s, 2.5s, 5.0s, 10.0s

Problem

The current histogram buckets were not useful for monitoring because all reconcile operations completed in less than 0.1 seconds (the smallest bucket). This made it impossible to observe performance variations in the typical reconcile duration range.

Example metrics showed:

robotlb_reconcile_duration_seconds_bucket{controller="robotlb",le="0.1"} 111
robotlb_reconcile_duration_seconds_bucket{controller="robotlb",le="0.5"} 111
robotlb_reconcile_duration_seconds_bucket{controller="robotlb",le="1"} 111
...
robotlb_reconcile_duration_seconds_sum{controller="robotlb"} 0.00090932

All values were in the first bucket, with an average of ~0.9ms.
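
To illustrate why the old layout hid every variation, here is a minimal sketch of how cumulative `le` bucket counts are derived. The `cumulative_counts` helper and the sample durations are hypothetical and not taken from robotlb; Python is used only for illustration.

```python
from typing import Dict, Sequence

# Old boundaries quoted in this PR, in seconds.
OLD_BUCKETS = (0.1, 0.5, 1.0, 5.0, 10.0)


def cumulative_counts(durations: Sequence[float], bounds: Sequence[float]) -> Dict[str, int]:
    """Count observations <= each upper bound (buckets are cumulative)."""
    counts = {str(b): sum(1 for d in durations if d <= b) for b in bounds}
    counts["+Inf"] = len(durations)  # the implicit catch-all bucket
    return counts


# Hypothetical reconcile durations in seconds, all well below 0.1s.
samples = [0.0004, 0.0009, 0.0012, 0.003, 0.008]
old_view = cumulative_counts(samples, OLD_BUCKETS)
print(old_view)
```

Every bucket reports the same count, so the histogram cannot distinguish a 0.4ms reconcile from an 8ms one.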

Solution

New bucket boundaries in seconds:

  • Sub-millisecond: 0.001 (1ms)
  • Milliseconds: 0.005 (5ms), 0.01 (10ms), 0.025 (25ms), 0.05 (50ms)
  • Hundreds of milliseconds: 0.1 (100ms), 0.25 (250ms), 0.5 (500ms)
  • Seconds: 1.0, 2.5, 5.0, 10.0

This captures the typical sub-10ms reconcile times while still tracking potential outliers in the second range.
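
As a rough sketch of how these boundaries separate observations, the snippet below looks up the smallest bucket whose upper bound covers a given duration. The `first_bucket` helper is a hypothetical illustration, not code from this PR.

```python
from bisect import bisect_left
from typing import Sequence

# New boundaries proposed in this PR, in seconds.
NEW_BUCKETS = (0.001, 0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0)


def first_bucket(duration_s: float, bounds: Sequence[float] = NEW_BUCKETS) -> float:
    """Return the smallest upper bound `le` with duration_s <= le, or +Inf."""
    # bisect_left finds the first index whose bound is >= duration_s.
    i = bisect_left(bounds, duration_s)
    return bounds[i] if i < len(bounds) else float("inf")


for d in (0.0009, 0.003, 0.3, 12.0):
    print(d, "->", first_bucket(d))
```

Durations that previously all landed under `le="0.1"` now fall into distinct buckets, and anything above 10s is only counted by `le="+Inf"`.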

Resolves: #54

…l precision

The previous bucket boundaries (0.1, 0.5, 1.0, 5.0, 10.0 seconds) were too
coarse for the actual reconcile durations which average around 0.9ms.
All values were falling into the first bucket (< 0.1s), making the
histogram uninformative.

New buckets provide better granularity:
- 0.001s (1ms), 0.005s (5ms), 0.01s (10ms), 0.025s (25ms), 0.05s (50ms)
- 0.1s (100ms), 0.25s (250ms), 0.5s (500ms), 1.0s, 2.5s, 5.0s, 10.0s

This captures the typical sub-10ms reconcile times while still tracking
outliers in the second range.

Resolves: #54

pando85 commented Mar 5, 2026

@forkline we don't need so many values. Reduce the number of buckets in a useful way, keeping only the boundaries most relevant to the operations.

Simplified bucket boundaries from 12 to 5 values:
- 0.001s (1ms): typical reconcile time (~0.9ms avg)
- 0.01s (10ms): slightly slower operations
- 0.1s (100ms): slow reconciles
- 1.0s: significant outliers
- 10.0s: extreme cases

This maintains useful granularity for monitoring while reducing
cardinality.
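
The cardinality saving can be estimated with a small sketch. It assumes the standard Prometheus histogram layout: one `_bucket` series per finite boundary, one for `le="+Inf"`, plus `_sum` and `_count`, ignoring any extra series a client library may add. The `series_per_label_set` helper is hypothetical.

```python
def series_per_label_set(num_bounds: int) -> int:
    """Time series one histogram exposes for a single label combination."""
    bucket_series = num_bounds + 1  # finite boundaries plus le="+Inf"
    return bucket_series + 2        # plus _sum and _count


before = series_per_label_set(12)  # 12-boundary layout from the first commit
after = series_per_label_set(5)    # simplified 5-boundary layout
print(before, after)
```

Under that assumption, each label combination drops from 15 series to 8.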

forkline-bot bot left a comment

Done. Reduced histogram buckets from 12 to 5 essential values:

  • 0.001s (1ms): Typical reconcile time (~0.9ms average)
  • 0.01s (10ms): Slightly slower operations (10x typical)
  • 0.1s (100ms): Slow reconciles (100x typical)
  • 1.0s: Significant outliers (1000x typical)
  • 10.0s: Extreme cases

The boundaries are spaced logarithmically, one per decade from 1ms to 10s, which covers the full performance range while keeping cardinality low.
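
A minimal sketch of generating such decade-spaced boundaries, assuming they are expressed as powers of ten in seconds; `decade_bounds` is a hypothetical helper, not the code merged in this PR.

```python
from typing import List


def decade_bounds(min_exp: int, max_exp: int) -> List[float]:
    """Return 10**e for every integer exponent in [min_exp, max_exp]."""
    return [10.0 ** e for e in range(min_exp, max_exp + 1)]


# Exponents -3 through 1 correspond to 1ms through 10s.
bounds = decade_bounds(-3, 1)

# Each boundary is ten times the previous one (up to float rounding).
ratios = [hi / lo for lo, hi in zip(bounds, bounds[1:])]
print(bounds, ratios)
```

Generating the list from exponents makes the factor-of-ten step between buckets explicit.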

forkline-bot bot commented Mar 5, 2026

CI Fix Applied

I've pushed a fix for the CI failure. The changes should trigger a new CI run.

Commit:

Waiting for CI to re-run...

pando85 merged commit 0cf09b8 into master Mar 5, 2026
1 check passed
pando85 deleted the fix/issue-54-adjust-metrics-buckets branch March 5, 2026 23:12

Development

Successfully merging this pull request may close these issues.

Adjust metrics buckets to real values
