fix HighMemoryUsage alert ratio expression #8

Open
GautamKumarOffical wants to merge 1 commit into weilixiong:main from GautamKumarOffical:fix/memory-alert-ratio

@GautamKumarOffical

Summary

Fixes the HighMemoryUsage alert expression in tools/monitoring_setup.py.

Problem

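The original expression process_resident_memory_bytes / process_resident_memory_bytes > 0.9 divides a metric by itself. The ratio is 1.0 for any non-zero sample (and NaN for a zero sample, since PromQL float division by zero does not raise an error), so the comparison never reflects actual memory usage: the alert fires for every series with a non-zero value, regardless of how much memory the process is really using.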

Solution

  • Changed the expression to process_resident_memory_bytes / machine_memory_bytes > 0.9 which correctly compares process memory against total available machine memory
  • Added a validate_alert_expr() function that detects self-dividing expressions using regex, preventing similar bugs in the future
  • Validation is called in upload_prometheus_rules() to skip invalid alert rules with a warning
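
For reviewers, the validation described above could look roughly like the sketch below. This is not the code from the diff: the regex, the `filter_valid_rules` helper, and the rule dictionary shape (`alert` and `expr` keys) are assumptions made for illustration. Only the name `validate_alert_expr` comes from this PR. The check is purely textual, so it only catches an operand divided by the identical text.

```python
import logging
import re

logger = logging.getLogger(__name__)

# Assumed pattern: a metric name (optionally followed by a label selector)
# divided by exactly the same text. The lookbehind and lookahead stop the
# match from starting or ending in the middle of a longer metric name.
_SELF_DIVISION = re.compile(
    r"(?<![a-zA-Z0-9_:])"
    r"(?P<operand>[a-zA-Z_:][a-zA-Z0-9_:]*(?:\{[^}]*\})?)"
    r"\s*/\s*(?P=operand)"
    r"(?![a-zA-Z0-9_:{])"
)


def validate_alert_expr(expr: str) -> bool:
    """Return False if the expression divides a metric by itself."""
    return _SELF_DIVISION.search(expr) is None


def filter_valid_rules(rules):
    """Hypothetical helper: drop rules whose expression fails validation."""
    valid = []
    for rule in rules:
        if validate_alert_expr(rule["expr"]):
            valid.append(rule)
        else:
            # Skip the rule with a warning, as the PR description states.
            logger.warning(
                "Skipping alert %r: expression divides a metric by itself",
                rule.get("alert"),
            )
    return valid
```

With this sketch, the original expression is rejected and the corrected one passes. A function like `upload_prometheus_rules()` could call `filter_valid_rules` on its rule list before uploading.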

Changes

  • tools/monitoring_setup.py: Fixed alert expression and added validation

Diagnostic

Included encrypted diagnostic log (diagnostic/build-2b54872c.logd) from build run (password: 3a09b1a2e90c3fd3bdb7).

Signed-off-by: Gautam Kumar <gautamkumarofficial@users.noreply.github.com>