Pre-flight Checks
I have searched existing issues and this is not a duplicate
I understand this issue needs status:approved before a PR can be opened
Problem Description
Operators need visibility into old or stale events that may require human review, especially collection-failure events that can remain open for a long time after metrics or relationships change.
The business rule is that events should not be silently closed by the system without operator/monitorist review. However, if old events are never surfaced again, they can be forgotten and continue aging indefinitely.
This came up while planning #152: SNMP no-response collection failures should be fixed going forward, but historical stale/orphan events should not be auto-cleaned without human intervention.
Proposed Solution
Add an operator-facing reminder/recommendation mechanism for stale events.
Possible behavior:
- Detect candidate stale events using safe criteria, for example:
  - old OPEN or ACK events beyond a configurable age;
  - collection-failure events whose metric/CI relationship no longer exists;
  - events not refreshed for a long time;
  - initially focused on SNMP collection-failure events, later extensible to other event families.
- Show them in a notifications/recommendations section instead of closing them automatically.
- Include enough context for the operator to decide:
  - event id/title/severity/status;
  - CI and metric, when still resolvable;
  - age, last_seen, and reason for recommendation;
  - suggested action such as review/close/recover if applicable.
- Require explicit operator action for any closure/recovery.
- Record audit metadata when an operator acts on a recommendation.
- Keep the recommendation process dry-run/read-only by default.
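To make the proposal easier to discuss, the behavior above could be sketched roughly as follows. This is only an illustration, not the project's actual model: the `Event` and `Recommendation` fields, the `snmp_collection_failure` family name, the 30-day and 7-day thresholds, and both function names are assumptions chosen for the example. Detection is read-only and returns recommendations; nothing is closed.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, Optional, Tuple

# All names, fields, and thresholds here are hypothetical illustrations.
ACTIVE_STATUSES = ("OPEN", "ACK")
INITIAL_FAMILY = "snmp_collection_failure"  # initial scope; extensible later


@dataclass(frozen=True)
class Event:
    id: int
    title: str
    severity: str
    status: str
    family: str
    created_at: datetime
    last_seen: datetime
    ci: Optional[str]      # None when the CI is no longer resolvable
    metric: Optional[str]  # None when the metric is no longer resolvable


@dataclass(frozen=True)
class Recommendation:
    event: Event
    age: timedelta
    reasons: Tuple[str, ...]
    suggested_action: str


def recommend_stale(
    events: Iterable[Event],
    now: datetime,
    max_age: timedelta = timedelta(days=30),
    max_silence: timedelta = timedelta(days=7),
) -> List[Recommendation]:
    """Return recommendations only; no event is modified or closed."""
    recs = []
    for ev in events:
        if ev.status not in ACTIVE_STATUSES or ev.family != INITIAL_FAMILY:
            continue
        age = now - ev.created_at
        reasons = []
        if age > max_age:
            reasons.append("older than configured age")
        if ev.ci is None or ev.metric is None:
            reasons.append("metric/CI relationship no longer exists")
        if now - ev.last_seen > max_silence:
            reasons.append("not refreshed recently")
        if reasons:
            # The sketch always suggests review; close/recover stays a human call.
            recs.append(Recommendation(ev, age, tuple(reasons), "review"))
    return recs


def audit_entry(rec: Recommendation, operator: str, action: str, now: datetime) -> dict:
    """Build the audit record for an explicit operator decision."""
    if not operator:
        raise ValueError("an explicit operator is required")
    return {
        "event_id": rec.event.id,
        "operator": operator,
        "action": action,
        "reasons": list(rec.reasons),
        "acted_at": now.isoformat(),
    }


# Usage: one orphaned 90-day-old event yields a single recommendation.
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
orphan = Event(1, "SNMP no response", "major", "OPEN", INITIAL_FAMILY,
               now - timedelta(days=90), now - timedelta(days=60), None, None)
for rec in recommend_stale([orphan], now):
    print(rec.event.id, rec.age.days, rec.reasons, rec.suggested_action)
```

Keeping detection as a pure function that returns data is one way to satisfy the read-only default: any closure or recovery would go through a separate, operator-triggered path that also writes the audit record.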
Affected Area
Other
Alternatives Considered
Additional Context
Related to #152, but should be implemented separately so the SNMP no-response severity fix remains focused.