docs: KEEP-573 four-dashboard observability spec for review #1318
Open
OleksandrUA wants to merge 2 commits into
Draft spec for the four-dashboard rebuild called out in KEEP-573 Phase 1 E. Splits the current single "KeeperHub" dashboard by audience:
- A. Managed Client SLO (exec + Sky/Ajna account team)
- B. Platform Health (TechOps/DevOps on-call)
- C. Customer Workflows (per-org support debugging)
- D. Growth + Revenue (founders + revenue side)

Each dashboard section: variables, panel list with PromQL/LogQL, linked alerts, owner suggestion, open questions.

Implementation plan targets the new grafana git-sync path adopted on 2026-05-20, with files landing under grafana/keeperhub-dashboards/git-sync/. Open questions for the review at the bottom of the doc.

Closes the Phase 1 E acceptance item on the parent ticket. Owners finalized in this PR's review.
Bulk error_type reclassification (backfill re-run, classifier rule change, manual SQL fix) moves rows between the error_type label values of the DB-sourced gauge within a single scrape: one series jumps up and another drops by the same amount. PromQL's increase() reads the gain as new errors and treats the loss as a counter reset, so both series contribute a positive increase. The result is a phantom positive bump that contaminates SLI panels for one window-length.

Documents the symptom and how to recognise it so future engineers don't chase a phantom incident, and references the KEEP-592 analysis.
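The reclassification effect described above can be illustrated with a small simulation. This is a minimal sketch, not Prometheus's actual implementation: `simple_increase` is a hypothetical helper that mimics only the counter-reset rule (a drop is treated as a restart from zero) and omits the extrapolation to window boundaries that the real `increase()` performs. The label values `unknown` and `timeout` and all sample numbers are made up for illustration.

```python
def simple_increase(samples):
    """Simplified increase(): sum sample-to-sample gains, and treat any
    drop as a counter reset to zero (so the post-drop value counts in full).
    Omits Prometheus's extrapolation to the window boundaries."""
    total = 0.0
    for prev, cur in zip(samples, samples[1:]):
        if cur >= prev:
            total += cur - prev   # normal counter growth
        else:
            total += cur          # assumed reset: counted from zero
    return total


# Four scrapes of a DB-sourced gauge, one series per error_type value.
# Between scrape 2 and scrape 3, 40 rows are reclassified from
# "unknown" to "timeout" (hypothetical label values). No new errors occur.
series = {
    "unknown": [100, 100, 60, 60],
    "timeout": [20, 20, 60, 60],
}

# The total row count never changes, so the true number of new errors is 0.
true_new_errors = sum(s[-1] for s in series.values()) - sum(s[0] for s in series.values())

# Per-series increase, then summed as a panel's sum(increase(...)) would do.
per_series = {label: simple_increase(s) for label, s in series.items()}
phantom_total = sum(per_series.values())

print("true new errors:", true_new_errors)
print("per-series increase:", per_series)
print("summed increase:", phantom_total)
```

Under these assumptions the "timeout" series reports an increase of 40 (the gain) and the "unknown" series reports 60 (its whole post-drop value), so the summed panel shows 100 errors in a window where none occurred. The bump persists for as long as the reclassification scrape stays inside the query window.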
Draft of the four-dashboard observability spec called out in KEEP-573 Phase 1 E.
Splits the current single "KeeperHub" dashboard by audience: SLO exec view, platform on-call view, per-org customer-support view, growth/revenue view. Each section in the doc gives variables, panel list with PromQL/LogQL queries, linked alerts, suggested owner, open questions.
Closes the last unticked Phase 1 acceptance item on the parent ticket. Implementation lands in follow-up PRs in grafana/keeperhub-dashboards/git-sync/ (new path adopted 2026-05-20).
Review asks
Test plan