Skip to content

fix(dashboards): de-hollow the last two persona panels missed by the KSM rewrite#57

Merged
stxkxs merged 1 commit into
mainfrom
dehollow-remaining-panels
Jun 24, 2026
Merged

fix(dashboards): de-hollow the last two persona panels missed by the KSM rewrite#57
stxkxs merged 1 commit into
mainfrom
dehollow-remaining-panels

Conversation

@stxkxs

@stxkxs stxkxs commented Jun 24, 2026

Copy link
Copy Markdown
Member

An o11y audit found two panels still over metrics that emit nothing in prod:

  • agent-founder "Top initiatives" queried agents_agent_invocations_total — a counter nothing in the org registers, grouped by platform_id (a label no series carries). Repointed to top-10 Tenants by kube_customresource_status_field aggregate spend (the metric family the board's other panels already use), fitting the exec/spend theme and populated now.
  • agent-tenants "Platform ready ratio" queried the recording rule agents:platforms:ready_ratio, defined only in the operator's PrometheusRule — inert in prod (no ruler). Replaced with the inline kube_customresource_status_phase form already used on the agent-operator board.

Both boards now render entirely over series that reach AMP. Embedded JSON parses, yamllint + all dashboard overlays build.

…KSM rewrite

An observability audit found two panels still querying metrics that produce nothing
in prod — the same hollowness the agent-* rewrite was meant to eliminate.

agent-founder "Top initiatives" — queried agents_agent_invocations_total, a counter
no operator/portal/agentgateway code ever registers, grouped by platform_id, a label
no series carries. Repointed to a real, populated series that fits the exec board's
spend theme: top-10 Tenants by aggregate spend over the kube-state-metrics
customResourceState projection (Tenant.status.aggregateSpendUsd) — the same metric
family the board's other spend panels already use.

agent-tenants "Platform ready ratio" — queried the recording rule
agents:platforms:ready_ratio, which only the operator's PrometheusRule defines. Prod
has no ruler (no prometheus-operator, no AMP ruler), so the series never exists.
Replaced with the inline form already used on agent-operator: count of Platform CRs
in Ready phase over the total, both from kube_customresource_status_phase.

Both boards now render entirely over series that reach AMP.
@github-actions

Copy link
Copy Markdown

CI Results

Check Status
YAML Lint
Environment Kustomize Build
dev
staging
production
hub

All validations passed.

@stxkxs stxkxs merged commit 5570c17 into main Jun 24, 2026
8 checks passed
@stxkxs stxkxs deleted the dehollow-remaining-panels branch June 24, 2026 01:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant