feat(release): end-of-run fleet inventory report (completeness + freshness) #233

Open
elronbandel wants to merge 4 commits into main from elron/fleet-report

@elronbandel

What

Adds an end-of-run fleet inventory report to the release workflow's report job. Every release now writes, to the run summary + a durable fleet-inventory.tsv artifact:

  • Inventory by freshness — every published container image bucketed (base / leaf / per-task / combo / standalone), split into built this run vs older (registry push-time vs the run's start), with the older span.
  • Failures this run — failed jobs bucketed by stage, full list in a <details>.
  • Stale-layer warnings: ::warning annotations for consumable layers (base/combo/standalone) that did NOT refresh this run.
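
The freshness split described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual script (which is a shell `run:` step): the function name `summarize_freshness`, the `(category, pushed_at)` input shape, and the output keys are all assumptions made for the example. It assumes each image has already been assigned a category and carries a timezone-aware registry push time.

```python
from collections import defaultdict
from datetime import datetime, timezone


def summarize_freshness(images, run_start):
    """Split images per category into 'built this run' vs 'older'.

    images: iterable of (category, pushed_at) pairs (hypothetical shape),
            where pushed_at is a timezone-aware datetime.
    run_start: timezone-aware datetime of the workflow run's start.
    """
    rows = defaultdict(lambda: {"total": 0, "fresh": 0, "older": []})
    for category, pushed_at in images:
        row = rows[category]
        row["total"] += 1
        if pushed_at >= run_start:
            # Pushed at or after the run started: counts as built this run.
            row["fresh"] += 1
        else:
            row["older"].append(pushed_at)

    summary = {}
    for category, row in rows.items():
        older = row["older"]
        # The older span is the earliest and latest push date among stale images.
        span = None
        if older:
            span = (min(older).date().isoformat(), max(older).date().isoformat())
        summary[category] = {
            "total": row["total"],
            "built_this_run": row["fresh"],
            "older": len(older),
            "older_span": span,
        }
    return summary
```

Comparing push time against the run's start (rather than against a fixed age threshold) means an image counts as fresh only if this specific run published it.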

Why

The release's only report was per-leaf size/build-time. Nothing surfaced which images were stale, missing, or failed — so v0.1.0 shipped with ~6,400 stale combo/standalone images (06-18/06-19, pre-otel/gosu fix) and the only way to learn that was inspecting GHCR by hand. This makes "did the release publish a complete, fresh fleet?" answerable at a glance.

Sample (rendered locally against run 27940336791)

| category | total | built this run | older | older span |
|---|---|---|---|---|
| base | 20 | 19 | 1 | 2026-06-18 → 2026-06-18 |
| leaf | 127 | 124 | 3 | 2026-06-16 → 2026-06-16 |
| per-task | 663 | 90 | 573 | 2026-06-21 → 2026-06-22 |
| combo | 3211 | 2 | 3209 | 2026-06-16 → 2026-06-19 |
| standalone | 3200 | 2 | 3198 | 2026-06-18 → 2026-06-19 |

Failures: 73 — combos 42 · per-task 28 · leaf 2 · compose 1 (matches the run's actual failures exactly).
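
A failure line like the one above could be produced along these lines. This is a hedged sketch: the job-name prefixes in `STAGE_PREFIXES` and the helper names are hypothetical, since the real workflow's job names are not shown here. The `name` and `conclusion` fields are the ones the GitHub Actions jobs API returns for each job.

```python
from collections import Counter

# Hypothetical mapping from job-name prefix to stage label; the actual
# workflow's job naming is an assumption for this example.
STAGE_PREFIXES = (
    ("combo", "combos"),
    ("per-task", "per-task"),
    ("leaf", "leaf"),
    ("compose", "compose"),
)


def stage_of(job_name):
    """Return the stage label for a job name, or 'other' if none matches."""
    for prefix, stage in STAGE_PREFIXES:
        if job_name.startswith(prefix):
            return stage
    return "other"


def failure_line(jobs):
    """Summarize failed jobs by stage.

    jobs: iterable of dicts with 'name' and 'conclusion' keys.
    """
    counts = Counter(
        stage_of(job["name"]) for job in jobs if job.get("conclusion") == "failure"
    )
    total = sum(counts.values())
    # most_common() orders stages by descending failure count.
    parts = " · ".join(f"{stage} {n}" for stage, n in counts.most_common())
    return f"Failures: {total}" + (f" ({parts})" if parts else "")
```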

Verified

  • actionlint clean (shellchecks the run: script).
  • The exact script, run locally against a real run, reproduces the inventory + the 73-failure split.
  • Only CI-specific unknown: whether GITHUB_TOKEN (with packages: read, already granted via packages: write) can list org packages. This is handled gracefully (empty → "inventory unavailable", never fails the release) and is to be confirmed on the first release run after merge.

Adds actions: read so the report job can read this run's job results (the existing build-time column uses the same API).

…hness)

The release reported only per-leaf size/build-time, so stale, missing, or
failed images stayed invisible — a published-but-stale combo could hide behind
green stages (v0.1.0's combos were ~all 06-18/06-19 and nothing flagged it; it
was found by hand in GHCR).

Add a step to the report job that cross-checks the registry against the run:
- every container package bucketed (base/leaf/per-task/combo/standalone) and
  split built-this-run vs older (push-time vs run start) -> step summary + a
  durable fleet-inventory.tsv artifact;
- failed jobs bucketed by stage;
- ::warning annotations for consumable layers (base/combo/standalone) that did
  not refresh this run.
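
The stale-layer annotations in the last item could look roughly like this. It is a sketch under assumptions: `stale_warnings` and its `older_counts` input (category to number of images not refreshed) are names invented for illustration, and the message wording is not taken from the PR. The `::warning title=...::message` form is GitHub Actions' workflow-command syntax; a step would print each returned line to stdout.

```python
# Consumable layers named in the commit message above.
CONSUMABLE = ("base", "combo", "standalone")


def stale_warnings(older_counts):
    """Build one ::warning workflow command per stale consumable category.

    older_counts: dict mapping category -> count of images that were NOT
                  refreshed this run (hypothetical input shape).
    """
    lines = []
    for category in CONSUMABLE:
        older = older_counts.get(category, 0)
        if older:
            lines.append(
                f"::warning title=Stale {category} layers::"
                f"{older} {category} image(s) did not refresh this run"
            )
    return lines
```

Leaf and per-task counts are ignored here on purpose, matching the restriction to consumable layers.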

Adds actions:read so the report job can read this run's job results.

Signed-off-by: Elron Bandel <elron.bandel@ibm.com>
… + failures)

Signed-off-by: Elron Bandel <elron.bandel@ibm.com>
…rge)

Signed-off-by: Elron Bandel <elron.bandel@ibm.com>
… GITHUB_TOKEN fallback

GITHUB_TOKEN can't list org packages (confirmed in CI: total=0; the failure
summary + freshness window DO work on it). Use a GHCR_READ_TOKEN secret (a
read:packages PAT) for the org-packages listing call only, falling back to
GITHUB_TOKEN when unset — inventory then shows 'unavailable', never failing
the release. Drops the temporary debug line.
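
The fallback described in this commit can be sketched as below. The environment variable names GHCR_READ_TOKEN and GITHUB_TOKEN come from the commit message; the function names and the non-empty summary wording are assumptions for illustration, and the real step is a shell script rather than Python.

```python
import os


def pick_listing_token(env=None):
    """Prefer GHCR_READ_TOKEN for the org-packages listing call,
    falling back to GITHUB_TOKEN when it is unset or empty."""
    env = os.environ if env is None else env
    return env.get("GHCR_READ_TOKEN") or env.get("GITHUB_TOKEN") or ""


def inventory_section(packages):
    """Degrade gracefully: an empty listing reports 'inventory unavailable'
    instead of raising, so the release is never failed by this step."""
    if not packages:
        return "inventory unavailable"
    # Hypothetical wording for the non-empty case.
    return f"{len(packages)} container packages listed"
```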

Signed-off-by: Elron Bandel <elron.bandel@ibm.com>