ci(vv): aggregate Pages publish into one workflow to avoid concurrency cancellations#128
Open
renatosfagundes wants to merge 5 commits into
…y cancellations

Each of the five module V&V workflows used to deploy its own HTML reports to gh-pages via a `publish-pages` job. With five workflows triggered in parallel on every push to development/main, three publish jobs were cancelled by GitHub Actions' "1 running + 1 queued" concurrency limit (group `gh-pages-deploy`). Net effect: 2-3 of the five module reports never updated on Pages despite the gates passing.

Replace with a single aggregator (`publish-vv-reports.yml`) triggered by `workflow_run` on completion of any of the five V&V workflows. Each invocation polls the GitHub API to check whether all five sibling workflows for the same SHA have completed successfully; if not, exits silently. Whichever invocation runs last performs a single deploy that includes every module's artefacts. With one job per push contending for the concurrency group, no cancellations.

Removes the `publish-pages` job from:
- vv-can.yml
- vv-decision.yml
- vv-perception.yml
- vv-pid-alert.yml
- vv-uds.yml

The publish-index.yml workflow is unchanged; it continues to deploy the docs/index.html landing page, while this aggregator deploys the per-module subdirectories under it. Both share the `gh-pages-deploy` concurrency group, but with at most 2 contenders (index + reports) instead of 6, neither gets cancelled.

Tested locally: all five trimmed workflows pass YAML lint, and the aggregator matches the existing artefact-naming convention (`vv-<module>-reports-run<N>`).
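The cancellation arithmetic above can be checked with a small model. This is only a sketch, assuming every publish job arrives while the first one is still running; `simulate_group` and the job labels are hypothetical names for illustration, not part of the repository.

```python
def simulate_group(arrivals: list[str]) -> dict[str, list[str]]:
    """Model one concurrency group that allows 1 running + 1 pending job.

    Assumption: all jobs arrive while the first is still running, so each
    new arrival replaces (and thereby cancels) the previously pending job.
    """
    running = None
    pending = None
    cancelled: list[str] = []
    for job in arrivals:
        if running is None:
            running = job              # first arrival starts immediately
        else:
            if pending is not None:
                cancelled.append(pending)  # older pending job is cancelled
            pending = job              # newest arrival takes the pending slot
    completed = [j for j in (running, pending) if j is not None]
    return {"completed": completed, "cancelled": cancelled}


# Five per-module publish jobs racing for the same group.
before = simulate_group(["vv-can", "vv-decision", "vv-perception", "vv-pid-alert", "vv-uds"])
# After the change: only the index deploy and the aggregator contend.
after = simulate_group(["publish-index", "publish-vv-reports"])
print(before["cancelled"], after["cancelled"])
```

Under that assumption, five contenders leave three cancelled jobs, while two contenders fit into the running and pending slots with nothing cancelled.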
…eport URLs

Addresses Rian's review of PR #126:

1. Gate deadlock on single-module pushes (HIGH)
   The V&V workflows' `paths:` filters mean a commit touching one module's files only triggers that module's workflow. Previously the gate waited for all 5 workflows to show completed-success for the SHA; the 4 that never ran returned [] from `gh run list`, the gate parsed that as status=unknown, set all_done=false, and exited silently forever, so Pages never updated. Now the gate and the download loop treat "no run for this SHA" as "not applicable" and deploy once every triggered workflow has completed successfully. Modules that did not run keep their previous gh-pages content untouched via keep_files: true.

2. URL drift on step-summary browse-links (MEDIUM)
   The previous per-module publish-pages jobs used destination_dir: vv_<module>/<branch>/, so the live URL was vv_<module>/development/ (or /main/). The aggregator now restores that namespacing by downloading into site/<module>/<branch>/, so the five step summaries (Browse reports: …/vv_<module>/${ref_name}/) resolve again, and dev/main snapshots stay isolated instead of overwriting each other.

3. Stale header comments (LOW)
   Refreshed the "Reports are published" blurb in all 5 vv-*.yml files and the matching paragraph in publish-index.yml to point readers at publish-vv-reports.yml as the canonical deployer.

4. Workflow-name sync hazard
   Added inline comments at the workflow_run trigger list and the bash arrays warning that the five names must stay in lockstep.

Validation:
- yaml.safe_load passes for all 7 touched workflow files.
- Dry-runs of the new gate against `main` SHA 5cf8eba (the broken release, 3-of-5 publish-pages cancelled) and `development` SHA b89ba86 (real partial trigger, Perception did not run) both evaluate to DEPLOY and would publish the live artefacts.
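The revised gate rule from item 1 can be sketched as a pure function. This is an illustration, not the workflow's actual bash: `gate_decision`, the `WAIT`/`SKIP` labels, and the workflow display names are hypothetical, and the input is assumed to have the shape that `gh run list --json status,conclusion` returns per workflow, newest run first.

```python
def gate_decision(runs_by_workflow: dict[str, list[dict]]) -> str:
    """Decide whether this aggregator invocation should deploy.

    An empty run list means the workflow never ran for this SHA and is
    treated as "not applicable" instead of blocking the gate.
    """
    # Keep only workflows that actually ran; assume the first entry is the latest run.
    applicable = {name: runs[0] for name, runs in runs_by_workflow.items() if runs}
    if not applicable:
        return "SKIP"   # nothing ran for this SHA, nothing to publish
    if any(run.get("status") != "completed" for run in applicable.values()):
        return "WAIT"   # a sibling is still running; a later invocation deploys
    if any(run.get("conclusion") != "success" for run in applicable.values()):
        return "SKIP"   # a triggered gate failed, so do not publish
    return "DEPLOY"


ok = {"status": "completed", "conclusion": "success"}
# Partial trigger: Perception did not run, the other four passed.
partial = {
    "V&V CAN": [ok],
    "V&V Decision": [ok],
    "V&V Perception": [],
    "V&V PID Alert": [ok],
    "V&V UDS": [ok],
}
print(gate_decision(partial))
```

Under the old rule the empty `V&V Perception` list would have kept the gate waiting forever; here it is simply dropped from the set of runs that must succeed.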
…lish ci(vv): aggregate Pages publish into one workflow to avoid concurrency cancellations
The Makefile is in every vv-*.yml `paths:` filter, so this no-op comment bump fires all five V&V workflows on the merge commit. With publish-vv-reports.yml now on development, the aggregator fans them in and produces a single gh-pages deploy commit refreshing vv_<module>/development/ for every module (last refresh was 2026-05-06, before the publish-pages race fix landed). This same commit, carried forward in the upcoming development -> main PR, will trigger the equivalent fan-in on main and finally populate vv_can/main/, vv_decision/main/, and vv_uds/main/ -- which have been 404 since the v2.0.0 release because their publish-pages jobs lost the concurrency race on every push. No build/runtime impact; comment-only change.
chore(vv): trigger V&V workflows to repopulate gh-pages snapshots
Summary
- … `development` to `main`, fixing the publish race that left 2-3 of the five module HTML reports stale on every push to a trunk branch since the v2.0.0 release.
- … `publish-pages` jobs with a single aggregator workflow (`.github/workflows/publish-vv-reports.yml`) triggered by `workflow_run` on completion of any of the five module V&V workflows.
- … `gh run list` for every workflow that ran on the SHA; treats "no run for this SHA" as not-applicable (single-module pushes deploy what ran), and deploys once every triggered workflow completes successfully.
- … (`vv_<module>/<branch>/`), preserving dev/main isolation that the previous per-module `destination_dir` provided and that every V&V step-summary browse-link still expects.
- … `vv-*.yml` files and the matching paragraph in `publish-index.yml` to point readers at `publish-vv-reports.yml`.
Related Issue
Closes #127
Change Type
AEB Areas Affected
Requirements Impacted
Functional Requirements
Non-Functional Requirements
… `development`/`main` whose V&V gates pass must publish the corresponding module reports to GitHub Pages.
Artifacts Updated
Validation Evidence
Evidence
Failure observed on main HEAD (5cf8eba) — the v2.0.0 release:
Cancelled jobs entered `completed/cancelled` ~3 s after start with `steps: []`: the classic "third arrival to a `cancel-in-progress: false` group cancels the previously pending job" pattern.
Dry-run of the fixed aggregator against the same SHA (no deploy side effects, `gh run list` + `gh run view`):
Same dry-run on a real partial-trigger push (`b89ba86d02` on development, where only 4 of 5 V&V workflows fired because path filters didn't match Perception):
Local validation: `yaml.safe_load` passes on all 7 touched workflow files.
Reviewer Notes
- … `Gate` step in `publish-vv-reports.yml`: it's the only piece preventing duplicate or empty deploys. The three "exit silently" branches must not error or they cancel their own concurrency slot.
- … `name:` strings in three places (`workflow_run` trigger list, gate's bash array, `art_prefix` table). Inline `# NOTE:` comments flag this for future maintainers; there's no CI check enforcing the sync.
- … `concurrency: gh-pages-deploy` so it serialises with `publish-index.yml`. With per-module publish jobs gone, max in-flight on the group is 2 (this one + index), which fits the 1 running + 1 pending limit, so nothing is cancelled.
Risks / Open Points
- … `vv-*.yml` `paths:` filter, so merging this PR won't itself trigger any V&V workflow; true end-to-end validation lands with the next source-code commit on `main`.
- … `name:` field silently drops that module unless all three lists in `publish-vv-reports.yml` are updated. Adding a CI check that diffs the lists is a sensible follow-up.
- `gh run download` reads from per-run artefacts (90-day default). A re-run of the aggregator alone on an old SHA whose artefacts have expired would fail to fetch; acceptable because Pages publication is forward-only.
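The follow-up check on the three name lists could look roughly like the sketch below. It assumes the lists have already been extracted from `publish-vv-reports.yml` by some other step; `find_name_drift`, the list labels, and the workflow display names are hypothetical and only illustrate the comparison.

```python
def find_name_drift(lists: dict[str, list[str]]) -> dict[str, list[str]]:
    """Return, per list, the workflow names it lacks relative to the others.

    An empty result means every list carries the same set of names.
    """
    union: set[str] = set().union(*lists.values())
    drift: dict[str, list[str]] = {}
    for label, names in lists.items():
        missing = sorted(union - set(names))
        if missing:
            drift[label] = missing
    return drift


# Hypothetical extraction result: one workflow was renamed in the trigger
# list only, so the gate array and the art_prefix table are now stale.
extracted = {
    "workflow_run": ["V&V CAN (renamed)", "V&V UDS"],
    "gate_array": ["V&V CAN", "V&V UDS"],
    "art_prefix": ["V&V CAN", "V&V UDS"],
}
drift = find_name_drift(extracted)
if drift:
    print("workflow name lists out of sync:", drift)
```

A CI job could fail whenever the returned mapping is non-empty, turning the silent drop into a visible error at review time.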