ci(nightly): fall back to rhdh-hub-rhel9 when next-1.y tag expires #443

Merged
rm3l merged 1 commit into redhat-developer:main from rm3l:fix/nightly-fallback-expired-next-tag on Jun 19, 2026

@rm3l rm3l commented Jun 19, 2026 (Member)

Description of the change

The next-1.y tags on quay.io/rhdh-community/rhdh can expire (e.g., next-1.8 is already gone), causing the nightly workflow to fail for older release branches.

This updates the nightly workflow to use skopeo inspect at runtime to check whether next-1.y still exists. If the tag has expired, it falls back to quay.io/rhdh/rhdh-hub-rhel9:1.y instead.

This is a general-purpose fix — future tag expirations will be handled automatically without workflow edits.
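
The selection logic described above can be sketched as a small shell function. This is an illustrative sketch rather than the workflow's exact step: the function name `select_image` and the space-separated output format are assumptions made for the example, while the image references and the `skopeo inspect --no-tags` probe come from this PR.

```shell
# select_image BRANCH
# Prints "<repo> <tag>" for a branch such as "main" or "release-1.8".
# The body runs in a subshell, so its variables do not leak to the caller.
# NOTE: select_image is a hypothetical helper name, not part of the workflow.
select_image() (
  branch="$1"
  if [ "$branch" = "main" ]; then
    echo "rhdh-community/rhdh next"
  else
    version="${branch#release-}"      # release-1.8 -> 1.8
    candidate_tag="next-${version}"
    # Any non-zero exit from skopeo is treated as "tag missing" here.
    if skopeo inspect --no-tags "docker://quay.io/rhdh-community/rhdh:${candidate_tag}" >/dev/null 2>&1; then
      echo "rhdh-community/rhdh ${candidate_tag}"
    else
      echo "rhdh/rhdh-hub-rhel9 ${version}"
    fi
  fi
)
```

In a workflow step, the two printed fields would then be written to `"$GITHUB_OUTPUT"` as `repo=` and `tag=` entries, which is what the PR's step does directly.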

Which issue(s) does this PR fix or relate to

https://github.com/redhat-developer/rhdh-chart/actions/runs/27792983781/job/82246295904#step:6:3711

How to test changes / Special notes to the reviewer

Checklist

  • For each Chart updated, version bumped in the corresponding Chart.yaml according to Semantic Versioning.
  • For each Chart updated, variables are documented in the values.yaml and added to the corresponding README.md. The pre-commit utility can be used to generate the necessary content. Run pre-commit run --all-files to run the hooks and then push any resulting changes. The pre-commit Workflow will enforce this and warn you if needed.
  • JSON Schema template updated and re-generated the raw schema via the pre-commit hook.
  • Tests pass using the Chart Testing tool and the ct lint command.
  • If you updated the orchestrator-infra chart, make sure the versions of the Knative CRDs are aligned with the versions of the CRDs installed by the OpenShift Serverless operators declared in the values.yaml file. See Installing Knative Eventing and Knative Serving CRDs for more details.

The next-1.y tags on quay.io/rhdh-community/rhdh can expire,
causing the nightly workflow to fail for older release branches.

Use skopeo to check if the tag exists at runtime. If it has expired,
fall back to quay.io/rhdh/rhdh-hub-rhel9:1.y instead.
@rm3l rm3l marked this pull request as ready for review June 19, 2026 11:44
@rm3l rm3l requested a review from a team as a code owner June 19, 2026 11:44
@openshift-ci openshift-ci Bot requested review from gazarenkov and zdrapela June 19, 2026 11:44
@rm3l rm3l changed the title fix(ci): fall back to rhdh-hub-rhel9 when next-1.y tag expires ci(nightly): fall back to rhdh-hub-rhel9 when next-1.y tag expires Jun 19, 2026
@rhdh-qodo-merge

PR Summary by Qodo

Fix nightly CI by falling back when next-1.y Quay tags expire
🐞 Bug fix ⚙️ Configuration changes 🕐 10-20 Minutes

Description

• Detect missing next-1.y images at runtime during nightly runs via skopeo inspect.
• Fall back to quay.io/rhdh/rhdh-hub-rhel9:1.y when community tags are gone.
• Pass computed image repository+tag into the backstage chart Helm test args.
Diagram

graph TD
  A["Nightly workflow"] --> B["Compute repo/tag"] --> C{"Tag exists?"}
  C -->|"yes"| D["quay.io/rhdh-community/rhdh"] --> F["Helm chart tests"] --> G["Backstage chart"]
  C -->|"no"| E["quay.io/rhdh/rhdh-hub-rhel9"] --> F
High-Level Assessment

The following are alternative approaches to this PR:

1. Pin image by digest (immutable) instead of tag probing
  • ➕ Eliminates flakiness from expiring/mutating tags
  • ➕ Improves supply-chain integrity and reproducibility
  • ➖ Requires a reliable way to discover/refresh the digest per branch
  • ➖ Adds maintenance/automation work to keep digests current for nightlies
2. Mirror `next-1.y` tags to a stable org/repo with retention guarantees
  • ➕ Keeps tag-based workflow simple (no runtime probing)
  • ➕ Centralizes retention policy under your control
  • ➖ Requires registry mirroring/publishing automation
  • ➖ Still tag-based; correctness depends on mirror freshness
3. Maintain an explicit branch→image mapping in repo config
  • ➕ Very explicit and easy to reason about in reviews
  • ➕ No external probing dependencies at runtime
  • ➖ Needs ongoing edits as new branches/tags expire
  • ➖ Easy to get stale; defeats the ‘future-proof’ goal

Recommendation: The current approach (runtime skopeo inspect with a stable fallback repo/tag) is the best trade-off for a nightly workflow: it is self-healing for future tag expirations and avoids introducing ongoing maintenance. Consider digest pinning only if reproducibility/security requirements outweigh the extra automation complexity.
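
For comparison, the digest-pinning alternative (option 1) might look roughly like the sketch below. It is not part of this PR. The helper name `pin_by_digest` is hypothetical, and the sketch assumes the input reference always ends in `:<tag>`.

```shell
# pin_by_digest IMAGE_REF
# Prints "<name>@<digest>" for a tag reference such as quay.io/org/repo:tag.
# NOTE: pin_by_digest is a hypothetical helper; IMAGE_REF is assumed to
# include a trailing ":tag".
pin_by_digest() (
  ref="$1"
  # Ask the registry for the manifest digest behind the tag.
  digest="$(skopeo inspect --no-tags --format '{{.Digest}}' "docker://${ref}")" || exit 1
  name="${ref%:*}"                    # strip the trailing ":tag"
  echo "${name}@${digest}"
)
```

The resulting `name@sha256:...` reference is immutable, but something still has to refresh it for each nightly run, which is the maintenance cost noted in the list above.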

Files changed (1) +13 / -4

.github/workflows/nightly.yaml (+13/-4): Add runtime Quay tag existence check and fallback image selection

• Updates the nightly workflow to compute both image repository and tag. For release branches, it probes the expected 'next-<version>' tag with 'skopeo inspect' and falls back to 'rhdh-hub-rhel9:<version>' when missing, then wires the chosen repo/tag into the backstage chart Helm overrides.

@rm3l rm3l merged commit 0343b3b into redhat-developer:main Jun 19, 2026
7 of 8 checks passed
@rm3l rm3l deleted the fix/nightly-fallback-expired-next-tag branch June 19, 2026 11:45
@rhdh-qodo-merge

Code Review by Qodo

🐞 Bugs (2) 📘 Rule violations (0) 📜 Skill insights (0)


Informational

1. Redundant registry checks 🐞 Bug ➹ Performance
Description
Because the job is matrixed by chart, the skopeo inspect check (which depends only on
matrix.branch) is executed once per chart per release branch, creating unnecessary repeated
registry calls. This increases workflow runtime and external dependency load for no added value.
Code

.github/workflows/nightly.yaml[R72-85]

+      - name: Compute image repo and tag
        if: steps.check.outputs.exists == 'true'
        id: image
        run: |
          if [[ "${{ matrix.branch }}" == "main" ]]; then
+            echo "repo=rhdh-community/rhdh" >> "$GITHUB_OUTPUT"
            echo "tag=next" >> "$GITHUB_OUTPUT"
          else
-            # release-x.y -> next-x.y
-            echo "tag=next-${BRANCH#release-}" >> "$GITHUB_OUTPUT"
+            version="${BRANCH#release-}"
+            candidate_tag="next-${version}"
+            if skopeo inspect --no-tags "docker://quay.io/rhdh-community/rhdh:${candidate_tag}" &>/dev/null; then
+              echo "repo=rhdh-community/rhdh" >> "$GITHUB_OUTPUT"
+              echo "tag=${candidate_tag}" >> "$GITHUB_OUTPUT"
+            else
Relevance

⭐⭐ Medium

Some precedent to reduce unnecessary API calls (accepted in PR #306), but no clear history about
optimizing matrixed registry checks.

PR-#306
PR-#437

ⓘ Recommendations generated based on similar findings in past PRs

Evidence
The workflow defines a two-dimensional matrix (branch x chart) and runs the registry inspection
inside the matrixed job steps, so it repeats for each chart even though it’s branch-only logic.

.github/workflows/nightly.yaml[38-52]
.github/workflows/nightly.yaml[72-90]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
The `skopeo inspect` tag-existence check is run for every `matrix.chart` job even though the result only depends on `matrix.branch`.

### Issue Context
The matrix fans out across all charts (`fromJson(needs.discover-charts.outputs.charts)`), so the same registry check repeats N times per branch.

### Fix Focus Areas
- .github/workflows/nightly.yaml[38-52]
- .github/workflows/nightly.yaml[72-90]

### Suggested fix approach
Compute the image repo/tag once per branch (e.g., a separate job with a `branch`-only matrix that outputs `repo`/`tag`), then have `test-chart` jobs consume those outputs via `needs`. Alternatively, gate the `skopeo inspect` step so it runs once and is reused (e.g., using a per-branch pre-step job + outputs).
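
One way to read this suggestion is a single non-matrixed job that resolves every branch once and publishes the result as a JSON map. The sketch below is only an illustration under that assumption: `tag_exists` and `build_image_map` are hypothetical helper names, and branch names are assumed to contain no characters that need JSON escaping.

```shell
# tag_exists IMAGE_REF: succeeds when skopeo can read the tag's manifest.
# NOTE: both helpers below are hypothetical names, not part of the workflow.
tag_exists() { skopeo inspect --no-tags "docker://$1" >/dev/null 2>&1; }

# build_image_map BRANCH...
# Prints a compact JSON object mapping each branch to "<repo>:<tag>",
# probing the registry at most once per branch.
build_image_map() (
  sep=""
  printf '{'
  for branch in "$@"; do
    version="${branch#release-}"
    if [ "$branch" = "main" ]; then
      image="rhdh-community/rhdh:next"
    elif tag_exists "quay.io/rhdh-community/rhdh:next-${version}"; then
      image="rhdh-community/rhdh:next-${version}"
    else
      image="rhdh/rhdh-hub-rhel9:${version}"
    fi
    printf '%s"%s":"%s"' "$sep" "$branch" "$image"
    sep=","
  done
  printf '}\n'
)
```

The job could write the map with something like `echo "images=$(build_image_map main release-1.8)" >> "$GITHUB_OUTPUT"`, and each matrixed chart job could then look up its entry with `fromJson(...)` indexed by `matrix.branch`.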


2. Fallback masks skopeo failures 🐞 Bug ≡ Correctness
Description
Compute image repo and tag treats any non-zero exit from skopeo inspect as “tag not found” and
falls back to rhdh/rhdh-hub-rhel9, including cases like skopeo not being installed (exit 127) or
transient registry/network failures. This can make nightly runs test the wrong image without
failing, hiding real CI/infrastructure issues and reducing confidence in release-branch validation.
Code

.github/workflows/nightly.yaml[R80-89]

+            version="${BRANCH#release-}"
+            candidate_tag="next-${version}"
+            if skopeo inspect --no-tags "docker://quay.io/rhdh-community/rhdh:${candidate_tag}" &>/dev/null; then
+              echo "repo=rhdh-community/rhdh" >> "$GITHUB_OUTPUT"
+              echo "tag=${candidate_tag}" >> "$GITHUB_OUTPUT"
+            else
+              echo "::warning::Tag ${candidate_tag} not found on quay.io/rhdh-community/rhdh, falling back to quay.io/rhdh/rhdh-hub-rhel9:${version}"
+              echo "repo=rhdh/rhdh-hub-rhel9" >> "$GITHUB_OUTPUT"
+              echo "tag=${version}" >> "$GITHUB_OUTPUT"
+            fi
Relevance

⭐ Low

Repo tends to prefer graceful fallbacks/avoid CI failures (e.g., accepted try/catch in PR #306);
this PR #443 adds fallback-on-failure.

PR-#443
PR-#306

ⓘ Recommendations generated based on similar findings in past PRs

Evidence
The new step runs skopeo inspect and unconditionally falls back on failure, and there is no setup
in the workflow/action ensuring skopeo is installed before this step runs.

.github/workflows/nightly.yaml[72-90]
.github/actions/test-charts/action.yml[25-44]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
The workflow falls back to `rhdh/rhdh-hub-rhel9` on *any* `skopeo inspect` failure, which includes missing `skopeo` and transient/network/auth errors. This can silently change what image is tested.

### Issue Context
The step is intended to fall back only when the `next-<version>` tag truly does not exist.

### Fix Focus Areas
- .github/workflows/nightly.yaml[72-93]

### Suggested fix approach
1. Ensure `skopeo` is available (either install it explicitly in the workflow before use, or fail fast if it’s missing).
2. Only fall back when the error indicates the tag is missing; otherwise fail (or retry) to avoid silently switching images.

Example (bash logic sketch):
- `command -v skopeo >/dev/null || { echo '::error::skopeo not installed'; exit 1; }`
- Capture stderr from `skopeo inspect ...` and if it matches the “manifest unknown/tag not found” case, then fall back; else `exit 1` (or add a small retry loop before failing).
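
A fuller version of that logic sketch might look like the following. It is a hedged illustration, not the merged code: `probe_tag` and `resolve_image` are hypothetical helper names, and matching on the string "manifest unknown" is an assumption about the error text, which should be checked against what `skopeo` actually prints for the registry in use.

```shell
# probe_tag IMAGE_REF
# Exit status: 0 = tag exists, 1 = registry reports the tag missing,
#              2 = any other failure (skopeo missing, network, auth).
# NOTE: probe_tag and resolve_image are hypothetical helper names.
probe_tag() (
  ref="$1"
  command -v skopeo >/dev/null 2>&1 || {
    echo "::error::skopeo not installed" >&2
    exit 2
  }
  # Capture stderr only; stdout (the manifest JSON) is discarded.
  if err="$(skopeo inspect --no-tags "docker://${ref}" 2>&1 >/dev/null)"; then
    exit 0
  fi
  # ASSUMPTION: a missing tag surfaces as "manifest unknown" in stderr.
  case "$err" in
    *"manifest unknown"*) exit 1 ;;
    *) echo "::error::skopeo inspect failed: ${err}" >&2; exit 2 ;;
  esac
)

# resolve_image VERSION
# Prints "<repo> <tag>", falling back only when the tag is truly missing.
resolve_image() (
  version="$1"
  candidate_tag="next-${version}"
  rc=0
  probe_tag "quay.io/rhdh-community/rhdh:${candidate_tag}" || rc=$?
  case "$rc" in
    0) echo "rhdh-community/rhdh ${candidate_tag}" ;;
    1) echo "rhdh/rhdh-hub-rhel9 ${version}" ;;
    *) exit 1 ;;                      # fail the step instead of switching images
  esac
)
```

Separating the probe from the decision keeps the "missing tag" case distinct from infrastructure failures, so a network error fails the step rather than silently testing the fallback image.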


@rhdh-qodo-merge rhdh-qodo-merge Bot added the enhancement and Bug fix labels Jun 19, 2026