(Note: I started filling in the new bug template but did not go all the way through. I think it is overly complex and may not be appropriate for filing a test failure issue like this.)
I tried running the bats test suite for the current HEAD of main. This test fails:
test_gpu_updowngrade.bats
✗ GPUs: upgrade: wipe-state, install-last-stable, upgrade-to-current-dev [5746]
tags: fastfeedback
(from function `iupgrade_wait' in file tests/bats/helpers.sh, line 61,
in test file tests/bats/test_gpu_updowngrade.bats, line 39)
`iupgrade_wait "${TEST_CHART_LASTSTABLE_REPO}" "${TEST_CHART_LASTSTABLE_VERSION}" NOARGS' failed
[...]
2026-04-27T15:13:37.429Z [ 4.7s] iupgrade_wait: start
Release "dra-driver-nvidia-gpu-batssuite" does not exist. Installing it now.
Error: failed to perform "FetchReference" on source: GET "https://gcr.io/v2/k8s-staging-nvidia/charts/dra-driver-nvidia-gpu/manifests/25.12.0-0882da87-chart": GET "https://gcr.io/v2/token?scope=repository%3Ak8s-staging-nvidia%2Fcharts%2Fdra-driver-nvidia-gpu%3Apull&scope=repository%3Ak8s-staging-nvidia%2Fgcr.io%2Fcharts%2Fdra-driver-nvidia-gpu%3Apull&service=gcr.io": response status code 403: denied: Unauthenticated request. Unauthenticated requests do not have permission "artifactregistry.repositories.downloadArtifacts" on resource "projects/k8s-staging-nvidia/locations/us/repositories/gcr.io" (or it may not exist)
TEST_CHART_LASTSTABLE_REPO currently points to a chart reference that unauthenticated clients cannot pull (see the 403 above; the repository may not exist at all). I think this sneaked in via #967.
I left a comment here:
ac433f1#r183590630
For the time being, this still works:
TEST_CHART_LASTSTABLE_REPO ?= "oci://ghcr.io/nvidia/k8s-dra-driver-gpu"
TEST_CHART_LASTSTABLE_VERSION ?= "25.12.0-0882da87-chart"
IMO this is the right reference to use for now for the last stable release. (We could also refer to NGC, but that requires helm repo add ... in various places.)
After switching to this reference, I saw that the install helper breaks when installing the last stable release, because the new label-based filtering doesn't work for it (the label key was renamed).
Should we maybe revert renaming this label key? If we want to stick with the rename:
- We need to review the documentation to see if commands need adjusting, and potentially document different commands for different versions of the driver (example).
- We need to adjust the test method to dynamically switch the label filter based on the installed version.
For a grace period, we could also apply both the old label key/value pair and the new one (that may be the ideal choice).
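To make the "switch the label filter based on the version installed" option more concrete, here is a minimal sketch of what such a helper could look like. The label keys, the cutoff version, and the function name `component_label_selector` are all hypothetical placeholders: this issue does not state the actual old/new keys or the first version that carries the renamed key. It also assumes a `sort` that supports `-V` (GNU coreutils).

```shell
#!/usr/bin/env bash
# Hypothetical placeholders: the real label keys and the first chart
# version that uses the renamed key are not stated in this issue.
OLD_LABEL_KEY="example.com/old-component-key"
NEW_LABEL_KEY="example.com/new-component-key"
FIRST_VERSION_WITH_NEW_KEY="26.0.0"

# Print a label selector ("key=value") suitable for the given chart version.
#   $1: chart version, e.g. "25.12.0-0882da87-chart"
#   $2: component label value (hypothetical, e.g. "kubelet-plugin")
component_label_selector() {
  local chart_version="$1" component="$2"
  # Strip the build suffix and an optional leading "v":
  # "25.12.0-0882da87-chart" -> "25.12.0"
  local base="${chart_version%%-*}"
  base="${base#v}"
  # Version-sort the two versions; the first line is the lower one.
  local lowest
  lowest="$(printf '%s\n%s\n' "$base" "$FIRST_VERSION_WITH_NEW_KEY" | sort -V | head -n1)"
  if [ "$lowest" = "$FIRST_VERSION_WITH_NEW_KEY" ]; then
    # base >= cutoff: the chart is assumed to carry the renamed key.
    printf '%s=%s\n' "$NEW_LABEL_KEY" "$component"
  else
    # base < cutoff: fall back to the old key.
    printf '%s=%s\n' "$OLD_LABEL_KEY" "$component"
  fi
}
```

A test helper could then call something like `kubectl get pods -l "$(component_label_selector "${TEST_CHART_LASTSTABLE_VERSION}" kubelet-plugin)"`. If both label pairs are applied during a grace period, this switch would not be needed at all, which is one argument for that option.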
Interesting insight: these component labels should maybe be considered part of the 'public interface', and hence changing them may (in the future) be considered a breaking change (especially relevant if we were to use semantic versioning).
Steps to Reproduce
$ git rev-parse HEAD
7f0d03d1116a4293e090ae510f634ce1434038e4
$ make image-build-and-copy-to-nodes
[...]
$ TEST_CHART_LOCAL=1 make bats
[...]
bats warning: Executed 25 instead of expected 43 tests
43 tests, 1 failure, 18 not run in 167 seconds