fix: use feature-gates= (set) instead of += (append) in k3s v1.32 config #326

Merged
SebTardif merged 1 commit into main from fix/e2e-k8s132-feature-gate-race
Jun 13, 2026


@SebTardif


Problem

The E2E nightly run on 2026-06-10 (run) failed on K8s v1.32 because the pods/resize subresource never registered despite the k3s config file correctly containing feature-gates+=InPlacePodVerticalScaling=true.

The += (append) syntax in the k3s config file has a timing race in k3s's GetArgs() function (pkg/util/args.go): under CI resource contention, += tries to append to the existing feature gates map before k3s has populated its defaults, causing the feature gate to silently fail to register. This is the same class of bug as #296. PR #297 moved from CLI args to config file and PR #300 switched to += to preserve k3s defaults, but the race persists because config file args go through the same GetArgs() code path.

Fix

Switch from feature-gates+= (append) to feature-gates= (set) in the k3s config file. The = syntax sets the feature gate directly without depending on any existing state being populated first, bypassing the race entirely.

This is safe for K8s 1.32: k3s's default apiserver feature gates at this version are all GA (locked on/off, not present in --feature-gates), so there are no alpha/beta defaults to preserve.
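A minimal sketch of what the set form could look like when the CI job writes the k3s config file. The `kube-apiserver-arg` key and the `K3S_CONFIG` variable are assumptions for illustration; the actual workflow step is not shown in this PR description. The script writes to a scratch file unless `K3S_CONFIG` is set.

```shell
#!/bin/sh
set -eu

# Hypothetical variable: point K3S_CONFIG at the real config file
# (commonly /etc/rancher/k3s/config.yaml); defaults to a scratch file.
K3S_CONFIG="${K3S_CONFIG:-$(mktemp)}"

# Assumption: the gate is passed to the embedded apiserver via
# kube-apiserver-arg. Note the plain "=" (set), not "+=" (append).
cat > "$K3S_CONFIG" <<'EOF'
kube-apiserver-arg:
  - "feature-gates=InPlacePodVerticalScaling=true"
EOF

# Guard against the append form creeping back into the file.
if grep -q 'feature-gates+=' "$K3S_CONFIG"; then
  echo "append syntax found in $K3S_CONFIG" >&2
  exit 1
fi

echo "wrote $K3S_CONFIG"
```

The guard at the end is optional; it only makes a regression to `+=` fail fast instead of surfacing later as a missing pods/resize subresource.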

Also improves the verify step's diagnostic output by replacing the useless ps aux | grep kube-apiserver (k3s embeds the apiserver in its own process) with targeted diagnostics: config file dump, API resources check, and filtered k3s logs.
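The three diagnostics named above could be sketched roughly as follows. This is not the workflow's actual verify step: the `diagnose` function, the default config path, the use of `kubectl get --raw /api/v1` to look for the subresource, and reading logs through `journalctl -u k3s` (which assumes k3s runs as a systemd unit) are all assumptions.

```shell
#!/bin/sh

# Hypothetical helper: prints the config file, checks whether pods/resize
# is registered, and shows k3s log lines that mention feature gates.
diagnose() {
  config="${1:-/etc/rancher/k3s/config.yaml}"

  echo "== k3s config ($config) =="
  if [ -r "$config" ]; then
    cat "$config"
  else
    echo "(config not readable)"
  fi

  echo "== pods/resize registration =="
  if command -v kubectl >/dev/null 2>&1; then
    # /api/v1 lists core resources, including subresources such as pods/resize.
    kubectl get --raw /api/v1 --request-timeout=5s 2>/dev/null \
      | grep -o '"name":"pods/resize"' \
      || echo "(pods/resize not registered or API unreachable)"
  else
    echo "(kubectl not found)"
  fi

  echo "== k3s logs (feature gate lines) =="
  if command -v journalctl >/dev/null 2>&1; then
    # Assumption: k3s runs as the systemd unit "k3s".
    journalctl -u k3s --no-pager -n 200 2>/dev/null \
      | grep -i 'feature' \
      || echo "(no matching log lines)"
  else
    echo "(journalctl not found)"
  fi
}

diagnose "$@"
```

Each section degrades to a short placeholder when a tool or file is missing, so the step keeps printing the remaining diagnostics instead of aborting on the first failure.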

Closes #325

The += syntax in the k3s config file has a timing race in k3s's
GetArgs() function: under CI resource contention, += tries to append
to the existing feature gates map before it is populated, causing
InPlacePodVerticalScaling=true to silently fail to register. The
pods/resize subresource never appears and the E2E run fails.

Switch to = (set) which sets the feature gate directly without
depending on any existing state. This is safe for K8s 1.32 because
k3s's default apiserver feature gates at this version are all GA
(locked on/off, not present in --feature-gates).

Also improve the verify step's diagnostic output: replace the
useless ps aux | grep kube-apiserver (k3s embeds the apiserver)
with targeted diagnostics (config file dump, API resources check,
filtered k3s logs).

Closes #325

Signed-off-by: Sebastien Tardif <sebtardif@ncf.ca>
github-actions bot added the size/s (10-49 lines changed) and area/ci (CI/CD workflows) labels on Jun 13, 2026
SebTardif merged commit d8bb805 into main on Jun 13, 2026
33 checks passed
SebTardif deleted the fix/e2e-k8s132-feature-gate-race branch on June 13, 2026 11:49

Development

Successfully merging this pull request may close these issues.

Nightly failed on 2026-06-10
