chore(testing): decide Kubernetes version support policy for CI #13457

@jeffspahr

Chore description

We should make an explicit maintainer decision on how kubeflow/pipelines chooses Kubernetes versions for CI and implied support.

Today the repo mostly follows a low/high spot-check policy in CI, but that policy is not documented clearly and recent version-rotation work showed that the decision criteria are ambiguous:

  • the upstream Kubernetes support window has moved forward (supported minors are now 1.36, 1.35, and 1.34)
  • major cloud providers still extend support for older minors
  • Kind's latest release covers neither end of that range

This issue is meant to capture the policy decision for future work. It is not proposing that we broaden the CI matrix right now.

Related prior threads:

Current pipelines state

kubeflow/pipelines currently tests a low/high pair rather than a contiguous rolling range. The workflows still have many references to v1.29.2 and v1.34.0 in files such as:

  • .github/workflows/api-server-tests.yml
  • .github/workflows/e2e-test.yml
  • .github/workflows/e2e-test-frontend.yml
  • .github/workflows/integration-tests-v1.yml
  • .github/workflows/kfp-kubernetes-native-migration-tests.yaml
  • .github/workflows/kfp-sdk-client-tests.yml
  • .github/workflows/kfp-webhooks.yml
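
To see how many pins a version rotation would touch, the workflow files can be scanned for hard-coded versions. The following is a minimal sketch, not an existing repo tool: the function names `pinned_versions` and `scan_workflows` are hypothetical, and it assumes versions appear as literal `v1.x.y` strings in the workflow YAML.

```python
import re
from collections import Counter
from pathlib import Path

# Matches pinned Kubernetes patch versions such as v1.29.2 or v1.34.0.
# The lookbehind skips action refs like helm/kind-action@v1.13.0, which
# have the same shape but are not Kubernetes versions.
K8S_VERSION = re.compile(r"(?<![@\w.])v1\.\d+\.\d+\b")


def pinned_versions(text: str) -> Counter:
    """Count each pinned v1.x.y string found in one workflow file's text."""
    return Counter(m.group(0) for m in K8S_VERSION.finditer(text))


def scan_workflows(root: str = ".github/workflows") -> dict:
    """Map workflow file name -> Counter of pinned versions (hypothetical helper)."""
    results = {}
    for path in sorted(Path(root).glob("*.y*ml")):  # covers .yml and .yaml
        found = pinned_versions(path.read_text(encoding="utf-8"))
        if found:
            results[path.name] = found
    return results


if __name__ == "__main__":
    for name, counts in scan_workflows().items():
        print(name, dict(counts))
```

Run from the repository root, this prints one line per workflow that pins a version. The regex is a heuristic and may still pick up unrelated `v1.x.y` strings, so the output is a starting point for review rather than an authoritative inventory.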

There is also an internal CI note in AGENTS.md, but it is stale and not a public support statement.

How other Kubeflow repos are operating

Other Kubeflow component repos are not following the exact same pattern:

  • kubeflow/katib currently runs e2e CI on a rolling multi-version range: v1.31.3, v1.32.2, v1.33.1, v1.34.0.
  • kubeflow/trainer currently runs CPU e2e CI on 1.32.3, 1.33.1, 1.34.0, 1.35.0.
  • kubeflow/manifests publishes a platform-level validated floor per release rather than a low/high CI spot-check policy. The Kubeflow AI reference platform 26.03 release documents Kubernetes 1.34+ as validated.

That does not mean pipelines should adopt a broader matrix right now, but it does mean the current pipelines low/high policy is a repo-specific choice and should be explicit.

External support context as of 2026-05-31

Upstream Kubernetes

Upstream supported minors are currently 1.36, 1.35, and 1.34.

  • 1.36.1 latest release
  • 1.35.5 latest release
  • 1.34.8 latest release
  • 1.33 reaches EOL on 2026-06-28

Source:

Major cloud providers

The oldest version still covered by major cloud-provider extended support (or the closest equivalent) is effectively 1.30 right now:

So if we want the low end to reflect the oldest version still covered by major managed Kubernetes providers, the answer is currently 1.30, not 1.31+.

Kind limitation as of 2026-05-31

Our CI cluster creation path still uses helm/kind-action@v1.13.0, which maps to Kind v0.31.0.

The latest Kind release (v0.31.0) currently ships prebuilt node images for:

  • v1.35.0
  • v1.34.3
  • v1.33.7
  • v1.32.11
  • v1.31.14

Source:

This creates two important constraints:

  1. Latest Kind does not yet cover the latest upstream Kubernetes release.
    • Upstream latest is 1.36.x, but latest Kind prebuilt images stop at 1.35.0.
  2. Latest Kind also does not cover the low end of major cloud-provider extended support.
    • Cloud-provider low end is effectively 1.30, but latest Kind prebuilt images start at 1.31.14.

So even if we know what the policy should be, the current Kind release prevents the repo from validating both ends with release-matched prebuilt images.
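
The gap can be computed directly from the numbers in this issue. This is a rough sketch using only the figures quoted above (as of 2026-05-31); the helper names `minor` and `kind_gaps` are made up for illustration.

```python
def minor(version: str) -> int:
    """Return the minor number from strings like 'v1.34.3', '1.36', or '1.30'."""
    return int(version.lstrip("v").split(".")[1])


# Figures copied from this issue, not fetched from any API.
UPSTREAM_SUPPORTED = ["1.36", "1.35", "1.34"]
CLOUD_LOW_END = "1.30"
KIND_IMAGES = ["v1.35.0", "v1.34.3", "v1.33.7", "v1.32.11", "v1.31.14"]


def kind_gaps(target_low: str, target_high: str, kind_images: list) -> list:
    """List minors in [target_low, target_high] with no Kind prebuilt image."""
    covered = {minor(v) for v in kind_images}
    wanted = range(minor(target_low), minor(target_high) + 1)
    return [f"1.{m}" for m in wanted if m not in covered]


if __name__ == "__main__":
    high = max(UPSTREAM_SUPPORTED, key=minor)
    print(kind_gaps(CLOUD_LOW_END, high, KIND_IMAGES))
```

With the values above, the uncovered minors are 1.30 and 1.36, which are exactly the two ends described in the constraints.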

Questions for maintainers

  1. Should kubeflow/pipelines continue the current low/high spot-check CI policy, or should it intentionally move to a broader rolling range later?
  2. What should define the low end for pipelines?
    • Upstream Kubernetes support only?
    • Oldest version still covered by major cloud-provider extended support?
    • Kubeflow manifests validated floor?
    • Something else?
  3. What should define the high end?
    • Latest upstream supported minor?
    • Latest Kubernetes version with a release-matched Kind prebuilt image?
    • Latest version we can validate another way in CI?
  4. When Kind lags both the upstream latest and the cloud-provider low end, what should the repo do?
    • Stay within latest Kind prebuilt images
    • Bump Kind more aggressively when available
    • Build custom node images or use non-prebuilt images
    • Keep support policy separate from CI validation policy and document the difference explicitly

Suggested outcome

The main goal here is to make future version-rotation PRs mechanical instead of policy debates.

A useful resolution would be a short documented rule in the repo that says:

  • whether pipelines intentionally uses low/high spot checks or a broader rolling range
  • what source of truth determines the low and high Kubernetes versions
  • how Kind release limitations should be handled when they conflict with that policy
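
To illustrate what "mechanical" could look like, here is one possible encoding of such a rule. It is a sketch under assumptions, not a proposal for a specific policy: the `LowSource`, `HighSource`, `Inputs`, and `resolve` names are hypothetical, the option set mirrors the questions above, and `clamp_to_kind` stands in for the "stay within latest Kind prebuilt images" option.

```python
from dataclasses import dataclass
from enum import Enum


class LowSource(Enum):
    UPSTREAM = "upstream"                # oldest upstream-supported minor
    CLOUD_EXTENDED = "cloud_extended"    # oldest minor under cloud extended support
    MANIFESTS_FLOOR = "manifests_floor"  # kubeflow/manifests validated floor


class HighSource(Enum):
    UPSTREAM = "upstream"  # newest upstream-supported minor
    KIND = "kind"          # newest minor with a Kind prebuilt image


@dataclass(frozen=True)
class Inputs:
    upstream_minors: tuple
    cloud_low_minor: int
    manifests_floor_minor: int
    kind_minors: tuple


def resolve(low: LowSource, high: HighSource, clamp_to_kind: bool, inputs: Inputs):
    """Return the (low, high) Kubernetes minors implied by the chosen policy."""
    if low is LowSource.UPSTREAM:
        low_minor = min(inputs.upstream_minors)
    elif low is LowSource.CLOUD_EXTENDED:
        low_minor = inputs.cloud_low_minor
    else:
        low_minor = inputs.manifests_floor_minor

    if high is HighSource.UPSTREAM:
        high_minor = max(inputs.upstream_minors)
    else:
        high_minor = max(inputs.kind_minors)

    if clamp_to_kind:
        # Restrict both ends to what Kind can actually run from prebuilt images.
        low_minor = max(low_minor, min(inputs.kind_minors))
        high_minor = min(high_minor, max(inputs.kind_minors))
    return low_minor, high_minor


# Values quoted in this issue as of 2026-05-31.
CURRENT = Inputs(
    upstream_minors=(36, 35, 34),
    cloud_low_minor=30,
    manifests_floor_minor=34,
    kind_minors=(35, 34, 33, 32, 31),
)
```

For example, `resolve(LowSource.CLOUD_EXTENDED, HighSource.UPSTREAM, True, CURRENT)` yields `(31, 35)`, while the same call with `clamp_to_kind=False` yields `(30, 36)`. The difference between those two results is the gap between support policy and CI validation policy that the documented rule would need to address.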
