feat(policies): verify-images Kyverno policy for factory-built images #47

Open
stxkxs wants to merge 1 commit into main from feat/verify-images-policy

@stxkxs stxkxs commented Jun 22, 2026

Why

The factory signs every image it ships — the operator and all four tenant release workflows run cosign sign (keyless OIDC) + cosign attest. But there was zero verifyImages admission policy anywhere, so a hand-pushed or tampered image was admitted exactly like a signed, gate-approved one. Kyverno is already installed with per-env overlays; this fills the missing half (sign → verify).

What

New policies/kyverno/supply-chain/ group (pure Kustomize, mirroring best-practices / pod-security-standards):

  • verify-images ClusterPolicy scoped to ghcr.io/nanohype/* — foreign images (cert-manager, kyverno, …) aren't matched and pass untouched, so blast radius is the factory's own images.
  • Keyless Cosign attestor matching the release-workflow identity: issuer https://token.actions.githubusercontent.com, subject github.com/nanohype/<repo>/.github/workflows/release.ya?ml@refs/tags/* (operator → release.yaml, tenants → release.yml), verified against public Rekor.
  • required: true, mutateDigest/verifyDigest: false — pure signature verification for the rollout. webhookTimeoutSeconds: 30.
  • ApplicationSet entry kyverno-supply-chain at sync-wave 22.

Rollout (Audit first — deliberate)

Ships in Audit across all three envs: unsigned ghcr.io/nanohype/* images are reported in PolicyReports without blocking. This is the cluster's first image-verification policy; straight-to-Enforce would break admission for any pre-signing or mutable-tag image still running. The overlays make the flip a one-line Audit → Enforce change: staging first after a clean audit week, production after staging proves clean.
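
As a rough illustration of that one-line flip, an environment overlay could patch the base policy like this. This is a sketch only: the `../../base` layout and the JSON-patch style are assumptions, not taken from the PR; only the policy name `verify-images` and the `validationFailureAction` field come from the description.

```yaml
# Hypothetical overlays/staging/kustomization.yaml (directory layout assumed)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base            # assumed location of the base verify-images policy
patches:
  - target:
      group: kyverno.io
      version: v1
      kind: ClusterPolicy
      name: verify-images
    patch: |-
      # The rollout flip is this single value: Audit today, Enforce later.
      - op: replace
        path: /spec/validationFailureAction
        value: Audit
```

With a layout like this, promoting staging would mean changing `value: Audit` to `value: Enforce` in the staging overlay and nothing else.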

Verification

task validate (yamllint + kustomize build all environments) — green. Each overlay resolves validationFailureAction: Audit.

A sibling PR mirrors this to aks-gitops (same registry-scoped policy applies on AKS clusters).

The factory signs every image it ships — the operator and all four tenant
release workflows run `cosign sign` (keyless OIDC) and `cosign attest` on
each pushed image. Nothing on the cluster side ever checked those signatures,
so a hand-pushed or tampered image was admitted identically to a signed,
gate-approved one. This adds the admission-time check that closes that gap.

─────────────────────────── What changed ───────────────────────────

policies/kyverno/supply-chain/ (new policy group, pure Kustomize like the
others) — a `verify-images` ClusterPolicy with a base + dev/staging/production
overlays:
  - Scopes to imageReferences `ghcr.io/nanohype/*`. Images outside the org
    registry (cert-manager, kyverno, etc.) are not matched and pass untouched,
    so the blast radius is the factory's own images only.
  - Keyless Cosign attestor matching the release-workflow identity: issuer
    `https://token.actions.githubusercontent.com`, subject regex
    `github.com/nanohype/<repo>/.github/workflows/release.ya?ml@refs/tags/*`
    (operator signs from release.yaml, tenants from release.yml; both on tag
    pushes), verified against public Rekor.
  - required: true, mutateDigest/verifyDigest: false — pure signature
    verification for the rollout, no tag-to-digest rewriting yet.
  - webhookTimeoutSeconds: 30 (registry + Rekor lookups can exceed the 10s default).
  - Excludes kube-system/kube-public/kube-node-lease/kyverno from the match.
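
A minimal sketch of a base policy with the settings listed above, assuming the
kyverno.io/v1 ClusterPolicy schema. The rule name, the Pod match, and the use
of `subjectRegExp` are assumptions, not taken from the commit; the subject
pattern is copied verbatim from the list, with `<repo>` left as the placeholder
it is there.

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: verify-images
spec:
  validationFailureAction: Audit        # overlays decide Audit vs Enforce
  webhookTimeoutSeconds: 30
  rules:
    - name: verify-factory-signatures   # hypothetical rule name
      match:
        any:
          - resources:
              kinds:
                - Pod                   # assumed; the commit does not list kinds
      exclude:
        any:
          - resources:
              namespaces:
                - kube-system
                - kube-public
                - kube-node-lease
                - kyverno
      verifyImages:
        - imageReferences:
            - "ghcr.io/nanohype/*"      # foreign images are never matched
          required: true
          mutateDigest: false
          verifyDigest: false
          attestors:
            - entries:
                - keyless:
                    issuer: "https://token.actions.githubusercontent.com"
                    # Pattern as written in the commit message; <repo> is a placeholder.
                    subjectRegExp: "github.com/nanohype/<repo>/.github/workflows/release.ya?ml@refs/tags/*"
                    rekor:
                      url: https://rekor.sigstore.dev   # public Rekor instance
```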

applicationsets/kyverno-policies.yaml — adds the `kyverno-supply-chain` entry
at sync-wave 22 (after best-practices at 21), same matrix generator and
per-environment overlay path as the existing policy groups.
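
For orientation, an ApplicationSet that pairs clusters with policy groups via a
matrix generator might look roughly like the sketch below. Everything except
the `kyverno-supply-chain` name, the sync-wave values, and the
policies/kyverno/supply-chain path is hypothetical: the `environment` cluster
label, the repoURL, the project, and the parameter names are placeholders
chosen for illustration, and the real file may be structured differently.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: kyverno-policies
spec:
  generators:
    - matrix:
        generators:
          # One result per registered cluster; assumes each cluster carries
          # an "environment" label (dev, staging, or production).
          - clusters: {}
          # One result per policy group; supply-chain is the new entry.
          - list:
              elements:
                - policyGroup: kyverno-best-practices
                  policyPath: policies/kyverno/best-practices
                  wave: "21"
                - policyGroup: kyverno-supply-chain
                  policyPath: policies/kyverno/supply-chain
                  wave: "22"
  template:
    metadata:
      name: '{{policyGroup}}-{{name}}'
      annotations:
        argocd.argoproj.io/sync-wave: '{{wave}}'
    spec:
      project: default                                      # placeholder
      source:
        repoURL: https://example.com/placeholder/gitops.git # placeholder
        targetRevision: HEAD
        path: '{{policyPath}}/overlays/{{metadata.labels.environment}}'
      destination:
        server: '{{server}}'
        namespace: kyverno
```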

─────────────────────────── Rollout ───────────────────────────

Ships in Audit mode across all three environments: every unsigned
ghcr.io/nanohype/* image is reported in PolicyReports without blocking
admission. This is deliberate — it is the cluster's first image-verification
policy, and going straight to Enforce before confirming the deployed image set
verifies clean would break admission for any pre-signing or mutable-tag image.
The overlays are structured so the flip to Enforce is a one-line value change
per environment: staging first after a clean audit week, production after
staging proves clean.

Validated with `task validate` (yamllint + kustomize build, all environments).
@github-actions

CI Results

| Check | Status |
|---|---|
| YAML Lint | passed |

| Environment | Kustomize Build |
|---|---|
| dev | passed |
| staging | passed |
| production | passed |

All validations passed.
