diff --git a/README.md b/README.md
index 3bbc422b..9eed5a45 100644
--- a/README.md
+++ b/README.md
@@ -81,19 +81,20 @@ eval-containers run aime --task-id 0 --agent codex --mode job
 ### Kubernetes (`--mode job`)
 
-Every benchmark renders from one shared [Helm](https://helm.sh/) chart (`benchmarks/_chart`) — select it with `--set benchmark=` and apply, no CLI needed. A benchmark with bespoke topology (extra Deployments/sidecars) adds a `presets/.yaml` inside the chart; standard ones need nothing:
+Every benchmark renders from one shared [Helm](https://helm.sh/) chart — select it with `--set benchmark=` and apply. The chart is self-contained: each benchmark's bespoke topology ships inside it as a `presets/.yaml` (standard ones need nothing), so it pulls straight from the registry — **no clone**:
 
 ```bash
-helm template aime benchmarks/_chart --set benchmark=aime \
-  --set agent=claude-code,task=0 | kubectl apply -f -
+# From the published chart (see Pre-release note above) — no clone needed:
+helm template aime oci://quay.io/eval-containers/charts/eval --version 0.1.0 \
+  --set benchmark=aime --set agent=claude-code --set task=0 | kubectl apply -f -
 ```
 
-The CLI does exactly that, mapping every axis to a `--set`:
+Working in a clone, render the local chart instead — which is what the CLI builds today, mapping every axis to a `--set`:
 
 ```bash
 eval-containers run aime --agent codex --task-id 42 --mode job
-# → helm template aime-codex-task-42 benchmarks/_chart --set benchmark=aime \
-# --set registry=…,agent=codex,task=42 | kubectl apply -f -
+# → helm template aime-codex-task-42 ./benchmarks/_chart --set benchmark=aime \
+# --set registry=… --set agent=codex --set task=42 | kubectl apply -f -
 ```
 
 Platform specifics (corp registry, NodeAffinity, NetworkPolicies, a different service account, ...) are a Helm **values file you own**, layered on with `--overlay` (an extra `helm -f`), so the eval axes and your platform settings merge.
 A ready-to-adapt OpenShift overlay (sets the `anyuid` service account) ships as [`deploy/values-openshift.yaml`](deploy/values-openshift.yaml):
diff --git a/deploy/openshift-service-account.yaml b/deploy/openshift-service-account.yaml
index 2e76fa96..ddbe9758 100644
--- a/deploy/openshift-service-account.yaml
+++ b/deploy/openshift-service-account.yaml
@@ -9,7 +9,7 @@
 # Then deploy any benchmark with the OpenShift values layered on:
 #
 # helm template benchmarks/_chart --set benchmark= \
-# -f deploy/values-openshift.yaml --set agent=,task= | oc apply -f -
+# -f deploy/values-openshift.yaml --set agent= --set task= | oc apply -f -
 apiVersion: v1
 kind: ServiceAccount
 metadata:
diff --git a/docs/concepts/the-helm-chart.md b/docs/concepts/the-helm-chart.md
index f823a52b..b9fe5d52 100644
--- a/docs/concepts/the-helm-chart.md
+++ b/docs/concepts/the-helm-chart.md
@@ -45,7 +45,7 @@ Full field list: [Chart values reference](../reference/chart-values.md).
 
 ```bash
 helm template aime benchmarks/_chart --set benchmark=aime \
-  --set agent=claude-code,task=0 | kubectl apply -f -
+  --set agent=claude-code --set task=0 | kubectl apply -f -
 ```
 
 The `eval-containers run … --mode job` command builds exactly this, mapping each
diff --git a/docs/guides/deploy-on-kubernetes.md b/docs/guides/deploy-on-kubernetes.md
index 774b82e0..9d825a95 100644
--- a/docs/guides/deploy-on-kubernetes.md
+++ b/docs/guides/deploy-on-kubernetes.md
@@ -22,7 +22,7 @@ Plain Helm — no CLI required:
 
 ```bash
 helm template aime benchmarks/_chart --set benchmark=aime \
-  --set agent=claude-code,task=0 | kubectl apply -f -
+  --set agent=claude-code --set task=0 | kubectl apply -f -
 ```
 
 Or with the CLI, which builds the same command: