fix(ci): replace OC-cluster e2e gate with local docker compose run#203

Open
elronbandel wants to merge 1 commit into main from fix/e2e-gate-zerostack-gpt54

@elronbandel

Contributor

Problem

The per-PR E2E gate (oc-connectivity.yml) was slow and flaky because it depended on an OpenShift cluster:

  • Slow: the codex node image takes a long time to build on OC.
  • Flaky: the bifrost gateway has a boot-time service-discovery race; when the gateway is unhealthy, every PR goes red regardless of its diff.

As a result, contributors ended up chasing cluster state instead of code quality.

Fix

Replace the OpenShift-based gate entirely with a local docker compose run on the ubuntu-latest GitHub Actions runner — the same pattern nightly-replay.yml already uses successfully.

What the new gate does:

  1. Builds four images via the CLI: bench aime, agent zerostack, model gpt-5.4, eval aime --agent zerostack
  2. Stands up the aime/zerostack/gpt-5.4 compose stack with an output bind-mount override so result files are readable on the host after compose exits
  3. Asserts: task/result.json exists, agent actually ran (started_at ≠ ended_at), and gen_ai spans are present (warn-only)
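
The assertion step can be sketched roughly as follows. This is a minimal illustration, not the workflow's actual code: the `started_at` and `ended_at` fields come from the description above, while the function name `check_result`, the `GateFailure` exception, and the idea that spans are stored one per line in a text file are assumptions made for the example.

```python
import json
from pathlib import Path
from typing import List, Optional


class GateFailure(Exception):
    """Hypothetical error type for hard gate failures."""


def check_result(result_path: Path, spans_path: Optional[Path] = None) -> List[str]:
    """Run the two hard checks; return warnings for the soft (warn-only) check."""
    # Hard check 1: the result file must be readable on the host.
    if not result_path.is_file():
        raise GateFailure(f"missing result file: {result_path}")

    result = json.loads(result_path.read_text())

    # Hard check 2: identical timestamps suggest the agent never ran.
    if result["started_at"] == result["ended_at"]:
        raise GateFailure("agent did not run (started_at == ended_at)")

    # Soft check: missing gen_ai spans only produce a warning.
    # Assumption: spans live in a line-oriented text file.
    warnings: List[str] = []
    has_spans = False
    if spans_path is not None and spans_path.is_file():
        has_spans = any("gen_ai" in line for line in spans_path.read_text().splitlines())
    if not has_spans:
        warnings.append("no gen_ai spans found")
    return warnings
```

Keeping the span check as a returned warning rather than an exception mirrors the warn-only behaviour described in step 3: the gate can print the warning without failing the job.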

The OC_SERVER and OC_TOKEN secrets are no longer needed. Secrets still required: HF_TOKEN (aime dataset build), OPENAI_API_KEY and OPENAI_API_BASE (gpt-5.4 Azure endpoint), and GITHUB_TOKEN (GHCR pull).
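
A fail-fast check for these secrets might look like the sketch below. The secret names are taken from the list above; the helper `missing_secrets` and the assumption that the secrets reach the job as environment variables are illustrative and not part of this PR.

```python
import os
from typing import List, Mapping, Optional

# Secret names as listed in the PR description.
REQUIRED_SECRETS = ("HF_TOKEN", "OPENAI_API_KEY", "OPENAI_API_BASE", "GITHUB_TOKEN")


def missing_secrets(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the required secret names that are unset or empty in env."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SECRETS if not env.get(name)]
```

Calling `missing_secrets()` before the image builds would let the job stop with a clear message instead of failing midway through a build or a model call.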

Rules checked

  • .agents/contributing/RULES.md — code-only change (no rules modified); rules declared here (R-2, R-3).
  • .agents/RULES.md — change is scoped to one concern (the per-PR CI gate).

Closes #200

The per-PR gate was slow (codex node image build) and flaky (bifrost
gateway discovery boot race) because it depended on an OpenShift cluster.
Replace it entirely with a local docker-based run on the ubuntu-latest
runner: build the four images (bench/agent/model/eval) via the CLI, stand
up the aime/zerostack/gpt-5.4 compose stack with an output bind mount, and
assert on task result.json, agent duration, and gen_ai spans. No OC_SERVER
/ OC_TOKEN secrets needed.

Closes #200

elronbandel force-pushed the fix/e2e-gate-zerostack-gpt54 branch from dc13615 to 6343404 on June 21, 2026 08:28

Development

Successfully merging this pull request may close these issues.

bug: e2e — per-PR gate is slow/flaky; switch aime/codex/bifrost → aime/zerostack/gpt-5.4