fix(e2e): reconcile cockpit default surface; deterministic e2e matches the five-card product (3 suites green) #53
Merged
…is: default surface intact, only cockpit:e2e selectors stale) https://claude.ai/code/session_01AgdV9SKZZP6JbyTBo2gAWZ
…rsation+five-card contract

Root cause: scripts/operator-cockpit-e2e.ts was pinned to the superseded inline clarification-option UI (button 'A specific test/command must pass *') and timed out; clarification answering moved into ClarificationPopup and the default surface is conversation + LoopCard (WORKBOOK_v6 GR#11). A second stale expectation was fixed too: the Draft-PR route evaluates the Gemini evidence gate before remote-writes, so the deterministic blocked code is GEMINI_NOT_CONFIGURED, not REMOTE_WRITES_DISABLED.

- Drive the popup with the exact user-e2e selector contract (.ck-clar-popup / .ck-clar-q / recommended .ck-chip / 'Answer all' submit) plus the same deterministic fenced-JSON planner fixtures (confidence 62 -> answers -> Ask Until Clear -> 96 unlocks the plan).
- Keep ALL full-loop assertions: roadmap -> approve -> execute -> root stage, blocked PR-gate card, pr_blocked operatorView stage, exactly 1 mock run, artifacts registered, no PR URL.
- Record the default-surface decision in docs/cockpit-redesign/DEFAULT_SURFACE_DECISION.md (no 3-pane default, no flag, no old-UI restoration).
- Evidence: quality-smoke + user-e2e PASS dirs committed.

Suites: test:cockpit:e2e PASS (was timeout), quality-smoke PASS, user-e2e PASS 7/7.
Gates: typecheck/lint PASS, pnpm test 950 passed (baseline 950, zero regressions).

https://claude.ai/code/session_01AgdV9SKZZP6JbyTBo2gAWZ
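The gate ordering described in the commit message above can be illustrated with a minimal sketch. It is not the repo's route code: the `GateEnv` shape and the `draftPrBlockCode` function are hypothetical names introduced here, and only the two block codes and their evaluation order come from the commit message.

```typescript
// Hypothetical sketch: GateEnv and draftPrBlockCode are illustrative names,
// not the actual Draft-PR route implementation.
type BlockCode = "GEMINI_NOT_CONFIGURED" | "REMOTE_WRITES_DISABLED";

interface GateEnv {
  geminiConfigured: boolean;    // assumed flag for the Gemini evidence gate
  remoteWritesEnabled: boolean; // assumed flag for the remote-writes gate
}

// Gates are checked in order; the first failing gate decides the blocked code.
function draftPrBlockCode(env: GateEnv): BlockCode | null {
  if (!env.geminiConfigured) return "GEMINI_NOT_CONFIGURED";     // evidence gate first
  if (!env.remoteWritesEnabled) return "REMOTE_WRITES_DISABLED"; // remote writes second
  return null; // neither gate blocks
}
```

Under this ordering, an environment with both gates unsatisfied always reports the Gemini code, which is why an assertion on REMOTE_WRITES_DISABLED could never pass in the deterministic setup.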
Reconcile cockpit default surface + unblock deterministic E2E
Decision (recorded in docs/cockpit-redesign/DEFAULT_SURFACE_DECISION.md): conversation-first + five-card LoopCard IS the intended default (WORKBOOK_v6 GR#11). No 3-pane default, no flag, no old-UI restoration.

Root cause of the test:cockpit:e2e timeout (confirmed): the script was pinned to two superseded contracts: (1) old inline ClarificationCard buttons, now living in ClarificationPopup with the ≥95% adaptive gate; (2) stale expected PR-gate code (REMOTE_WRITES_DISABLED, but the route evaluates the Gemini gate first, so the deterministic block is GEMINI_NOT_CONFIGURED, exactly what quality-smoke already asserts).

Fix: e2e now uses the same selector + fixture contract as the passing user-e2e (popup answer flow, 62→96 confidence fixtures); ALL its other full-loop assertions kept (roadmap→approve→execute, blocked gate card, pr_blocked stage, exactly-1 mock run, artifacts, no PR URL).

Suites at HEAD of this branch: test:cockpit:e2e PASS (was timing out) · quality-smoke PASS · user-e2e 7/7 PASS; evidence committed (incl. pre-fix diagnosis evidence proving the surface was intact). Gates: typecheck ✅ lint ✅ test 950 passed / 0 failed.

Mac-side items remain operator-gated (strict real-smoke terminal condition): claude login (or AEDEV_PLANNER_FALLBACK=codex), then AEDEV_COCKPIT_REAL_SMOKE_REQUIRE_P1=1 AEDEV_COCKPIT_REAL_SMOKE_REQUIRE_GEMINI=1 pnpm test:cockpit:real-smoke.

https://claude.ai/code/session_01AgdV9SKZZP6JbyTBo2gAWZ
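The ≥95% adaptive gate and the 62→96 confidence fixtures mentioned in the description can be pictured with a small sketch. The constant and function names below are hypothetical and do not come from the repo; only the threshold and the two fixture values are taken from the PR text.

```typescript
// Hypothetical sketch of the clarification gate; names are illustrative only.
const CLEAR_THRESHOLD = 95; // from the "≥95% adaptive gate" in the description

// The plan unlocks once planner confidence reaches the threshold.
function planUnlocked(confidence: number): boolean {
  return confidence >= CLEAR_THRESHOLD;
}

// Given successive planner confidences (one per planner response), return the
// zero-based index of the first response that unlocks the plan, or null if
// none of them does.
function firstClearIndex(confidences: number[]): number | null {
  for (let i = 0; i < confidences.length; i++) {
    if (planUnlocked(confidences[i])) return i;
  }
  return null;
}

// Fixture sequence from the PR: first response at 62, then 96 after answers.
const fixtureConfidences = [62, 96];
```

With these fixtures the first response stays below the gate, so the test has to answer the popup questions once before the second response unlocks the plan.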
Generated by Claude Code