diff --git a/docs/SESSION_LOG_v3.md b/docs/SESSION_LOG_v3.md
index 18b33a3..7e5b246 100644
--- a/docs/SESSION_LOG_v3.md
+++ b/docs/SESSION_LOG_v3.md
@@ -1,5 +1,12 @@
 # SESSION LOG v3
 
+## s_v6_0006 · 2026-06-11 · [reconcile-e2e] deterministic cockpit E2E reconciled to the current conversation+five-card contract
+
+- Root cause: `scripts/operator-cockpit-e2e.ts` was pinned to the superseded inline clarification-option contract (`getByRole('button', { name: 'A specific test/command must pass ★' })`) and timed out — clarification answering moved into `ClarificationPopup` and the primary surface is conversation + LoopCard (WORKBOOK_v6 GR#11). Second stale expectation found and fixed: the Draft-PR gate now evaluates the Gemini evidence gate before remote-writes, so the deterministic blocked code is `GEMINI_NOT_CONFIGURED` (as quality-smoke already asserts), not `REMOTE_WRITES_DISABLED`.
+- Fix (no product change, no flags, no old-UI restoration): the e2e now drives the popup with the exact selector contract of `operator-cockpit-user-e2e.ts` (`.ck-clar-popup` / `.ck-clar-q` / recommended `.ck-chip` / "Answer all" submit), uses the same deterministic fenced-JSON planner fixtures (brainstorm confidence 62 → answers → Ask Until Clear follow-up confidence 96 unlocks the plan), then keeps ALL its full-loop assertions: roadmap → approve → execute → root stage → PR gate card blocked + `pr_blocked` operatorView stage + exactly 1 mock run + artifacts + no PR URL.
+- Decision recorded in `docs/cockpit-redesign/DEFAULT_SURFACE_DECISION.md`: conversation+five-card is the default; deterministic E2E must track the shipped product.
+- Suites: `test:cockpit:e2e` PASS (was timing out), `test:cockpit:quality-smoke` PASS (evidence `evidence/browser-cockpit-quality/2026-06-11T14-18-20-197Z/`), `test:cockpit:user-e2e` PASS 7/7 (evidence `evidence/browser-cockpit-user-e2e/2026-06-11T14-18-36-321Z/`). Gates: typecheck/lint/test — 950 baseline, zero regressions.
+
 ## s_v6_0005 · 2026-06-11 · Overnight harness loop — P1/P3/P4/P5/P6 done · P2 honest HOLD
 
 - P1: HOLD-PLANNER-AUTH detection + opt-in AEDEV_PLANNER_FALLBACK=codex (events record codex-cli (fallback), never impersonation) (+18). P3: operator-vocabulary cards + agent strip + on-card actions + PR-gate transparency, user-E2E 7/7 (+17). P4: merge-policy pure function, 864-combination sweep proves GR#10 (auto-merge off) (+14). P5: run-summary.md audit artifact on all four mission exits, absent-means-absent (+12). P6: full uninterrupted 30-min soak 5/5 PASS.
diff --git a/docs/cockpit-redesign/DEFAULT_SURFACE_DECISION.md b/docs/cockpit-redesign/DEFAULT_SURFACE_DECISION.md
new file mode 100644
index 0000000..9dbb2a3
--- /dev/null
+++ b/docs/cockpit-redesign/DEFAULT_SURFACE_DECISION.md
@@ -0,0 +1,24 @@
+# Default Surface Decision — conversation + five-card LoopCard
+
+Date: 2026-06-11
+
+Decision (recorded; made by the orchestrator): the **conversation-first cockpit
+with the five-card LoopCard surface** (understanding / plan / progress /
+blocker / pr_ready) **is the intended default**, per WORKBOOK_v6 GR#11.
+
+Consequences:
+
+- Clarification answering happens through the bottom-anchored
+  `ClarificationPopup` (`.ck-clar-popup` / `.ck-clar-q` / `.ck-chip` +
+  "Answer all & continue"), not inline option buttons in the thread.
+- The Gemini evidence-only hard gate is evaluated before the remote-writes
+  gate, so the deterministic blocked Draft-PR code with no Gemini verdict is
+  `GEMINI_NOT_CONFIGURED`.
+- `scripts/operator-cockpit-e2e.ts` (deterministic mock/template full-loop
+  E2E) was updated to this current contract on 2026-06-11; it had been pinned
+  to the old inline clarification-option contract and the stale
+  `REMOTE_WRITES_DISABLED` first-block expectation.
+
+Explicit non-goals: **no 3-pane default, no legacy inline clarification UI
+restoration, no feature flag** to switch surfaces. Deterministic E2E
+expectations must track the shipped product, not preserve superseded UI.
diff --git a/evidence/browser-cockpit-quality/2026-06-11T14-10-56-596Z/01-new.png b/evidence/browser-cockpit-quality/2026-06-11T14-10-56-596Z/01-new.png
new file mode 100644
index 0000000..2eda1b9
Binary files /dev/null and b/evidence/browser-cockpit-quality/2026-06-11T14-10-56-596Z/01-new.png differ
diff --git a/evidence/browser-cockpit-quality/2026-06-11T14-10-56-596Z/02-brainstorm-ready.png b/evidence/browser-cockpit-quality/2026-06-11T14-10-56-596Z/02-brainstorm-ready.png
new file mode 100644
index 0000000..f082080
Binary files /dev/null and b/evidence/browser-cockpit-quality/2026-06-11T14-10-56-596Z/02-brainstorm-ready.png differ
diff --git a/evidence/browser-cockpit-quality/2026-06-11T14-10-56-596Z/03-plan-approval.png b/evidence/browser-cockpit-quality/2026-06-11T14-10-56-596Z/03-plan-approval.png
new file mode 100644
index 0000000..9792353
Binary files /dev/null and b/evidence/browser-cockpit-quality/2026-06-11T14-10-56-596Z/03-plan-approval.png differ
diff --git a/evidence/browser-cockpit-quality/2026-06-11T14-10-56-596Z/04-approved.png b/evidence/browser-cockpit-quality/2026-06-11T14-10-56-596Z/04-approved.png
new file mode 100644
index 0000000..b54f4f3
Binary files /dev/null and b/evidence/browser-cockpit-quality/2026-06-11T14-10-56-596Z/04-approved.png differ
diff --git a/evidence/browser-cockpit-quality/2026-06-11T14-10-56-596Z/05-evidence-ready.png b/evidence/browser-cockpit-quality/2026-06-11T14-10-56-596Z/05-evidence-ready.png
new file mode 100644
index 0000000..1214a5a
Binary files /dev/null and b/evidence/browser-cockpit-quality/2026-06-11T14-10-56-596Z/05-evidence-ready.png differ
diff --git a/evidence/browser-cockpit-quality/2026-06-11T14-10-56-596Z/06-pr-blocked.png b/evidence/browser-cockpit-quality/2026-06-11T14-10-56-596Z/06-pr-blocked.png
new file mode 100644
index 0000000..96c8caa
Binary files /dev/null and
b/evidence/browser-cockpit-quality/2026-06-11T14-10-56-596Z/06-pr-blocked.png differ diff --git a/evidence/browser-cockpit-quality/2026-06-11T14-10-56-596Z/console-logs.json b/evidence/browser-cockpit-quality/2026-06-11T14-10-56-596Z/console-logs.json new file mode 100644 index 0000000..0637a08 --- /dev/null +++ b/evidence/browser-cockpit-quality/2026-06-11T14-10-56-596Z/console-logs.json @@ -0,0 +1 @@ +[] \ No newline at end of file diff --git a/evidence/browser-cockpit-quality/2026-06-11T14-10-56-596Z/db-state-summary.json b/evidence/browser-cockpit-quality/2026-06-11T14-10-56-596Z/db-state-summary.json new file mode 100644 index 0000000..4d96a06 --- /dev/null +++ b/evidence/browser-cockpit-quality/2026-06-11T14-10-56-596Z/db-state-summary.json @@ -0,0 +1,286 @@ +{ + "mission": { + "id": "01KTVGB8TJ5JBT7FSJ6KS9WPKN", + "status": "paused", + "githubPrUrl": null + }, + "operatorView": { + "stage": "pr_blocked", + "stageLabel": "PR blocked by policy · PR 被安全门拦截", + "confidence": 96, + "progressPercent": 95, + "headlessCallsToday": 0, + "primaryAction": { + "id": "check-draft-pr-gate", + "label": "Re-check Draft PR Gate · 重新检查 PR 安全门", + "kind": "primary" + }, + "secondaryActions": [], + "providerSummary": { + "planner": { + "name": "test-synthetic", + "mode": "mock", + "status": "Planner finished", + "tokens": null + }, + "worker": { + "name": "mock", + "mode": "mock", + "status": "done", + "tokens": null + }, + "validators": [ + { + "name": "gemini", + "mode": "not_configured", + "status": "not_configured" + } + ] + }, + "safetySummary": { + "remoteWrites": "disabled", + "prGate": { + "status": "blocked", + "code": "GEMINI_NOT_CONFIGURED", + "reason": "Gemini hard gate has no evidence-only PASS verdict for this mission.", + "remediation": "Remote writes are disabled for safety. Enable repo-scoped allow_remote_writes only when you want the worker to push a branch and open a Draft PR; until then no push, PR, or merge occurs." 
+ }, + "testMode": { + "enabled": true, + "reason": "mock/template mode is active; no external model or remote write is implied." + } + }, + "understanding": { + "roundsCompleted": 0, + "questions": [], + "readyReason": "Planner confidence is at least 95% and no clarification questions are pending." + }, + "projectPulse": { + "progress": [ + { + "id": "understand", + "label": "Understand · 理解需求", + "status": "done" + }, + { + "id": "roadmap", + "label": "Roadmap · 路线图", + "status": "done" + }, + { + "id": "execute", + "label": "Execute · 本地执行", + "status": "done" + }, + { + "id": "validate", + "label": "Validate · 独立验证", + "status": "done", + "detail": "Gemini key is not configured; this is visible and not counted as pass." + }, + { + "id": "pr-gate", + "label": "PR Gate · PR 安全门", + "status": "active" + }, + { + "id": "learn", + "label": "Learn · 沉淀记忆", + "status": "pending" + } + ], + "workingFolder": "/tmp/aedev-cockpit-quality-WZEC3f/operator-evidence/01KTVGB98SARBSTY7616TDSNZH", + "touchedFiles": [], + "evidence": [ + { + "id": "01KTVGB990AJWY1R5Y0EQ159F3", + "title": "ADR draft", + "path": "/tmp/aedev-cockpit-quality-WZEC3f/evidence/01KTVGB8TJ5JBT7FSJ6KS9WPKN/adr-mission.md", + "type": "adr" + }, + { + "id": "01KTVGB990AJWY1R5Y0EQ159F1", + "title": "Evidence directory", + "path": "/tmp/aedev-cockpit-quality-WZEC3f/evidence/01KTVGB8TJ5JBT7FSJ6KS9WPKN", + "type": "evidence" + }, + { + "id": "01KTVGB990AJWY1R5Y0EQ159F2", + "title": "PRD", + "path": "/tmp/aedev-cockpit-quality-WZEC3f/evidence/01KTVGB8TJ5JBT7FSJ6KS9WPKN/prd.md", + "type": "prd" + }, + { + "id": "01KTVGB990AJWY1R5Y0EQ159F5", + "title": "Workbook summary", + "path": "/tmp/aedev-cockpit-quality-WZEC3f/evidence/01KTVGB8TJ5JBT7FSJ6KS9WPKN/workbook-summary.md", + "type": "report" + }, + { + "id": "01KTVGB990AJWY1R5Y0EQ159F6", + "title": "Test summary", + "path": "/tmp/aedev-cockpit-quality-WZEC3f/evidence/01KTVGB8TJ5JBT7FSJ6KS9WPKN/test-summary.md", + "type": "report" + }, + { + "id": 
"01KTVGB990AJWY1R5Y0EQ159F7", + "title": "Risk report", + "path": "/tmp/aedev-cockpit-quality-WZEC3f/evidence/01KTVGB8TJ5JBT7FSJ6KS9WPKN/risk-report.md", + "type": "report" + }, + { + "id": "01KTVGB990AJWY1R5Y0EQ159F8", + "title": "Worker diff", + "path": "/tmp/aedev-cockpit-quality-WZEC3f/evidence/01KTVGB8TJ5JBT7FSJ6KS9WPKN/diff-summary.md", + "type": "report" + }, + { + "id": "01KTVGB990AJWY1R5Y0EQ159F9", + "title": "Done report", + "path": "/tmp/aedev-cockpit-quality-WZEC3f/evidence/01KTVGB8TJ5JBT7FSJ6KS9WPKN/done-report.md", + "type": "report" + }, + { + "id": "01KTVGB990AJWY1R5Y0EQ159F4", + "title": "Roadmap", + "path": "/tmp/aedev-cockpit-quality-WZEC3f/evidence/01KTVGB8TJ5JBT7FSJ6KS9WPKN/roadmap.md", + "type": "roadmap" + }, + { + "id": "01KTVGB8TPVKRQZXZ7MVXKCWW1", + "title": "ADR draft in mission design", + "path": "/tmp/aedev-cockpit-quality-WZEC3f/prd/01KTVGB8TJ5JBT7FSJ6KS9WPKN.design.json", + "type": "adr" + }, + { + "id": "01KTVGB8TN73M0G50YNBS0CBQX", + "title": "Mission design JSON", + "path": "/tmp/aedev-cockpit-quality-WZEC3f/prd/01KTVGB8TJ5JBT7FSJ6KS9WPKN.design.json", + "type": "roadmap" + }, + { + "id": "01KTVGB8TN73M0G50YNBS0CBQW", + "title": "PRD", + "path": "/tmp/aedev-cockpit-quality-WZEC3f/prd/01KTVGB8TJ5JBT7FSJ6KS9WPKN.md", + "type": "prd" + } + ], + "validatorReviews": [ + { + "id": "validators-not-configured", + "validator": "validators", + "verdict": "not_configured", + "summary": "Independent validation did not run because the Gemini key is not configured.", + "checkedEvidence": [ + "ADR draft", + "Evidence directory", + "PRD", + "Workbook summary", + "Test summary", + "Risk report", + "Worker diff", + "Done report" + ], + "blockingIssues": [], + "evidenceGaps": [ + "No Gemini validator verdict exists for this mission." + ], + "recommendedNextAction": "Configure validator keys for live verification, or continue reviewing evidence manually." 
+ } + ] + }, + "memorySummary": { + "projectFacts": [ + { + "id": "repo-01KTVGB510MQPKSWH3GB5QFATX", + "kind": "project", + "text": "Target repo is cockpit-quality at /tmp/aedev-cockpit-quality-WZEC3f.", + "provenance": "repo registry", + "ttlDays": 90, + "superseded": false + }, + { + "id": "repo-forbidden-01KTVGB510MQPKSWH3GB5QFATX", + "kind": "safety", + "text": "Forbidden paths stay protected: .env*, secrets/**, .github/**, AGENTS.md", + "provenance": "repo policy", + "ttlDays": 365, + "superseded": false + } + ], + "userPreferences": [ + { + "id": "pref-understand-first", + "kind": "user_preference", + "text": "Ask goal-specific questions and confirm understanding before starting worker execution.", + "provenance": "operator product directive", + "ttlDays": 365, + "superseded": false + }, + { + "id": "prompt-01KTVGB6C2AH6J8YFB0KC1EPYB", + "kind": "mission_intent", + "text": "Current mission intent: In the dashboard Cockpit page, verify the existing conversation UI quality smoke keeps the single conversation layout, status strip, and safe Draft PR gate visible without changing product behavior. Acceptance: browser smoke passes and evid", + "provenance": "operator prompt", + "ttlDays": 30, + "superseded": false + } + ], + "recentLessons": [ + { + "id": "lesson-0", + "kind": "run_lesson", + "text": "Draft PR blocked: GEMINI_NOT_CONFIGURED", + "provenance": "event:operator.draft_pr_blocked", + "ttlDays": 30, + "superseded": false + } + ] + }, + "summary": "Worker done, evidence ready, and the Draft PR gate was blocked by policy. No branch push, PR, or merge occurred.", + "nextAction": "Continue reviewing evidence, or explicitly enable repo-scoped remote writes before re-checking the gate.", + "testMode": true, + "userState": { + "state": "blocked", + "label": "Needs your attention", + "labelZh": "需要你处理", + "explanation": "系统在这一步暂停,等你看一眼后再继续 · The system paused here and will continue once you take a look." 
+ }, + "lastActivity": { + "atIso": "2026-06-11T14:11:02.866Z", + "agoMs": 161, + "phase": "blocked" + }, + "loopSummary": { + "whatChanged": [], + "testsRan": [ + "Test summary" + ], + "agents": [ + "planner · test-synthetic", + "worker · mock", + "validator · gemini" + ], + "validatorSaid": null, + "whyStoppedOrContinuing": "系统在这一步暂停,等你看一眼后再继续 · The system paused here and will continue once you take a look." + }, + "card": { + "type": "blocker", + "title": "需要你处理 · Needs your attention", + "human_explanation": "系统在这一步暂停,等你看一眼后再继续 · The system paused here and will continue once you take a look.", + "why_it_matters": "在不确定的时候暂停,比悄悄做错更安全;没有你的确认,任何东西都不会对外发布 · Pausing when unsure is safer than quietly doing the wrong thing; nothing is published without your confirmation.", + "recovery_actions": [ + "查看这张卡的说明,确认是否继续 · Read this card’s explanation and confirm whether to continue.", + "随时可以重新开始或调整目标 · You can restart or adjust the goal at any time." + ], + "recommended_action": "查看这张卡的说明,确认是否继续 · Read this card’s explanation and confirm whether to continue.", + "next_step": "查看这张卡的说明,确认是否继续 · Read this card’s explanation and confirm whether to continue.", + "machine": { + "user_state": "blocked", + "stage": "pr_blocked", + "hold_code": null, + "pr_gate_code": "GEMINI_NOT_CONFIGURED" + } + } + } +} \ No newline at end of file diff --git a/evidence/browser-cockpit-quality/2026-06-11T14-10-56-596Z/dom-state-summary.json b/evidence/browser-cockpit-quality/2026-06-11T14-10-56-596Z/dom-state-summary.json new file mode 100644 index 0000000..6e84ddc --- /dev/null +++ b/evidence/browser-cockpit-quality/2026-06-11T14-10-56-596Z/dom-state-summary.json @@ -0,0 +1,6 @@ +{ + "stage": "pr_blocked", + "planner": "mock", + "worker": "mock", + "prGateCode": "GEMINI_NOT_CONFIGURED" +} \ No newline at end of file diff --git a/evidence/browser-cockpit-quality/2026-06-11T14-10-56-596Z/event-tail.json b/evidence/browser-cockpit-quality/2026-06-11T14-10-56-596Z/event-tail.json new file mode 
100644 index 0000000..ad8e823 --- /dev/null +++ b/evidence/browser-cockpit-quality/2026-06-11T14-10-56-596Z/event-tail.json @@ -0,0 +1,119 @@ +[ + { + "type": "operator.draft_pr_blocked", + "payload": { + "code": "GEMINI_NOT_CONFIGURED", + "reason": "Gemini hard gate has no evidence-only PASS verdict for this mission.", + "validator": "gemini" + }, + "createdAt": "2026-06-11T14:11:02.866Z" + }, + { + "type": "operator.gemini_pr_blocked", + "payload": { + "code": "GEMINI_NOT_CONFIGURED", + "reason": "Gemini hard gate has no evidence-only PASS verdict for this mission.", + "verdict": "not_configured", + "summary": null + }, + "createdAt": "2026-06-11T14:11:02.866Z" + }, + { + "type": "mission.run_completed", + "payload": { + "taskId": "01KTVGB98SARBSTY7616TDSNZH", + "runId": "01KTVGB98T99BG4CW3SWKKR3BX", + "exitCode": 0, + "status": "waiting", + "decision": "WAITING", + "riskScore": 0, + "validatorCount": 0, + "releaseDeployUrl": null, + "releaseReverted": false, + "draftPrUrl": null, + "draftPrNumber": null + }, + "createdAt": "2026-06-11T14:11:01.024Z" + }, + { + "type": "operator.evidence_written", + "payload": { + "sessionId": "01KTVGB6C2AH6J8YFB0KC1EPYB", + "evidenceDir": "/tmp/aedev-cockpit-quality-WZEC3f/evidence/01KTVGB8TJ5JBT7FSJ6KS9WPKN" + }, + "createdAt": "2026-06-11T14:11:01.024Z" + }, + { + "type": "operator.stage_changed", + "payload": { + "stage": "PR/Waiting/Blocked", + "sessionId": "01KTVGB6C2AH6J8YFB0KC1EPYB", + "status": "waiting" + }, + "createdAt": "2026-06-11T14:11:01.024Z" + }, + { + "type": "operator.worker_started", + "payload": { + "taskId": "01KTVGB98SARBSTY7616TDSNZH", + "runId": "01KTVGB98T99BG4CW3SWKKR3BX", + "provider": "mock", + "evidenceDir": "/tmp/aedev-cockpit-quality-WZEC3f/operator-evidence/01KTVGB98SARBSTY7616TDSNZH" + }, + "createdAt": "2026-06-11T14:11:01.018Z" + }, + { + "type": "operator.worker_log", + "payload": { + "taskId": "01KTVGB98SARBSTY7616TDSNZH", + "runId": "01KTVGB98T99BG4CW3SWKKR3BX", + "stream": "stdout", + 
"chunk": "mock worker completed evidence gate" + }, + "createdAt": "2026-06-11T14:11:01.018Z" + }, + { + "type": "mission.route_selected", + "payload": { + "role": "coder", + "provider": "mock", + "sessionId": null, + "concurrency": 1, + "holdCode": null, + "reason": "worker router not configured" + }, + "createdAt": "2026-06-11T14:11:01.017Z" + }, + { + "type": "mission.run_started", + "payload": { + "evidenceDir": "/tmp/aedev-cockpit-quality-WZEC3f/evidence/01KTVGB8TJ5JBT7FSJ6KS9WPKN" + }, + "createdAt": "2026-06-11T14:11:01.012Z" + }, + { + "type": "operator.worker_assigned", + "payload": { + "sessionId": "01KTVGB6C2AH6J8YFB0KC1EPYB", + "mode": "mock", + "availableSessions": 0, + "paidApiKeysStripped": true + }, + "createdAt": "2026-06-11T14:11:01.011Z" + }, + { + "type": "operator.run_starting", + "payload": { + "sessionId": "01KTVGB6C2AH6J8YFB0KC1EPYB" + }, + "createdAt": "2026-06-11T14:11:01.010Z" + }, + { + "type": "operator.stage_changed", + "payload": { + "stage": "Worker", + "sessionId": "01KTVGB6C2AH6J8YFB0KC1EPYB" + }, + "createdAt": "2026-06-11T14:11:01.010Z" + } +] \ No newline at end of file diff --git a/evidence/browser-cockpit-quality/2026-06-11T14-10-56-596Z/quality-smoke.md b/evidence/browser-cockpit-quality/2026-06-11T14-10-56-596Z/quality-smoke.md new file mode 100644 index 0000000..8a354d8 --- /dev/null +++ b/evidence/browser-cockpit-quality/2026-06-11T14-10-56-596Z/quality-smoke.md @@ -0,0 +1,16 @@ +# Operator Cockpit WebUI Quality Smoke + +Result: PASS +Mission: 01KTVGB8TJ5JBT7FSJ6KS9WPKN +Stage: pr_blocked +PR gate: GEMINI_NOT_CONFIGURED + +Assertions: +- cockpit renders as one conversation column plus the three-part status strip +- legacy Project Pulse, sidebar, inspector, and tabbed panels are absent +- one primary action per stage +- stable testids for core controls +- planner/worker provider badges expose mock test mode +- PR URL stayed empty while Gemini hard gate was not configured +- draft PR blocked card reassures no push, PR, or 
merge occurred
+- browser console had no error/warning
\ No newline at end of file
diff --git a/evidence/browser-cockpit-quality/2026-06-11T14-18-20-197Z/01-new.png b/evidence/browser-cockpit-quality/2026-06-11T14-18-20-197Z/01-new.png
new file mode 100644
index 0000000..a8d2401
Binary files /dev/null and b/evidence/browser-cockpit-quality/2026-06-11T14-18-20-197Z/01-new.png differ
diff --git a/evidence/browser-cockpit-quality/2026-06-11T14-18-20-197Z/02-brainstorm-ready.png b/evidence/browser-cockpit-quality/2026-06-11T14-18-20-197Z/02-brainstorm-ready.png
new file mode 100644
index 0000000..f082080
Binary files /dev/null and b/evidence/browser-cockpit-quality/2026-06-11T14-18-20-197Z/02-brainstorm-ready.png differ
diff --git a/evidence/browser-cockpit-quality/2026-06-11T14-18-20-197Z/03-plan-approval.png b/evidence/browser-cockpit-quality/2026-06-11T14-18-20-197Z/03-plan-approval.png
new file mode 100644
index 0000000..9792353
Binary files /dev/null and b/evidence/browser-cockpit-quality/2026-06-11T14-18-20-197Z/03-plan-approval.png differ
diff --git a/evidence/browser-cockpit-quality/2026-06-11T14-18-20-197Z/04-approved.png b/evidence/browser-cockpit-quality/2026-06-11T14-18-20-197Z/04-approved.png
new file mode 100644
index 0000000..14fb145
Binary files /dev/null and b/evidence/browser-cockpit-quality/2026-06-11T14-18-20-197Z/04-approved.png differ
diff --git a/evidence/browser-cockpit-quality/2026-06-11T14-18-20-197Z/05-evidence-ready.png b/evidence/browser-cockpit-quality/2026-06-11T14-18-20-197Z/05-evidence-ready.png
new file mode 100644
index 0000000..3d126a2
Binary files /dev/null and b/evidence/browser-cockpit-quality/2026-06-11T14-18-20-197Z/05-evidence-ready.png differ
diff --git a/evidence/browser-cockpit-quality/2026-06-11T14-18-20-197Z/06-pr-blocked.png b/evidence/browser-cockpit-quality/2026-06-11T14-18-20-197Z/06-pr-blocked.png
new file mode 100644
index 0000000..ff9f2c1
Binary files /dev/null and
b/evidence/browser-cockpit-quality/2026-06-11T14-18-20-197Z/06-pr-blocked.png differ diff --git a/evidence/browser-cockpit-quality/2026-06-11T14-18-20-197Z/console-logs.json b/evidence/browser-cockpit-quality/2026-06-11T14-18-20-197Z/console-logs.json new file mode 100644 index 0000000..0637a08 --- /dev/null +++ b/evidence/browser-cockpit-quality/2026-06-11T14-18-20-197Z/console-logs.json @@ -0,0 +1 @@ +[] \ No newline at end of file diff --git a/evidence/browser-cockpit-quality/2026-06-11T14-18-20-197Z/db-state-summary.json b/evidence/browser-cockpit-quality/2026-06-11T14-18-20-197Z/db-state-summary.json new file mode 100644 index 0000000..fc9dbf8 --- /dev/null +++ b/evidence/browser-cockpit-quality/2026-06-11T14-18-20-197Z/db-state-summary.json @@ -0,0 +1,286 @@ +{ + "mission": { + "id": "01KTVGRSVAV875W32873Y1V9TB", + "status": "paused", + "githubPrUrl": null + }, + "operatorView": { + "stage": "pr_blocked", + "stageLabel": "PR blocked by policy · PR 被安全门拦截", + "confidence": 96, + "progressPercent": 95, + "headlessCallsToday": 0, + "primaryAction": { + "id": "check-draft-pr-gate", + "label": "Re-check Draft PR Gate · 重新检查 PR 安全门", + "kind": "primary" + }, + "secondaryActions": [], + "providerSummary": { + "planner": { + "name": "test-synthetic", + "mode": "mock", + "status": "Planner finished", + "tokens": null + }, + "worker": { + "name": "mock", + "mode": "mock", + "status": "done", + "tokens": null + }, + "validators": [ + { + "name": "gemini", + "mode": "not_configured", + "status": "not_configured" + } + ] + }, + "safetySummary": { + "remoteWrites": "disabled", + "prGate": { + "status": "blocked", + "code": "GEMINI_NOT_CONFIGURED", + "reason": "Gemini hard gate has no evidence-only PASS verdict for this mission.", + "remediation": "Remote writes are disabled for safety. Enable repo-scoped allow_remote_writes only when you want the worker to push a branch and open a Draft PR; until then no push, PR, or merge occurs." 
+ }, + "testMode": { + "enabled": true, + "reason": "mock/template mode is active; no external model or remote write is implied." + } + }, + "understanding": { + "roundsCompleted": 0, + "questions": [], + "readyReason": "Planner confidence is at least 95% and no clarification questions are pending." + }, + "projectPulse": { + "progress": [ + { + "id": "understand", + "label": "Understand · 理解需求", + "status": "done" + }, + { + "id": "roadmap", + "label": "Roadmap · 路线图", + "status": "done" + }, + { + "id": "execute", + "label": "Execute · 本地执行", + "status": "done" + }, + { + "id": "validate", + "label": "Validate · 独立验证", + "status": "done", + "detail": "Gemini key is not configured; this is visible and not counted as pass." + }, + { + "id": "pr-gate", + "label": "PR Gate · PR 安全门", + "status": "active" + }, + { + "id": "learn", + "label": "Learn · 沉淀记忆", + "status": "pending" + } + ], + "workingFolder": "/tmp/aedev-cockpit-quality-fdnIyj/operator-evidence/01KTVGRT7ZVA75Y4FFM78SZDFT", + "touchedFiles": [], + "evidence": [ + { + "id": "01KTVGRT87FKE8J3G30RD344SS", + "title": "ADR draft", + "path": "/tmp/aedev-cockpit-quality-fdnIyj/evidence/01KTVGRSVAV875W32873Y1V9TB/adr-mission.md", + "type": "adr" + }, + { + "id": "01KTVGRT87FKE8J3G30RD344SR", + "title": "PRD", + "path": "/tmp/aedev-cockpit-quality-fdnIyj/evidence/01KTVGRSVAV875W32873Y1V9TB/prd.md", + "type": "prd" + }, + { + "id": "01KTVGRT87FKE8J3G30RD344SV", + "title": "Workbook summary", + "path": "/tmp/aedev-cockpit-quality-fdnIyj/evidence/01KTVGRSVAV875W32873Y1V9TB/workbook-summary.md", + "type": "report" + }, + { + "id": "01KTVGRT87FKE8J3G30RD344SW", + "title": "Test summary", + "path": "/tmp/aedev-cockpit-quality-fdnIyj/evidence/01KTVGRSVAV875W32873Y1V9TB/test-summary.md", + "type": "report" + }, + { + "id": "01KTVGRT87FKE8J3G30RD344SX", + "title": "Risk report", + "path": "/tmp/aedev-cockpit-quality-fdnIyj/evidence/01KTVGRSVAV875W32873Y1V9TB/risk-report.md", + "type": "report" + }, + { + "id": 
"01KTVGRT87FKE8J3G30RD344SY", + "title": "Worker diff", + "path": "/tmp/aedev-cockpit-quality-fdnIyj/evidence/01KTVGRSVAV875W32873Y1V9TB/diff-summary.md", + "type": "report" + }, + { + "id": "01KTVGRT87FKE8J3G30RD344SZ", + "title": "Done report", + "path": "/tmp/aedev-cockpit-quality-fdnIyj/evidence/01KTVGRSVAV875W32873Y1V9TB/done-report.md", + "type": "report" + }, + { + "id": "01KTVGRT87FKE8J3G30RD344ST", + "title": "Roadmap", + "path": "/tmp/aedev-cockpit-quality-fdnIyj/evidence/01KTVGRSVAV875W32873Y1V9TB/roadmap.md", + "type": "roadmap" + }, + { + "id": "01KTVGRT86Q9ED0ZKQC89YAFQ9", + "title": "Evidence directory", + "path": "/tmp/aedev-cockpit-quality-fdnIyj/evidence/01KTVGRSVAV875W32873Y1V9TB", + "type": "evidence" + }, + { + "id": "01KTVGRSVF99QASQAR74F7M9TM", + "title": "ADR draft in mission design", + "path": "/tmp/aedev-cockpit-quality-fdnIyj/prd/01KTVGRSVAV875W32873Y1V9TB.design.json", + "type": "adr" + }, + { + "id": "01KTVGRSVF99QASQAR74F7M9TJ", + "title": "PRD", + "path": "/tmp/aedev-cockpit-quality-fdnIyj/prd/01KTVGRSVAV875W32873Y1V9TB.md", + "type": "prd" + }, + { + "id": "01KTVGRSVF99QASQAR74F7M9TK", + "title": "Mission design JSON", + "path": "/tmp/aedev-cockpit-quality-fdnIyj/prd/01KTVGRSVAV875W32873Y1V9TB.design.json", + "type": "roadmap" + } + ], + "validatorReviews": [ + { + "id": "validators-not-configured", + "validator": "validators", + "verdict": "not_configured", + "summary": "Independent validation did not run because the Gemini key is not configured.", + "checkedEvidence": [ + "ADR draft", + "PRD", + "Workbook summary", + "Test summary", + "Risk report", + "Worker diff", + "Done report", + "Roadmap" + ], + "blockingIssues": [], + "evidenceGaps": [ + "No Gemini validator verdict exists for this mission." + ], + "recommendedNextAction": "Configure validator keys for live verification, or continue reviewing evidence manually." 
+ } + ] + }, + "memorySummary": { + "projectFacts": [ + { + "id": "repo-01KTVGRP7HR0FRVQKD8Z09W5T9", + "kind": "project", + "text": "Target repo is cockpit-quality at /tmp/aedev-cockpit-quality-fdnIyj.", + "provenance": "repo registry", + "ttlDays": 90, + "superseded": false + }, + { + "id": "repo-forbidden-01KTVGRP7HR0FRVQKD8Z09W5T9", + "kind": "safety", + "text": "Forbidden paths stay protected: .env*, secrets/**, .github/**, AGENTS.md", + "provenance": "repo policy", + "ttlDays": 365, + "superseded": false + } + ], + "userPreferences": [ + { + "id": "pref-understand-first", + "kind": "user_preference", + "text": "Ask goal-specific questions and confirm understanding before starting worker execution.", + "provenance": "operator product directive", + "ttlDays": 365, + "superseded": false + }, + { + "id": "prompt-01KTVGRQDTE6MRBPQC16DDA03T", + "kind": "mission_intent", + "text": "Current mission intent: In the dashboard Cockpit page, verify the existing conversation UI quality smoke keeps the single conversation layout, status strip, and safe Draft PR gate visible without changing product behavior. Acceptance: browser smoke passes and evid", + "provenance": "operator prompt", + "ttlDays": 30, + "superseded": false + } + ], + "recentLessons": [ + { + "id": "lesson-0", + "kind": "run_lesson", + "text": "Draft PR blocked: GEMINI_NOT_CONFIGURED", + "provenance": "event:operator.draft_pr_blocked", + "ttlDays": 30, + "superseded": false + } + ] + }, + "summary": "Worker done, evidence ready, and the Draft PR gate was blocked by policy. No branch push, PR, or merge occurred.", + "nextAction": "Continue reviewing evidence, or explicitly enable repo-scoped remote writes before re-checking the gate.", + "testMode": true, + "userState": { + "state": "blocked", + "label": "Needs your attention", + "labelZh": "需要你处理", + "explanation": "系统在这一步暂停,等你看一眼后再继续 · The system paused here and will continue once you take a look." 
+ }, + "lastActivity": { + "atIso": "2026-06-11T14:18:26.209Z", + "agoMs": 204, + "phase": "blocked" + }, + "loopSummary": { + "whatChanged": [], + "testsRan": [ + "Test summary" + ], + "agents": [ + "planner · test-synthetic", + "worker · mock", + "validator · gemini" + ], + "validatorSaid": null, + "whyStoppedOrContinuing": "系统在这一步暂停,等你看一眼后再继续 · The system paused here and will continue once you take a look." + }, + "card": { + "type": "blocker", + "title": "需要你处理 · Needs your attention", + "human_explanation": "系统在这一步暂停,等你看一眼后再继续 · The system paused here and will continue once you take a look.", + "why_it_matters": "在不确定的时候暂停,比悄悄做错更安全;没有你的确认,任何东西都不会对外发布 · Pausing when unsure is safer than quietly doing the wrong thing; nothing is published without your confirmation.", + "recovery_actions": [ + "查看这张卡的说明,确认是否继续 · Read this card’s explanation and confirm whether to continue.", + "随时可以重新开始或调整目标 · You can restart or adjust the goal at any time." + ], + "recommended_action": "查看这张卡的说明,确认是否继续 · Read this card’s explanation and confirm whether to continue.", + "next_step": "查看这张卡的说明,确认是否继续 · Read this card’s explanation and confirm whether to continue.", + "machine": { + "user_state": "blocked", + "stage": "pr_blocked", + "hold_code": null, + "pr_gate_code": "GEMINI_NOT_CONFIGURED" + } + } + } +} \ No newline at end of file diff --git a/evidence/browser-cockpit-quality/2026-06-11T14-18-20-197Z/dom-state-summary.json b/evidence/browser-cockpit-quality/2026-06-11T14-18-20-197Z/dom-state-summary.json new file mode 100644 index 0000000..6e84ddc --- /dev/null +++ b/evidence/browser-cockpit-quality/2026-06-11T14-18-20-197Z/dom-state-summary.json @@ -0,0 +1,6 @@ +{ + "stage": "pr_blocked", + "planner": "mock", + "worker": "mock", + "prGateCode": "GEMINI_NOT_CONFIGURED" +} \ No newline at end of file diff --git a/evidence/browser-cockpit-quality/2026-06-11T14-18-20-197Z/event-tail.json b/evidence/browser-cockpit-quality/2026-06-11T14-18-20-197Z/event-tail.json new file mode 
100644 index 0000000..03373c2 --- /dev/null +++ b/evidence/browser-cockpit-quality/2026-06-11T14-18-20-197Z/event-tail.json @@ -0,0 +1,119 @@ +[ + { + "type": "operator.draft_pr_blocked", + "payload": { + "code": "GEMINI_NOT_CONFIGURED", + "reason": "Gemini hard gate has no evidence-only PASS verdict for this mission.", + "validator": "gemini" + }, + "createdAt": "2026-06-11T14:18:26.209Z" + }, + { + "type": "operator.gemini_pr_blocked", + "payload": { + "code": "GEMINI_NOT_CONFIGURED", + "reason": "Gemini hard gate has no evidence-only PASS verdict for this mission.", + "verdict": "not_configured", + "summary": null + }, + "createdAt": "2026-06-11T14:18:26.209Z" + }, + { + "type": "operator.evidence_written", + "payload": { + "sessionId": "01KTVGRQDTE6MRBPQC16DDA03T", + "evidenceDir": "/tmp/aedev-cockpit-quality-fdnIyj/evidence/01KTVGRSVAV875W32873Y1V9TB" + }, + "createdAt": "2026-06-11T14:18:24.391Z" + }, + { + "type": "operator.stage_changed", + "payload": { + "stage": "PR/Waiting/Blocked", + "sessionId": "01KTVGRQDTE6MRBPQC16DDA03T", + "status": "waiting" + }, + "createdAt": "2026-06-11T14:18:24.391Z" + }, + { + "type": "mission.run_completed", + "payload": { + "taskId": "01KTVGRT7ZVA75Y4FFM78SZDFT", + "runId": "01KTVGRT80PGN11T7XNV9BT487", + "exitCode": 0, + "status": "waiting", + "decision": "WAITING", + "riskScore": 0, + "validatorCount": 0, + "releaseDeployUrl": null, + "releaseReverted": false, + "draftPrUrl": null, + "draftPrNumber": null + }, + "createdAt": "2026-06-11T14:18:24.390Z" + }, + { + "type": "operator.worker_started", + "payload": { + "taskId": "01KTVGRT7ZVA75Y4FFM78SZDFT", + "runId": "01KTVGRT80PGN11T7XNV9BT487", + "provider": "mock", + "evidenceDir": "/tmp/aedev-cockpit-quality-fdnIyj/operator-evidence/01KTVGRT7ZVA75Y4FFM78SZDFT" + }, + "createdAt": "2026-06-11T14:18:24.384Z" + }, + { + "type": "operator.worker_log", + "payload": { + "taskId": "01KTVGRT7ZVA75Y4FFM78SZDFT", + "runId": "01KTVGRT80PGN11T7XNV9BT487", + "stream": "stdout", + 
"chunk": "mock worker completed evidence gate" + }, + "createdAt": "2026-06-11T14:18:24.384Z" + }, + { + "type": "mission.route_selected", + "payload": { + "role": "coder", + "provider": "mock", + "sessionId": null, + "concurrency": 1, + "holdCode": null, + "reason": "worker router not configured" + }, + "createdAt": "2026-06-11T14:18:24.383Z" + }, + { + "type": "mission.run_started", + "payload": { + "evidenceDir": "/tmp/aedev-cockpit-quality-fdnIyj/evidence/01KTVGRSVAV875W32873Y1V9TB" + }, + "createdAt": "2026-06-11T14:18:24.378Z" + }, + { + "type": "operator.worker_assigned", + "payload": { + "sessionId": "01KTVGRQDTE6MRBPQC16DDA03T", + "mode": "mock", + "availableSessions": 0, + "paidApiKeysStripped": true + }, + "createdAt": "2026-06-11T14:18:24.377Z" + }, + { + "type": "operator.run_starting", + "payload": { + "sessionId": "01KTVGRQDTE6MRBPQC16DDA03T" + }, + "createdAt": "2026-06-11T14:18:24.376Z" + }, + { + "type": "operator.stage_changed", + "payload": { + "stage": "Worker", + "sessionId": "01KTVGRQDTE6MRBPQC16DDA03T" + }, + "createdAt": "2026-06-11T14:18:24.376Z" + } +] \ No newline at end of file diff --git a/evidence/browser-cockpit-quality/2026-06-11T14-18-20-197Z/quality-smoke.md b/evidence/browser-cockpit-quality/2026-06-11T14-18-20-197Z/quality-smoke.md new file mode 100644 index 0000000..a12a2e8 --- /dev/null +++ b/evidence/browser-cockpit-quality/2026-06-11T14-18-20-197Z/quality-smoke.md @@ -0,0 +1,16 @@ +# Operator Cockpit WebUI Quality Smoke + +Result: PASS +Mission: 01KTVGRSVAV875W32873Y1V9TB +Stage: pr_blocked +PR gate: GEMINI_NOT_CONFIGURED + +Assertions: +- cockpit renders as one conversation column plus the three-part status strip +- legacy Project Pulse, sidebar, inspector, and tabbed panels are absent +- one primary action per stage +- stable testids for core controls +- planner/worker provider badges expose mock test mode +- PR URL stayed empty while Gemini hard gate was not configured +- draft PR blocked card reassures no push, PR, or 
merge occurred +- browser console had no error/warning \ No newline at end of file diff --git a/evidence/browser-cockpit-user-e2e/2026-06-11T14-18-36-321Z/01-composed-and-started.png b/evidence/browser-cockpit-user-e2e/2026-06-11T14-18-36-321Z/01-composed-and-started.png new file mode 100644 index 0000000..19f1702 Binary files /dev/null and b/evidence/browser-cockpit-user-e2e/2026-06-11T14-18-36-321Z/01-composed-and-started.png differ diff --git a/evidence/browser-cockpit-user-e2e/2026-06-11T14-18-36-321Z/02-planning-progress.png b/evidence/browser-cockpit-user-e2e/2026-06-11T14-18-36-321Z/02-planning-progress.png new file mode 100644 index 0000000..3e715bf Binary files /dev/null and b/evidence/browser-cockpit-user-e2e/2026-06-11T14-18-36-321Z/02-planning-progress.png differ diff --git a/evidence/browser-cockpit-user-e2e/2026-06-11T14-18-36-321Z/03a-clarify-popup-filled.png b/evidence/browser-cockpit-user-e2e/2026-06-11T14-18-36-321Z/03a-clarify-popup-filled.png new file mode 100644 index 0000000..16a2d8b Binary files /dev/null and b/evidence/browser-cockpit-user-e2e/2026-06-11T14-18-36-321Z/03a-clarify-popup-filled.png differ diff --git a/evidence/browser-cockpit-user-e2e/2026-06-11T14-18-36-321Z/03b-clarify-answered-gate-guidance.png b/evidence/browser-cockpit-user-e2e/2026-06-11T14-18-36-321Z/03b-clarify-answered-gate-guidance.png new file mode 100644 index 0000000..7439e0f Binary files /dev/null and b/evidence/browser-cockpit-user-e2e/2026-06-11T14-18-36-321Z/03b-clarify-answered-gate-guidance.png differ diff --git a/evidence/browser-cockpit-user-e2e/2026-06-11T14-18-36-321Z/03c-clarify-unlocked.png b/evidence/browser-cockpit-user-e2e/2026-06-11T14-18-36-321Z/03c-clarify-unlocked.png new file mode 100644 index 0000000..3569565 Binary files /dev/null and b/evidence/browser-cockpit-user-e2e/2026-06-11T14-18-36-321Z/03c-clarify-unlocked.png differ diff --git a/evidence/browser-cockpit-user-e2e/2026-06-11T14-18-36-321Z/04-roadmap-ready.png 
b/evidence/browser-cockpit-user-e2e/2026-06-11T14-18-36-321Z/04-roadmap-ready.png new file mode 100644 index 0000000..a9bd910 Binary files /dev/null and b/evidence/browser-cockpit-user-e2e/2026-06-11T14-18-36-321Z/04-roadmap-ready.png differ diff --git a/evidence/browser-cockpit-user-e2e/2026-06-11T14-18-36-321Z/05a-approved.png b/evidence/browser-cockpit-user-e2e/2026-06-11T14-18-36-321Z/05a-approved.png new file mode 100644 index 0000000..6be20d8 Binary files /dev/null and b/evidence/browser-cockpit-user-e2e/2026-06-11T14-18-36-321Z/05a-approved.png differ diff --git a/evidence/browser-cockpit-user-e2e/2026-06-11T14-18-36-321Z/05b-execution-evidence-gate.png b/evidence/browser-cockpit-user-e2e/2026-06-11T14-18-36-321Z/05b-execution-evidence-gate.png new file mode 100644 index 0000000..e4794d8 Binary files /dev/null and b/evidence/browser-cockpit-user-e2e/2026-06-11T14-18-36-321Z/05b-execution-evidence-gate.png differ diff --git a/evidence/browser-cockpit-user-e2e/2026-06-11T14-18-36-321Z/06-loop-summary.png b/evidence/browser-cockpit-user-e2e/2026-06-11T14-18-36-321Z/06-loop-summary.png new file mode 100644 index 0000000..e4794d8 Binary files /dev/null and b/evidence/browser-cockpit-user-e2e/2026-06-11T14-18-36-321Z/06-loop-summary.png differ diff --git a/evidence/browser-cockpit-user-e2e/2026-06-11T14-18-36-321Z/07-pr-gate-blocked-human.png b/evidence/browser-cockpit-user-e2e/2026-06-11T14-18-36-321Z/07-pr-gate-blocked-human.png new file mode 100644 index 0000000..ff9f2c1 Binary files /dev/null and b/evidence/browser-cockpit-user-e2e/2026-06-11T14-18-36-321Z/07-pr-gate-blocked-human.png differ diff --git a/evidence/browser-cockpit-user-e2e/2026-06-11T14-18-36-321Z/console-logs.json b/evidence/browser-cockpit-user-e2e/2026-06-11T14-18-36-321Z/console-logs.json new file mode 100644 index 0000000..2936d7c --- /dev/null +++ b/evidence/browser-cockpit-user-e2e/2026-06-11T14-18-36-321Z/console-logs.json @@ -0,0 +1,3 @@ +[ + "error: Failed to load resource: the server 
responded with a status of 409 (Conflict)" +] \ No newline at end of file diff --git a/evidence/browser-cockpit-user-e2e/2026-06-11T14-18-36-321Z/user-e2e-report.md b/evidence/browser-cockpit-user-e2e/2026-06-11T14-18-36-321Z/user-e2e-report.md new file mode 100644 index 0000000..a0e37bb --- /dev/null +++ b/evidence/browser-cockpit-user-e2e/2026-06-11T14-18-36-321Z/user-e2e-report.md @@ -0,0 +1,79 @@ +# Operator Cockpit — User Journey E2E Report + +Result: **PASS** +Timestamp: 2026-06-11T14-18-36-321Z +Evidence dir: /home/user/claude-code-247/evidence/browser-cockpit-user-e2e/2026-06-11T14-18-36-321Z + +Harness: mock/template planner+worker, remote writes disabled, all external CLIs/APIs disabled, +temp stateDir, in-memory SQLite, vite dashboard, chromium via playwright. + +## Steps + +### step-1-compose-and-start — PASS + +Type a user prompt into the composer and start brainstorm +- composer testid: cockpit-goal-input · prompt: Make the onboarding flow friendlier for new users. I want it… +- screenshot: /home/user/claude-code-247/evidence/browser-cockpit-user-e2e/2026-06-11T14-18-36-321Z/01-composed-and-started.png + +### step-2-visible-progress — PASS + +Planning shows visible progress; the UI never looks frozen +- status strip during planning: STAGE Brainstorm · 共创中 NOW Planner is thinking · Planner 正在分析 PROGRESS 0% — — APPROVALS 0 +- loop card during planning/clarify: type=understanding · active-agent=claude · next_step="回答下方的待确认问题,AI 才能继续生成方案 · Answer the questions below so the plan can continue." +- strip refreshed: "STAGE Brainstorm · 共创中 NOW Planner is thinking · Planner 正在分析 PROGRESS 0% — — APPROVALS 0" → "STAGE Decision · 做选择 NOW Review the questions, then generate the plan · 先确认问题,再生成方案 PROGRESS 0% — — APPROVALS 0" +- cockpit-last-activity refresh check is completed as soon as the mission overview exists (see step 4 notes) — the testid only renders once a mission is created. 
+- cockpit-last-activity refresh verified: "LAST ACTIVITY 0s ago" → "LAST ACTIVITY 1s ago" (1.7s apart) +- screenshot: /home/user/claude-code-247/evidence/browser-cockpit-user-e2e/2026-06-11T14-18-36-321Z/02-planning-progress.png + +### step-3-clarifications — PASS + +Answer the clarification popup through the real UI controls +- clarification questions rendered: 2 +- answered transcript message visible; popup dismissed +- locked Generate Plan produced calm guidance, no raw gate code in visible text +- follow-up round confirmed confidence ≥95; plan unlocked +- loop card after clarification answers: type=understanding · active-agent=claude · next_step="稍等片刻,AI 正在确认理解,随后会给出方案 · Hang on — understanding is being confirmed; a plan comes next." +- screenshot: /home/user/claude-code-247/evidence/browser-cockpit-user-e2e/2026-06-11T14-18-36-321Z/03a-clarify-popup-filled.png +- screenshot: /home/user/claude-code-247/evidence/browser-cockpit-user-e2e/2026-06-11T14-18-36-321Z/03b-clarify-answered-gate-guidance.png +- screenshot: /home/user/claude-code-247/evidence/browser-cockpit-user-e2e/2026-06-11T14-18-36-321Z/03c-clarify-unlocked.png + +### step-4-generate-roadmap — PASS + +Generate roadmap; PRD/roadmap artifacts exist and stage advances +- mission 01KTVGSCF2AEEK716SV34MAHVX created with 3 design artifacts (adr, prd, roadmap…) +- loop card at roadmap_ready: type=plan · active-agent=claude · next_step="审阅这份方案;你批准后才会开始动手 · Review this plan; work starts only after you approve it." 
+- card action on the plan card: approve-roadmap · "Approve Roadmap · 批准路线" +- cockpit-last-activity refresh verified: "LAST ACTIVITY 0s ago" → "LAST ACTIVITY 1s ago" +- screenshot: /home/user/claude-code-247/evidence/browser-cockpit-user-e2e/2026-06-11T14-18-36-321Z/04-roadmap-ready.png + +### step-5-approve-and-execute — PASS + +Approve roadmap, start execution; execution state appears +- card action on the approved card: start-execution · "Start Execution · 启动执行" +- execution state appeared (stage=running) +- loop card during execution: type=progress · active-agent=codex · next_step="Wait for progress, pause, or stop if the run is wrong." +- worker runs recorded: 1; final stage=validators_missing +- screenshot: /home/user/claude-code-247/evidence/browser-cockpit-user-e2e/2026-06-11T14-18-36-321Z/05a-approved.png +- screenshot: /home/user/claude-code-247/evidence/browser-cockpit-user-e2e/2026-06-11T14-18-36-321Z/05b-execution-evidence-gate.png + +### step-6-loop-summary — PASS + +cockpit-loop-summary renders with non-empty whyStoppedOrContinuing +- whyStoppedOrContinuing: 结果评审尚未配置 · result review not configured +- screenshot: /home/user/claude-code-247/evidence/browser-cockpit-user-e2e/2026-06-11T14-18-36-321Z/06-loop-summary.png + +### step-7-draft-pr-gate — PASS + +Draft PR gate BLOCKED is calm human text; no raw codes visible +- machine code stays in data-* only: data-pr-gate-code=GEMINI_NOT_CONFIGURED +- calm safety phrasing visible (安全门 / no push, no PR, no merge reassurance) +- loop card at the Draft PR gate: type=blocker · active-agent=none · next_step="查看这张卡的说明,确认是否继续 · Read this card’s explanation and confirm whether to continue." 
+- no PR URL recorded; operator.draft_pr_blocked event present +- screenshot: /home/user/claude-code-247/evidence/browser-cockpit-user-e2e/2026-06-11T14-18-36-321Z/07-pr-gate-blocked-human.png + +## Browser console issues (informational) + +- error: Failed to load resource: the server responded with a status of 409 (Conflict) + +> Note: the deliberate locked Generate Plan probe in step 3 produces one expected 409 network log entry; +> the assertion is that the VISIBLE UI stays human (guidance text, no raw codes). diff --git a/scripts/operator-cockpit-e2e.ts b/scripts/operator-cockpit-e2e.ts index 6f5c789..a9917a6 100644 --- a/scripts/operator-cockpit-e2e.ts +++ b/scripts/operator-cockpit-e2e.ts @@ -17,6 +17,50 @@ process.env['AEDEV_DISABLE_CODEX_CLI'] = '1' process.env['AEDEV_DISABLE_GEMINI_API'] = '1' process.env['AEDEV_DISABLE_OPENAI_API'] = '1' +// Deterministic adaptive-clarify journey (current product contract, GR#11 / +// conversation-first + ClarificationPopup): the brainstorm round asks two +// questions at confidence 62 (plan locked); the follow-up round after the +// operator answers reaches confidence 96 (plan unlocked). Same fenced-JSON +// fixture mechanism as scripts/operator-cockpit-user-e2e.ts — no live CLI. 
+process.env['AEDEV_COCKPIT_PLANNER_BRAINSTORM_FIXTURE_TEXT'] = [ + 'Initial brainstorm: deterministic e2e fixture round.', + '', + '- The goal needs two confirmations before a safe plan.', + '- Nothing has been changed yet — planning only.', + '', + '```json', + JSON.stringify({ + questions: [ + { + field: 'scope', + question: 'How tightly should this mission be scoped?', + why: 'Scoping avoids an unbounded first mission.', + impact: 'It decides the roadmap and the worker prompt.', + destination: 'PRD/Roadmap', + options: [{ label: 'Smallest viable change', recommended: true }, { label: 'A broader multi-file pass' }], + }, + { + field: 'acceptance-criteria', + question: 'What is the most important observable acceptance criterion?', + why: 'A verifiable acceptance criterion gates execution.', + impact: 'It defines the evidence the validator reads.', + destination: 'PRD/Roadmap', + options: [{ label: 'A specific test/command must pass', recommended: true }, { label: 'A named UI behavior must change' }], + }, + ], + confidence: 62, + rationale: 'Two answers are still needed before a safe roadmap.', + }), + '```', +].join('\n') +process.env['AEDEV_COCKPIT_PLANNER_FOLLOWUP_FIXTURE_TEXT'] = [ + 'Thanks — your answers are enough to plan safely. · 你的回答已足够,可以生成方案了。', + '', + '```json', + JSON.stringify({ questions: [], confidence: 96, rationale: 'Operator answers resolved scope and acceptance.' 
}), + '```', +].join('\n') + const stateDir = mkdtempSync(join(tmpdir(), 'aedev-cockpit-e2e-')) const db = new AedevDb(':memory:') let dashboard: ChildProcess | undefined @@ -58,8 +102,27 @@ try { await page.goto(`http://127.0.0.1:${DASHBOARD_PORT}`, { waitUntil: 'domcontentloaded' }) await page.getByTestId('cockpit-start-brainstorm').click() await page.getByText('Initial brainstorm:', { exact: false }).waitFor({ timeout: 10_000 }) - await page.getByRole('button', { name: 'A specific test/command must pass ★' }).click() - await page.getByRole('button', { name: 'Answer all & continue · 全部确认并继续' }).click() + // Clarification answering moved into the bottom-anchored ClarificationPopup + // (conversation-first + five-card surface is the default; WORKBOOK_v6 GR#11). + // Same selector contract as scripts/operator-cockpit-user-e2e.ts: pick the + // recommended chip per question, then submit through the popup. + const popup = page.locator('.ck-clar-popup') + await popup.waitFor({ timeout: 20_000 }) + const questions = popup.locator('.ck-clar-q') + const count = await questions.count() + if (count < 1) throw new Error('Clarification popup rendered with no questions') + for (let i = 0; i < count; i++) { + const recommended = questions.nth(i).locator('.ck-chip.recommended') + if (await recommended.count()) await recommended.first().click() + else await questions.nth(i).locator('.ck-chip').first().click() + } + await popup.getByRole('button', { name: /Answer all/ }).click() + await page.getByText('已确认 · Clarifications', { exact: false }).first().waitFor({ timeout: 15_000 }) + await popup.waitFor({ state: 'detached', timeout: 15_000 }) + // The confidence gate stays locked (62 < 95) until the planner re-checks with + // the answers folded in — Ask Until Clear runs the follow-up round (96). 
+ await page.getByRole('button', { name: /Ask Until Clear/ }).click() + await page.getByText('your answers are enough to plan safely', { exact: false }).first().waitFor({ timeout: 20_000 }) await page.getByTestId('cockpit-generate-plan-primary').click() await page.getByTestId('mission-stage').waitFor({ timeout: 10_000 }) await page.getByTestId('cockpit-approve-roadmap').click() @@ -67,8 +130,11 @@ try { await waitForRootStage(page, ['validators_missing', 'evidence_ready', 'pr_ready']) await page.getByTestId('cockpit-check-draft-pr-gate').click() await page.getByTestId('cockpit-pr-gate-card').waitFor({ timeout: 10_000 }) + // The Gemini evidence-only hard gate is evaluated BEFORE the remote-writes + // gate; with no Gemini verdict configured the deterministic blocked code is + // GEMINI_NOT_CONFIGURED (same contract the quality smoke asserts). const code = await page.getByTestId('cockpit-pr-gate-card').getAttribute('data-pr-gate-code') - if (code !== 'REMOTE_WRITES_DISABLED') throw new Error(`Expected REMOTE_WRITES_DISABLED, got ${code}`) + if (code !== 'GEMINI_NOT_CONFIGURED') throw new Error(`Expected GEMINI_NOT_CONFIGURED, got ${code}`) await browser.close() browser = undefined