This document is the canonical instruction set for document-driven software delivery using raw terminal sessions.
For the higher-level end-to-end product lifecycle, read docs/build-anything-workflow.md.
The machine-readable source of truth lives in docs/canonical-workflow.json, which currently defines 5 phases and 22 stages.
This prose file is the human-readable explanation of that execution model.
It exists to minimize drift, ambiguity, and low-quality execution by separating:
- planning from implementation
- review from coding
- durable documents from machine state
This workflow is intentionally tool-light:
- terminals only
- documents only
- git repos only
- skills when useful
No external orchestrator is required.
Current role mapping:
- human: owns goals, priorities, tradeoffs, approvals, and final decisions
- Codex: planner, spec writer, reviewer, and process controller
- Claude Code: coder and revision executor
- Gemini: not used in the default v1 flow
Current environment assumptions:
- work happens from raw terminal sessions
- all collaboration is recorded in documents
- machine lifecycle state may live outside markdown in system/state.json
- task execution may use scripts and skills, but documents remain the collaboration surface
Out of scope for this workflow:
- Conductor
- Agent of Empires
- AO / Composio Agent Orchestrator
- Paperclip
- automatic PR comment sync
- CI-driven autonomous retries
Use scripts whenever the task can be deterministic.
Why:
- deterministic behavior
- no token cost
- easier verification
Typical script steps:
- create folders
- generate diffs
- run tests
- run lint
- validate YAML or JSON
- update system/state.json
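As an illustration of a deterministic script step, the sketch below validates and updates system/state.json in one pass. The function name `update_state`, the write-then-rename strategy, and the example values are assumptions made for illustration; the workflow does not prescribe them.

```python
import json
import os
import tempfile
from pathlib import Path


def update_state(state_path: Path, **changes) -> dict:
    """Merge `changes` into the JSON state file and write the result back."""
    state = {}
    if state_path.exists():
        # json.loads raises json.JSONDecodeError (a ValueError) on malformed
        # input, which doubles as the "validate JSON" step.
        state = json.loads(state_path.read_text(encoding="utf-8"))
        if not isinstance(state, dict):
            raise ValueError("state file must contain a JSON object")
    state.update(changes)
    state_path.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temp file in the same directory, then rename over the target,
    # so a crash cannot leave a half-written state file behind.
    fd, tmp_name = tempfile.mkstemp(dir=state_path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(state, handle, indent=2, sort_keys=True)
    os.replace(tmp_name, state_path)
    return state
```

A call such as `update_state(Path("system/state.json"), stage=5, status="waiting")` needs no model call and can be checked by reading the file back.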
Use an LLM step only when the task requires interpretation, synthesis, design, review, or writing that cannot be made purely deterministic.
Typical LLM steps:
- task classification
- scope estimation
- PRD drafting
- user-flow drafting
- implementation-plan drafting
- code review
- revision planning
Require explicit human approval when:
- PRD and user flow need product sign-off
- scope expands beyond the approved requirement
- the workflow touches secrets, private keys, or production credentials
- the workflow attempts destructive actions
- the workflow fails to converge after bounded retries
Documents hold:
- intent
- specifications
- plans
- handoffs
- reviews
- approvals
- final conclusions
Machine state files hold:
- lifecycle counters
- current stage
- lock state
- stop reason
Do not rewrite history.
Every agent-to-agent or human-to-agent handoff creates a new document. Old handoffs remain read-only.
This repository now has two valid workflow views:
- Build Anything workflow: the full product lifecycle, Research -> Design -> Development -> Packaging -> Maintenance
- Canonical workflow: the currently enforced 22-stage execution model used by scripts and the dashboard
The Build Anything workflow is the strategic model. The canonical workflow is the operational model.
Recommended left-to-right lifecycle:
flowchart LR
A[Research] --> B[Design]
B --> C[Development]
C --> D[Packaging]
D --> E[Maintenance]
| Phase | Required result |
|---|---|
| Research | A concrete anchor on real products, with screenshot evidence and a short recommendation brief |
| Design | Explicit documents that define product intent, user flow, approval state, and execution plan |
| Development | Working code plus bounded execution reports, review reports, and final revision evidence |
| Packaging | Clean integration, final verification, merge readiness, and delivery-facing artifacts |
| Maintenance | Captured debt, learnings, follow-ups, and the next-cycle candidate |
| Build Anything phase | Canonical workflow mapping |
|---|---|
| Research | Stages 0-4 |
| Design | Stages 5-12 |
| Development | Stages 13-16 |
| Packaging | Stages 17-19 |
| Maintenance | Stages 20-21 |
Important:
- Research is now a first-class canonical phase with anchor research and evidence collection stages
- Packaging and Maintenance are now modeled as distinct canonical phases rather than being collapsed into cleanup
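The phase-to-stage mapping above can be expressed as a small lookup. This is a sketch only: the stage ranges are copied from the table, while the names `PHASE_STAGES` and `phase_for_stage` are hypothetical and need not match docs/canonical-workflow.json.

```python
# Inclusive stage ranges from the mapping table, written as half-open ranges.
PHASE_STAGES = {
    "Research": range(0, 5),       # stages 0-4
    "Design": range(5, 13),        # stages 5-12
    "Development": range(13, 17),  # stages 13-16
    "Packaging": range(17, 20),    # stages 17-19
    "Maintenance": range(20, 22),  # stages 20-21
}


def phase_for_stage(stage: int) -> str:
    """Return the Build Anything phase that contains a canonical stage number."""
    for phase, stages in PHASE_STAGES.items():
        if stage in stages:
            return phase
    raise ValueError(f"unknown canonical stage: {stage}")
```

The ranges cover 22 stages in total, which matches the 5 phases and 22 stages stated for the canonical workflow.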
The canonical lifecycle is:
- Clarify objective
- Classify task and estimate size
- Run product research
- Collect reference evidence
- Research approval gate
- Draft PRD
- Review PRD against the real codebase or environment
- Draft user flow
- Draft prototype brief
- Design approval gate
- Draft implementation plan
- Review implementation plan
- Write execution prompt for Claude Code
- Claude Code executes in batches
- Codex reviews each batch
- Gate each major phase
- Claude Code performs final revision
- Integrate and verify
- Prepare release package
- Delivery approval gate
- Capture the next cycle
- Update backlog and debt
Default delivery model by task size:
- small: may use a simplified two-review flow
- medium: should use batch execution and explicit phase gates
- large: must use batch execution, phase gates, and more detailed PRD and plan artifacts
Every task must be classified before PRD drafting.
Use exactly one primary type:
- feature
- new_project
- bug_fix
- optimization
Use exactly one size:
- small
- medium
- large
- small: lean PRD, lean plan, simplified review loop allowed
- medium: standard PRD, standard plan, batch execution recommended
- large: full PRD, explicit user flow, explicit phase gates, stronger review discipline
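Because type and size are closed enumerations, a script can reject an invalid classification before PRD drafting starts. A minimal sketch follows; the class name `TaskClassification` and its field names are assumptions, and only the allowed values come from this document.

```python
from dataclasses import dataclass

# Allowed values as listed in this document.
TASK_TYPES = {"feature", "new_project", "bug_fix", "optimization"}
TASK_SIZES = {"small", "medium", "large"}


@dataclass(frozen=True)
class TaskClassification:
    task_type: str
    task_size: str
    rationale: str

    def __post_init__(self):
        # Exactly one primary type and exactly one size, both from the enums.
        if self.task_type not in TASK_TYPES:
            raise ValueError(f"invalid task type: {self.task_type!r}")
        if self.task_size not in TASK_SIZES:
            raise ValueError(f"invalid task size: {self.task_size!r}")
        if not self.rationale.strip():
            raise ValueError("a rationale for the size is required")
```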
Use these skills when available:
- task estimation: agent estimation skill, if available
- PRD drafting: Ralph PRD skill
- user-flow design: agent-canvas, if available
- implementation plan: superpowers:writing-plans
Guidance:
- estimation output belongs to scoping, not inside the PRD core body
- user flow belongs to the PRD stage and must be approved before implementation planning
- implementation planning starts only after approved PRD and approved user flow exist
- if a recommended skill is available for the current stage, the agent should use it or explicitly explain why it is being skipped
Owner: Codex
Step type: ai_routing
Goal:
- identify the real unit of work before implementation starts
Required outputs:
- clear objective
- clear success condition
- clear scope boundary
Runtime rules:
- this stage is a distinct interaction stage and must not be silently collapsed into later drafting
- if the objective, success condition, or scope boundary is still ambiguous, Codex must ask the human focused clarification questions and stop
- do not proceed to task classification until the clarification answers are either explicit in the human input or captured through a clarification turn
- stage 0 completes only after the intake artifact reflects clarified human intent rather than self-inferred assumptions
Owner: Codex
Step type: ai_routing
Required outputs:
- task type
- task size
- estimate summary
- rationale for size
Owner: Codex
Step type: ai_routing
Required outputs:
- primary anchor
- secondary anchor
- similarity note
- recommendation brief
Owner: Codex
Step type: ai_routing
Required outputs:
- key page screenshots
- key flow screenshots
- source links
Owner: human
Step type: human_approval_gate
Approval is required for:
- primary and secondary anchors
- evidence set quality
- the recommendation direction that will feed design
Owner: Codex
Step type: ai_routing
Required PRD contents:
- purpose
- scope
- non-goals
- contracts
- expected behavior
- acceptance criteria
- constraints
- terminology
Owner: Codex
Step type: ai_routing plus script-assisted repo inspection
Purpose:
- correct mismatches between the drafted PRD and the real repo or environment
Required outputs:
- corrected PRD
- contradiction list resolved
- explicit baseline assumptions
Owner: Codex
Step type: ai_routing
This step must produce both:
- a human-readable user flow
- a structured YAML version
Structured YAML requirements:
- every step must define inputs
- every step must define outputs
- every step must define validation
- every step must define failure
- every step must define next
Owner: Codex
Step type: ai_routing
This step must produce:
- core screens to emulate or prototype
- key interactions to preserve
- visual direction anchored to the approved research evidence
Owner: human
Step type: human_approval_gate
Approval is required for:
- PRD
- user flow
- prototype brief
Without approval, the workflow must not proceed to implementation planning.
Artifact rules:
- handoffs/25-human-approval.md may begin as a pending approval request drafted by Codex
- the human decision must overwrite the file with the final approval or revision outcome before implementation planning proceeds
- once the gate is pending, machine state should move to status: waiting
Owner: Codex
Step type: ai_routing
The plan must translate the approved PRD and approved user flow into:
- phases
- batches
- task order
- likely file touchpoints
- verification steps
- stop conditions
Owner: Codex
Step type: ai_routing
Purpose:
- harden the plan until another agent can execute it with minimal interpretation drift
Required outputs:
- reviewed plan
- clarified execution order
- clarified verification commands
- clarified dependency boundaries
Owner: Codex
Step type: ai_routing
This prompt is a formal handoff, not an informal chat message.
It must define:
- repository path
- PRD path
- implementation plan path
- source-of-truth rules
- execution order
- stop conditions
- logging requirements
- report format
- forbidden behaviors
Owner: Claude Code
Step type: ai_routing plus scripts
Rules:
- do not implement the whole task in one unbounded run
- work in small batches
- create durable execution reports
Each batch report must contain:
- tasks completed
- files changed
- tests run
- result
- next proposed batch
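A batch report with these five fields can be rendered from structured data so that no field is forgotten. This is a sketch under assumptions: the class name `BatchReport` and the `##` section headings are illustrative, not a format this workflow mandates.

```python
from dataclasses import dataclass


@dataclass
class BatchReport:
    tasks_completed: list[str]
    files_changed: list[str]
    tests_run: list[str]
    result: str
    next_proposed_batch: str

    def to_markdown(self) -> str:
        """Render the five required sections in a fixed order."""
        def bullets(items: list[str]) -> str:
            # An empty list is rendered explicitly instead of being omitted.
            return "\n".join(f"- {item}" for item in items) or "- none"

        sections = [
            ("Tasks completed", bullets(self.tasks_completed)),
            ("Files changed", bullets(self.files_changed)),
            ("Tests run", bullets(self.tests_run)),
            ("Result", self.result),
            ("Next proposed batch", self.next_proposed_batch),
        ]
        return "\n\n".join(f"## {title}\n\n{body}" for title, body in sections) + "\n"
```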
Owner: Codex
Step type: ai_routing
Review must cover:
- bugs
- regressions
- missing tests
- contract violations
- architectural drift
- incorrect assumptions
Batch gate decisions:
- proceed
- fix_before_proceeding
- stop_and_rethink
Owner: Codex, with human escalation when needed
Purpose:
- prevent downstream work before upstream contracts are verified
Typical phase gates:
- backend before UI
- infrastructure before workflow logic
- data model before feature layer
- refactor before product polish
Owner: Claude Code
Rule:
- after the second Codex review in the simplified flow, Claude Code performs one final revision
- that output is treated as the final version for v1 unless human escalation is required
Owners: human and Codex
Step type: script
Required completion work:
- final test verification
- final build verification when applicable
- branch hygiene
- merge readiness check
- documented verification evidence
Delivery is not ready until integration is clean and the verification artifacts exist.
Owner: Codex
Step type: ai_routing
Required completion work:
- release notes or delivery summary
- demo assets or screenshots when applicable
- explicit delivery package contents
Owner: human
Step type: human_approval_gate
Required outputs:
- approval or rejection of the delivery package
- explicit release decision
Owners: human and Codex
Required reflection topics:
- architectural debt
- deferred cleanup
- legacy removal
- product polish
- performance improvement opportunities
- next spec or plan candidate
Owner: Codex
Step type: ai_routing
Required outputs:
- backlog updates
- debt ledger updates
- deferred opportunities captured in durable form
For small tasks, the default bounded loop is:
- Codex writes PRD
- Codex reviews PRD against reality
- Codex writes implementation plan
- Codex reviews implementation plan
- Codex writes execution prompt
- Claude Code implements v1
- Codex review round 1
- Claude Code revision 1
- Codex review round 2
- Claude Code final revision
No third Codex review is required in the default small-task v1 flow.
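The bounded loop above can be generated from the review limit, which makes the "no third review" rule explicit. The function name `small_task_loop` and the tuple layout are assumptions for illustration.

```python
def small_task_loop(review_limit: int = 2) -> list[tuple[str, str]]:
    """Return (actor, action) pairs for the small-task flow."""
    steps = [
        ("codex", "write PRD"),
        ("codex", "review PRD against reality"),
        ("codex", "write implementation plan"),
        ("codex", "review implementation plan"),
        ("codex", "write execution prompt"),
        ("claude", "implement v1"),
    ]
    for round_no in range(1, review_limit + 1):
        steps.append(("codex", f"review round {round_no}"))
        # The revision that follows the last allowed review is the final one.
        is_last = round_no == review_limit
        steps.append(("claude", "final revision" if is_last else f"revision {round_no}"))
    return steps
```

With the default limit of two, the sequence ends with review round 2 followed by the final revision, matching the list above.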
For medium and large tasks:
- implementation must proceed in batches
- each batch must be reviewed before the next one
- major phase boundaries must be explicitly gated
Do not collapse a medium or large task into a single uninterrupted implementation run unless the human explicitly approves that deviation.
Each live task must have a task directory.
Recommended shape:
tasks/TASK-YYYY-MM-DD-short-name/
  status.md
  decision-log.md
  handoffs/
    00-intake.md
    05-task-classification.yaml
    08-scope-estimate.md
    09-product-research.md
    09-reference-evidence.md
    09-research-approval.md
    10-prd.md
    15-prd-reality-review.md
    20-user-flow.md
    21-user-flow.yaml
    22-prototype-brief.md
    25-human-approval.md
    30-implementation-plan.md
    32-execution-workflow.yaml
    35-plan-review.md
    40-execution-prompt.md
    50-claude-batch-r1.md
    60-codex-review-r1.md
    85-phase-gate.md
    90-claude-final.md
    95-integration-checklist.md
    96-release-package.md
    97-delivery-approval.md
    99-next-cycle.md
    100-backlog-and-debt.md
  system/
    state.json
    run-log.jsonl
    lock
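Creating this layout is a deterministic step, so it suits a script. The sketch below scaffolds an empty task directory. The function name `scaffold_task` and the initial state values are assumptions; the state keys are the ones this document lists for system/state.json. Handoff files and the lock file are left for later stages to create.

```python
import json
from datetime import date
from pathlib import Path

# Keys as listed for system/state.json; the initial values are assumptions.
INITIAL_STATE = {
    "status": "active",
    "stage": 0,
    "round": 0,
    "current_actor": "codex",
    "last_artifact": None,
    "stop_reason": None,
}


def scaffold_task(root: Path, short_name: str, day: date) -> Path:
    """Create tasks/TASK-YYYY-MM-DD-short-name/ with empty core files."""
    task_dir = root / "tasks" / f"TASK-{day.isoformat()}-{short_name}"
    # exist_ok=False: refuse to reuse an existing task directory.
    task_dir.mkdir(parents=True, exist_ok=False)
    (task_dir / "handoffs").mkdir()
    (task_dir / "system").mkdir()
    (task_dir / "status.md").touch()
    (task_dir / "decision-log.md").touch()
    (task_dir / "system" / "run-log.jsonl").touch()
    (task_dir / "system" / "state.json").write_text(
        json.dumps(INITIAL_STATE, indent=2) + "\n", encoding="utf-8"
    )
    return task_dir
```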
Purpose of status.md:
- current summary only
Must include:
- current stage
- current owner
- current round or batch
- latest conclusion
- blockers
- next step
Purpose of decision-log.md:
- record durable decisions only
Typical entries:
- scope approved
- user flow revised
- review gate blocked
- final revision accepted as delivery artifact
Purpose of handoffs/:
- primary collaboration artifacts
Rules:
- every handoff is append-only
- do not overwrite old rounds
- each handoff should be self-contained enough for the next actor
Purpose of system/state.json:
- machine state only
Expected keys:
- status
- stage
- round
- current_actor
- last_artifact
- stop_reason
Every handoff document should begin with machine-friendly frontmatter.
Example:
---
task_id: TASK-2026-03-15-example
author: codex
role: reviewer
round: 1
inputs:
- handoffs/50-claude-batch-r1.md
status: completed
next_actor: claude
---

Required body sections vary by artifact, but should stay explicit and stable.
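To show what "machine-friendly" means in practice, here is a minimal frontmatter reader that uses only the standard library. It is a sketch: it handles the flat `key: value` pairs and simple `- item` lists seen in the example, keeps every value as a string, and is not a general YAML parser. The function name `parse_frontmatter` is hypothetical.

```python
def parse_frontmatter(text: str) -> dict:
    """Parse a leading '---' block of flat keys and simple lists."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("document does not start with a frontmatter fence")
    try:
        end = next(i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---")
    except StopIteration:
        raise ValueError("frontmatter fence is not closed") from None

    meta = {}
    current_list = None  # list being filled by '- item' lines, if any
    for line in lines[1:end]:
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith("- "):
            if current_list is None:
                raise ValueError(f"list item without a key: {line!r}")
            current_list.append(stripped[2:].strip())
            continue
        key, sep, value = stripped.partition(":")
        if not sep:
            raise ValueError(f"expected 'key: value', got {line!r}")
        value = value.strip()
        if value:
            meta[key.strip()] = value
            current_list = None
        else:
            # A bare 'key:' opens a list that the following '- ' lines fill.
            current_list = []
            meta[key.strip()] = current_list
    return meta
```

A real YAML library would type `round: 1` as an integer; this reader returns the string "1", so callers must convert where the type matters.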
Every executable step should be expressible in this shape:
id: prd_draft
name: Draft PRD
actor: codex
step_type: ai_routing
inputs:
  - handoffs/00-intake.md
outputs:
  - handoffs/10-prd.md
validation:
  type: schema
  schema: prd_v1
failure:
  on_validation_error: retry_once
  on_semantic_gap: escalate_to_human
next:
  - prd_reality_review

Required step fields:
- id
- name
- actor
- step_type
- inputs
- outputs
- validation
- failure
- next
Allowed step_type values:
- script
- ai_routing
- human_approval_gate
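The required fields and allowed step_type values lend themselves to a mechanical check. A minimal sketch, assuming the step has already been loaded into a Python dict; the function name `step_errors` is hypothetical.

```python
REQUIRED_STEP_FIELDS = (
    "id", "name", "actor", "step_type",
    "inputs", "outputs", "validation", "failure", "next",
)
ALLOWED_STEP_TYPES = {"script", "ai_routing", "human_approval_gate"}


def step_errors(step: dict) -> list[str]:
    """Return a list of problems; an empty list means the step is well-formed."""
    errors = [f"missing field: {name}" for name in REQUIRED_STEP_FIELDS if name not in step]
    step_type = step.get("step_type")
    if step_type is not None and step_type not in ALLOWED_STEP_TYPES:
        errors.append(f"invalid step_type: {step_type}")
    return errors
```

Returning a list instead of raising on the first problem lets a validation script report every gap in one pass.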
The PRD stage must produce:
- user-flow.md
- user-flow.yaml
user-flow.md is the human-readable product flow.
It should answer:
- who the user is
- where the user enters
- what the user sees
- what the user does
- how the system responds
- what successful completion looks like
user-flow.yaml is the structured representation used for validation and later planning.
Recommended shape:
task_id: TASK-2026-03-15-example
flow_name: example-flow
steps:
  - id: entry
    name: User enters workflow
    actor: user
    step_type: script
    goal: Capture initial input
    inputs:
      - raw_request
    outputs:
      - normalized_request
    validation:
      type: schema
      schema: normalized_request_v1
    failure:
      on_validation_error: stop_and_revise
    next:
      - classify_task

Implementation planning should also produce execution-workflow.yaml.
Purpose:
- describe how scripts, LLM steps, and human gates connect during execution
This file is separate from user-flow.yaml.
Why:
- user-flow.yaml expresses product and user logic
- execution-workflow.yaml expresses delivery logic
Do not collapse them into one file.
A human gate should produce a short approval artifact.
Example:
---
task_id: TASK-2026-03-15-example
author: codex
role: planner
gate: prd_user_flow
status: pending
next_actor: human
---

Suggested body sections:
- Decision
- Notes
- Constraints
After the human decides, update the same file to reflect the final approval state and human ownership.
Required human gates:
- PRD approval
- user-flow approval
Required escalation gates:
- touching secrets or private keys
- destructive operations
- unexpected scope growth
- unresolved contradiction after bounded retries
Any terminal agent using this workflow must obey these rules:
- Read Development Workflow.md before acting.
- Read all prerequisite task documents before acting.
- Advance at most one canonical stage per workflow cycle.
- When entering a stage, write state first, then write artifacts, then log completion or waiting.
- At human gates, stop instead of speculating past approval.
- Do not guess missing context.
- If required inputs are missing, stop and record the blocker.
- Prefer scripts over LLMs.
- Produce a new handoff for every meaningful transition.
- Update status.md and system/state.json after each stage.
- Do not skip validation.
- Do not silently expand scope.
- Stop on human-gate conditions and wait for approval.
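Two of the rules above, "advance at most one canonical stage per workflow cycle" and "write state first, then write artifacts, then log", can be enforced in code. The sketch below is one possible reading, not the behavior of the repository's scripts: the function name `run_stage`, the `running` status value, and the log record shape are all assumptions.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def run_stage(task_dir: Path, stage: int, actor: str, artifact: str, body: str) -> None:
    """Enter a stage, write its artifact, and log completion, in that order."""
    state_path = task_dir / "system" / "state.json"
    log_path = task_dir / "system" / "run-log.jsonl"
    state = json.loads(state_path.read_text(encoding="utf-8"))

    # Advance at most one canonical stage per workflow cycle.
    if stage not in (state["stage"], state["stage"] + 1):
        raise ValueError(f"cannot move from stage {state['stage']} to {stage}")

    # 1. State first.
    state.update(stage=stage, current_actor=actor, status="running")
    state_path.write_text(json.dumps(state, indent=2), encoding="utf-8")

    # 2. Then the artifact. Mode "x" fails if the file already exists,
    #    so an old handoff is never overwritten.
    with open(task_dir / artifact, "x", encoding="utf-8") as handle:
        handle.write(body)

    # 3. Then record the artifact in state and append a completion log entry.
    state["last_artifact"] = artifact
    state_path.write_text(json.dumps(state, indent=2), encoding="utf-8")
    record = {
        "time": datetime.now(timezone.utc).isoformat(),
        "stage": stage,
        "actor": actor,
        "event": "completed",
        "artifact": artifact,
    }
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")
```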
Every LLM step must have:
- a prompt contract
- an input contract
- an output contract
- a validation contract
- a failure contract
Minimum validation expectations:
- required file exists
- frontmatter parses
- required sections exist
- YAML parses
- enum fields use allowed values
- review outputs contain explicit decisions
No LLM step should emit freeform text alone when the output must drive later execution.
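Several of the minimum validation expectations are cheap to automate. The sketch below covers three of them: the file exists, a frontmatter block is present and closed, and the required sections exist. It assumes body sections are written as Markdown headings such as `## Decision`, which this document does not specify. The function name `handoff_problems` is hypothetical.

```python
from pathlib import Path


def handoff_problems(path: Path, required_sections: list[str]) -> list[str]:
    """Return validation problems for one handoff file (empty list = passes)."""
    if not path.is_file():
        return [f"missing file: {path}"]
    lines = path.read_text(encoding="utf-8").splitlines()
    problems = []

    # Frontmatter check: the first line opens the block, a later '---' closes it.
    closed = any(line.strip() == "---" for line in lines[1:])
    if not lines or lines[0].strip() != "---" or not closed:
        problems.append("frontmatter block missing or unclosed")

    # Assumption: sections are Markdown headings of any level.
    headings = {line.lstrip("#").strip() for line in lines if line.startswith("#")}
    problems += [f"missing section: {name}" for name in required_sections if name not in headings]
    return problems
```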
A review artifact must explicitly state:
- findings
- severity or priority if relevant
- required changes
- optional improvements
- gate decision
Allowed gate decisions:
- proceed
- fix_before_proceeding
- stop_and_rethink
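A review that drives later execution needs exactly one machine-readable decision. The sketch below extracts it, assuming the review artifact carries a line of the form `gate_decision: <value>`. That key name is an assumption, since this document fixes only the allowed values.

```python
import re

GATE_DECISIONS = ("proceed", "fix_before_proceeding", "stop_and_rethink")


def extract_gate_decision(review_text: str) -> str:
    """Return the single gate decision in a review, or raise ValueError."""
    matches = re.findall(r"^gate_decision:\s*(\S+)\s*$", review_text, flags=re.MULTILINE)
    if len(matches) != 1:
        raise ValueError(f"expected exactly one gate_decision line, found {len(matches)}")
    decision = matches[0]
    if decision not in GATE_DECISIONS:
        raise ValueError(f"gate decision not allowed: {decision!r}")
    return decision
```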
A Claude Code revision artifact must explicitly state:
- what changed
- what files changed
- what was intentionally left unchanged
- what tests or verification steps ran
- any blocker still present
Before closing a task, confirm:
- required tests have run
- required build checks have run when relevant
- repo state is understandable
- merge readiness is documented
- temporary work artifacts are cleaned up or intentionally preserved
- follow-up items are written down
Stop the workflow when any of these is true:
- approved PRD is missing
- approved user flow is missing
- required inputs are missing
- a high-risk action requires human approval
- a validation failure cannot be resolved within bounded retry
- repository verification is blocked
- the configured review-round limit is reached
- a phase gate fails
For the simplified small-task flow, the default review limit is two Codex review rounds followed by one Claude Code final revision.
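Stop conditions are easiest to audit when they are evaluated in one place. The sketch below covers four of the listed conditions and is meant to be called before implementation planning or any later stage begins. The class name `TaskSnapshot`, its fields, and the reason strings are hypothetical; the remaining conditions would be added in the same pattern.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TaskSnapshot:
    prd_approved: bool = False
    user_flow_approved: bool = False
    missing_inputs: list = field(default_factory=list)
    phase_gate_failed: bool = False


def stop_reason(snapshot: TaskSnapshot) -> Optional[str]:
    """Return the first stop condition that holds, or None to continue."""
    checks = [
        (not snapshot.prd_approved, "approved_prd_missing"),
        (not snapshot.user_flow_approved, "approved_user_flow_missing"),
        (bool(snapshot.missing_inputs), "required_inputs_missing"),
        (snapshot.phase_gate_failed, "phase_gate_failed"),
    ]
    for triggered, reason in checks:
        if triggered:
            return reason
    return None
```

The returned string could be stored under stop_reason in system/state.json so that the reason for a halt stays visible.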
This workflow reduces common failure modes:
- coding before requirements are stable
- planning before the baseline is understood
- reviewing only at the very end
- losing state in chat-only interaction
- letting agents improvise architecture or execution protocol
It creates a controlled loop:
- define
- verify
- plan
- execute
- review
- gate
- integrate
- repeat
Current default mapping:
- Codex: planner, PRD writer, plan writer, reviewer, process controller
- Claude Code: implementation executor and revision executor
- Gemini: unused in the default v1 flow
Canonical filenames for task execution:
- status.md
- decision-log.md
- handoffs/10-prd.md
- handoffs/15-prd-reality-review.md
- handoffs/20-user-flow.md
- handoffs/21-user-flow.yaml
- handoffs/25-human-approval.md
- handoffs/30-implementation-plan.md
- handoffs/32-execution-workflow.yaml
- handoffs/35-plan-review.md
- handoffs/40-execution-prompt.md
- handoffs/95-integration-checklist.md
- handoffs/99-next-cycle.md
- system/state.json
The shortest reusable form of this workflow is:
- human states intent
- Codex classifies the task and estimates size
- Codex writes the PRD
- Codex reviews the PRD against reality
- Codex writes the user flow
- human approves PRD and user flow
- Codex writes the implementation plan
- Codex reviews the plan
- Codex writes the execution prompt
- Claude Code implements in batches
- Codex reviews each batch and gates phase transitions
- Claude Code performs final revision
- human and Codex integrate, clean up, and define the next cycle