Development Workflow

Purpose

This document is the canonical instruction set for document-driven software delivery using raw terminal sessions.

For the higher-level end-to-end product lifecycle, read docs/build-anything-workflow.md. The machine-readable source of truth lives in docs/canonical-workflow.json, which currently defines 5 phases and 22 stages. This prose file is the human-readable explanation of that execution model.

It exists to minimize drift, ambiguity, and low-quality execution by separating:

planning from implementation
review from coding
durable documents from machine state

This workflow is intentionally tool-light:

terminals only
documents only
git repos only
skills when useful

No external orchestrator is required.

Operating Constraints

Current role mapping:

human: owns goals, priorities, tradeoffs, approvals, and final decisions
Codex: planner, spec writer, reviewer, and process controller
Claude Code: coder and revision executor
Gemini: not used in the default v1 flow

Current environment assumptions:

work happens from raw terminal sessions
all collaboration is recorded in documents
machine lifecycle state may live outside markdown in system/state.json
task execution may use scripts and skills, but documents remain the collaboration surface

Out of scope for this workflow:

Conductor
Agent of Empires
AO / Composio Agent Orchestrator
Paperclip
automatic PR comment sync
CI-driven autonomous retries

Core Philosophy

Script First

Use scripts whenever the task can be deterministic.

Why:

deterministic behavior
no token cost
easier verification

Typical script steps:

create folders
generate diffs
run tests
run lint
validate YAML or JSON
update system/state.json
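
As one illustration of a deterministic script step, the sketch below validates that a file parses as JSON and reports the result without raising. The function name `validate_json_file` is hypothetical and is not part of this repository; it only uses the Python standard library.

```python
import json
from pathlib import Path
from typing import Tuple


def validate_json_file(path: Path) -> Tuple[bool, str]:
    """Return (ok, message) for a JSON file instead of raising on bad input."""
    try:
        json.loads(path.read_text(encoding="utf-8"))
    except FileNotFoundError:
        return False, f"missing file: {path}"
    except json.JSONDecodeError as exc:
        # lineno and colno point at the first character the parser rejected.
        return False, f"invalid JSON in {path}: line {exc.lineno}, column {exc.colno}"
    return True, f"ok: {path}"
```

Because the check is deterministic, the same input always yields the same verdict, which is the property the Script First rule is after.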

LLM When Judgment Is Required

Use an LLM step only when the task requires interpretation, synthesis, design, review, or writing that cannot be made purely deterministic.

Typical LLM steps:

task classification
scope estimation
PRD drafting
user-flow drafting
implementation-plan drafting
code review
revision planning

Human Approval Gate for High Risk

Require explicit human approval when:

PRD and user flow need product sign-off
scope expands beyond the approved requirement
the workflow touches secrets, private keys, or production credentials
the workflow attempts destructive actions
the workflow fails to converge after bounded retries

Documents Are the Collaboration Surface

Documents hold:

intent
specifications
plans
handoffs
reviews
approvals
final conclusions

Machine state files hold:

lifecycle counters
current stage
lock state
stop reason

Append-Only Handoffs

Do not rewrite history.

Every agent-to-agent or human-to-agent handoff creates a new document. Old handoffs remain read-only.

Two Workflow Views

This repository now has two valid workflow views:

Build Anything workflow: the full product lifecycle (Research -> Design -> Development -> Packaging -> Maintenance)
Canonical workflow: the currently enforced 22-stage execution model used by scripts and the dashboard

The Build Anything workflow is the strategic model. The canonical workflow is the operational model.

Build Anything Macro Lifecycle

Recommended left-to-right lifecycle:

flowchart LR
    A[Research] --> B[Design]
    B --> C[Development]
    C --> D[Packaging]
    D --> E[Maintenance]

Phase Outputs

Phase	Required result
Research	A concrete anchor on real products, with screenshot evidence and a short recommendation brief
Design	Explicit documents that define product intent, user flow, approval state, and execution plan
Development	Working code plus bounded execution reports, review reports, and final revision evidence
Packaging	Clean integration, final verification, merge readiness, and delivery-facing artifacts
Maintenance	Captured debt, learnings, follow-ups, and the next-cycle candidate

Mapping to the Current Canonical Workflow

Build Anything phase	Canonical workflow mapping
Research	Stages 0-4
Design	Stages 5-12
Development	Stages 13-16
Packaging	Stages 17-19
Maintenance	Stages 20-21

Important:

Research is now a first-class canonical phase with anchor research and evidence collection stages
Packaging and Maintenance are now modeled as distinct canonical phases rather than being collapsed into cleanup
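
The mapping table above can be expressed as a small lookup. This is a sketch only: the names `PHASE_STAGE_RANGES` and `phase_for_stage` are hypothetical, and the authoritative definition remains docs/canonical-workflow.json.

```python
# Inclusive stage ranges copied from the mapping table in this document.
PHASE_STAGE_RANGES = {
    "Research": (0, 4),
    "Design": (5, 12),
    "Development": (13, 16),
    "Packaging": (17, 19),
    "Maintenance": (20, 21),
}


def phase_for_stage(stage: int) -> str:
    """Return the Build Anything phase that contains a canonical stage number."""
    for phase, (first, last) in PHASE_STAGE_RANGES.items():
        if first <= stage <= last:
            return phase
    raise ValueError(f"stage {stage} is outside the canonical range 0-21")
```

The ranges cover 5 + 8 + 4 + 3 + 2 = 22 stages, matching the 5 phases and 22 stages stated in the Purpose section.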

Task Lifecycle

The canonical lifecycle is:

Clarify objective
Classify task and estimate size
Run product research
Collect reference evidence
Research approval gate
Draft PRD
Review PRD against the real codebase or environment
Draft user flow
Draft prototype brief
Design approval gate
Draft implementation plan
Review implementation plan
Write execution prompt for Claude Code
Claude Code executes in batches
Codex reviews each batch
Gate each major phase
Claude Code performs final revision
Integrate and verify
Prepare release package
Delivery approval gate
Capture the next cycle
Update backlog and debt

Default delivery model by task size:

small: may use a simplified two-review flow
medium: should use batch execution and explicit phase gates
large: must use batch execution, phase gates, and more detailed PRD and plan artifacts

Task Classification and Sizing

Every task must be classified before PRD drafting.

Task Type

Use exactly one primary type:

feature
new_project
bug_fix
optimization

Task Size

Use exactly one size:

small
medium
large

Size Rules

small: lean PRD, lean plan, simplified review loop allowed
medium: standard PRD, standard plan, batch execution recommended
large: full PRD, explicit user flow, explicit phase gates, stronger review discipline
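
A classification record could enforce the "exactly one type, exactly one size" rule at construction time. The sketch below assumes hypothetical field names (`task_type`, `task_size`, `rationale`); the real schema of handoffs/05-task-classification.yaml is not specified in this document.

```python
from dataclasses import dataclass

# Allowed values as listed under Task Type and Task Size.
TASK_TYPES = frozenset({"feature", "new_project", "bug_fix", "optimization"})
TASK_SIZES = frozenset({"small", "medium", "large"})


@dataclass(frozen=True)
class TaskClassification:
    task_type: str
    task_size: str
    rationale: str  # hypothetical field holding the rationale for size

    def __post_init__(self) -> None:
        if self.task_type not in TASK_TYPES:
            raise ValueError(f"unknown task type: {self.task_type!r}")
        if self.task_size not in TASK_SIZES:
            raise ValueError(f"unknown task size: {self.task_size!r}")
        if not self.rationale.strip():
            raise ValueError("a rationale for the size is required")
```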

Required Skills by Stage

Use these skills when available:

task estimation: agent estimation skill, if available
PRD drafting: Ralph PRD skill
user-flow design: agent-canvas, if available
implementation plan: superpowers:writing-plans

Guidance:

estimation output belongs to scoping, not inside the PRD core body
user flow belongs to the PRD stage and must be approved before implementation planning
implementation planning starts only after approved PRD and approved user flow exist
if a recommended skill is available for the current stage, the agent should use it or explicitly explain why it is being skipped

Stage Contracts

Stage 0: Clarify Objective

Owner: Codex

Step type: ai_routing

Goal:

identify the real unit of work before implementation starts

Required outputs:

clear objective
clear success condition
clear scope boundary

Runtime rules:

this stage is a distinct interaction stage and must not be silently collapsed into later drafting
if objective, success condition, or scope boundary are still ambiguous, Codex must ask the human focused clarification questions and stop
do not proceed to task classification until the clarification answers are either explicit in the human input or captured through a clarification turn
stage 0 completes only after the intake artifact reflects clarified human intent rather than self-inferred assumptions

Stage 1: Classify Task and Estimate Size

Owner: Codex

Step type: ai_routing

Required outputs:

task type
task size
estimate summary
rationale for size

Stage 2: Run Product Research

Owner: Codex

Step type: ai_routing

Required outputs:

primary anchor
secondary anchor
similarity note
recommendation brief

Stage 3: Collect Reference Evidence

Owner: Codex

Step type: ai_routing

Required outputs:

key page screenshots
key flow screenshots
source links

Stage 4: Research Approval Gate

Owner: human

Step type: human_approval_gate

Approval is required for:

primary and secondary anchors
evidence set quality
the recommendation direction that will feed design

Stage 5: Draft PRD

Owner: Codex

Step type: ai_routing

Required PRD contents:

purpose
scope
non-goals
contracts
expected behavior
acceptance criteria
constraints
terminology

Stage 6: Review PRD Against Reality

Owner: Codex

Step type: ai_routing plus script-assisted repo inspection

Purpose:

correct mismatches between the drafted PRD and the real repo or environment

Required outputs:

corrected PRD
contradiction list resolved
explicit baseline assumptions

Stage 7: Draft User Flow

Owner: Codex

Step type: ai_routing

This step must produce both:

a human-readable user flow
a structured YAML version

Structured YAML requirements:

every step must define inputs
every step must define outputs
every step must define validation
every step must define failure
every step must define next

Stage 8: Draft Prototype Brief

Owner: Codex

Step type: ai_routing

This step must produce:

core screens to emulate or prototype
key interactions to preserve
visual direction anchored to the approved research evidence

Stage 9: Design Approval Gate

Owner: human

Step type: human_approval_gate

Approval is required for:

PRD
user flow
prototype brief

Without approval, the workflow must not proceed to implementation planning.

Artifact rules:

handoffs/25-human-approval.md may begin as a pending approval request drafted by Codex
the human decision must overwrite the file with the final approval or revision outcome before implementation planning proceeds
once the gate is pending, machine state should move to status: waiting

Stage 10: Draft Implementation Plan

Owner: Codex

Step type: ai_routing

The plan must translate the approved PRD and approved user flow into:

phases
batches
task order
likely file touchpoints
verification steps
stop conditions

Stage 11: Review Implementation Plan

Owner: Codex

Step type: ai_routing

Purpose:

harden the plan until another agent can execute it with minimal interpretation drift

Required outputs:

reviewed plan
clarified execution order
clarified verification commands
clarified dependency boundaries

Stage 12: Write Execution Prompt

Owner: Codex

Step type: ai_routing

This prompt is a formal handoff, not an informal chat message.

It must define:

repository path
PRD path
implementation plan path
source-of-truth rules
execution order
stop conditions
logging requirements
report format
forbidden behaviors

Stage 13: Claude Code Executes in Batches

Owner: Claude Code

Step type: ai_routing plus scripts

Rules:

do not implement the whole task in one unbounded run
work in small batches
create durable execution reports

Each batch report must contain:

tasks completed
files changed
tests run
result
next proposed batch

Stage 14: Codex Reviews Each Batch

Owner: Codex

Step type: ai_routing

Review must cover:

bugs
regressions
missing tests
contract violations
architectural drift
incorrect assumptions

Batch gate decisions:

proceed
fix_before_proceeding
stop_and_rethink

Stage 15: Gate Each Major Phase

Owner: Codex, with human escalation when needed

Purpose:

prevent downstream work before upstream contracts are verified

Typical phase gates:

backend before UI
infrastructure before workflow logic
data model before feature layer
refactor before product polish

Stage 16: Final Revision

Owner: Claude Code

Rule:

after the second Codex review in the simplified flow, Claude Code performs one final revision
that output is treated as the final version for v1 unless human escalation is required

Stage 17: Integrate and Verify

Owners: human and Codex

Step type: script

Required completion work:

final test verification
final build verification when applicable
branch hygiene
merge readiness check
documented verification evidence

Delivery is not ready until integration is clean and the verification artifacts exist.

Stage 18: Prepare Release Package

Owner: Codex

Step type: ai_routing

Required completion work:

release notes or delivery summary
demo assets or screenshots when applicable
explicit delivery package contents

Stage 19: Delivery Approval Gate

Owner: human

Step type: human_approval_gate

Required outputs:

approval or rejection of the delivery package
explicit release decision

Stage 20: Capture the Next Cycle

Owners: human and Codex

Required reflection topics:

architectural debt
deferred cleanup
legacy removal
product polish
performance improvement opportunities
next spec or plan candidate

Stage 21: Update Backlog and Debt

Owner: Codex

Step type: ai_routing

Required outputs:

backlog updates
debt ledger updates
deferred opportunities captured in durable form

Simplified Small-Task Review Loop

For small tasks, the default bounded loop is:

Codex writes PRD
Codex reviews PRD against reality
Codex writes implementation plan
Codex reviews implementation plan
Codex writes execution prompt
Claude Code implements v1
Codex review round 1
Claude Code revision 1
Codex review round 2
Claude Code final revision

No third Codex review is required in the default small-task v1 flow.

Medium and Large Task Rule

For medium and large tasks:

implementation must proceed in batches
each batch must be reviewed before the next one
major phase boundaries must be explicitly gated

Do not collapse a medium or large task into a single uninterrupted implementation run unless the human explicitly approves that deviation.

Document Contracts

Each live task must have a task directory.

Recommended shape:

tasks/TASK-YYYY-MM-DD-short-name/
  status.md
  decision-log.md
  handoffs/
    00-intake.md
    05-task-classification.yaml
    08-scope-estimate.md
    09-product-research.md
    09-reference-evidence.md
    09-research-approval.md
    10-prd.md
    15-prd-reality-review.md
    20-user-flow.md
    21-user-flow.yaml
    22-prototype-brief.md
    25-human-approval.md
    30-implementation-plan.md
    32-execution-workflow.yaml
    35-plan-review.md
    40-execution-prompt.md
    50-claude-batch-r1.md
    60-codex-review-r1.md
    85-phase-gate.md
    90-claude-final.md
    95-integration-checklist.md
    96-release-package.md
    97-delivery-approval.md
    99-next-cycle.md
    100-backlog-and-debt.md
  system/
    state.json
    run-log.jsonl
    lock
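
Creating folders is listed as a typical script step, so the skeleton of this layout can be scaffolded deterministically. The following is a minimal sketch; `scaffold_task_dir` is a hypothetical helper, and it deliberately creates only the directories and the two top-level files, leaving handoffs to be written stage by stage.

```python
from pathlib import Path


def scaffold_task_dir(root: Path, task_id: str) -> Path:
    """Create tasks/<task_id>/ with handoffs/ and system/; never clobber existing files."""
    task_dir = root / "tasks" / task_id
    (task_dir / "handoffs").mkdir(parents=True, exist_ok=True)
    (task_dir / "system").mkdir(parents=True, exist_ok=True)
    for name in ("status.md", "decision-log.md"):
        path = task_dir / name
        if not path.exists():
            path.touch()  # empty placeholder; content is written by later stages
    return task_dir
```

Running it twice is safe: existing directories and files are left untouched.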

status.md

Purpose:

current summary only

Must include:

current stage
current owner
current round or batch
latest conclusion
blockers
next step

decision-log.md

Purpose:

record durable decisions only

Typical entries:

scope approved
user flow revised
review gate blocked
final revision accepted as delivery artifact

handoffs

Purpose:

primary collaboration artifacts

Rules:

every handoff is append-only
do not overwrite old rounds
each handoff should be self-contained enough for the next actor
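
One way to make the append-only rule mechanical is to create handoff files in exclusive mode, so an accidental overwrite fails loudly. This is a sketch under assumptions: `write_handoff` and `next_round_name` are hypothetical helpers, and the `-rN` suffix follows the round naming shown in the task directory above.

```python
from pathlib import Path


def write_handoff(handoffs_dir: Path, filename: str, body: str) -> Path:
    """Create a new handoff file; raise FileExistsError rather than overwrite."""
    handoffs_dir.mkdir(parents=True, exist_ok=True)
    target = handoffs_dir / filename
    # Mode "x" is exclusive creation: open() fails if the file already exists.
    with target.open("x", encoding="utf-8") as handle:
        handle.write(body)
    return target


def next_round_name(handoffs_dir: Path, prefix: str) -> str:
    """Return the next round filename, e.g. '60-codex-review-r2.md' once r1 exists.

    Counts existing rounds, so it assumes rounds are never deleted or skipped.
    """
    existing = list(handoffs_dir.glob(f"{prefix}-r*.md"))
    return f"{prefix}-r{len(existing) + 1}.md"
```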

system/state.json

Purpose:

machine state only

Expected keys:

status
stage
round
current_actor
last_artifact
stop_reason
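
A script that updates system/state.json might check the expected keys before writing and replace the file atomically. The sketch below is illustrative only: the helper names are hypothetical, and this document does not define value types or allowed values for these keys.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import List

# Keys listed under "Expected keys" in this document.
EXPECTED_STATE_KEYS = (
    "status", "stage", "round", "current_actor", "last_artifact", "stop_reason",
)


def missing_state_keys(state: dict) -> List[str]:
    """Return the expected keys that are absent from a state mapping."""
    return [key for key in EXPECTED_STATE_KEYS if key not in state]


def write_state(path: Path, state: dict) -> None:
    """Validate keys, then write via a temp file so readers never see a partial file."""
    missing = missing_state_keys(state)
    if missing:
        raise ValueError("state is missing keys: " + ", ".join(missing))
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(state, handle, indent=2)
    os.replace(tmp_name, path)  # atomic rename within the same directory
```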

Handoff Frontmatter Contract

Every handoff document should begin with machine-friendly frontmatter.

Example:

---
task_id: TASK-2026-03-15-example
author: codex
role: reviewer
round: 1
inputs:
  - handoffs/50-claude-batch-r1.md
status: completed
next_actor: claude
---

Required body sections vary by artifact, but should stay explicit and stable.
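
To show what "machine-friendly" can mean in practice, here is a minimal standard-library sketch that reads the flat subset of frontmatter used in the example above: `key: value` pairs plus simple `- item` lists. It is not a YAML parser, every scalar is returned as a string (so `round` comes back as "1"), and `parse_frontmatter` is a hypothetical name. A real harness would more likely use a full YAML library.

```python
from typing import Dict, List, Optional, Union

FrontmatterValue = Union[str, List[str]]


def parse_frontmatter(text: str) -> Dict[str, FrontmatterValue]:
    """Parse a leading '---' block of 'key: value' lines and '- item' lists."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("document does not start with a frontmatter fence")
    end = next((i for i in range(1, len(lines)) if lines[i].strip() == "---"), None)
    if end is None:
        raise ValueError("frontmatter fence is never closed")

    data: Dict[str, FrontmatterValue] = {}
    current_key: Optional[str] = None
    for raw in lines[1:end]:
        stripped = raw.strip()
        if not stripped:
            continue
        if stripped.startswith("- "):
            # A list item belongs to the most recent key that had an empty value.
            if current_key is None or not isinstance(data[current_key], list):
                raise ValueError(f"list item without a list key: {raw!r}")
            data[current_key].append(stripped[2:].strip())
            continue
        key, sep, value = stripped.partition(":")
        if not sep:
            raise ValueError(f"expected 'key: value', got {raw!r}")
        current_key = key.strip()
        data[current_key] = value.strip() if value.strip() else []
    return data
```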

Structured Workflow Schema

Every executable step should be expressible in this shape:

id: prd_draft
name: Draft PRD
actor: codex
step_type: ai_routing
inputs:
  - handoffs/00-intake.md
outputs:
  - handoffs/10-prd.md
validation:
  type: schema
  schema: prd_v1
failure:
  on_validation_error: retry_once
  on_semantic_gap: escalate_to_human
next:
  - prd_reality_review

Required step fields:

id
name
actor
step_type
inputs
outputs
validation
failure
next

Allowed step_type values:

script
ai_routing
human_approval_gate
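
The required fields and allowed step_type values lend themselves to a deterministic check. The sketch below validates a step that has already been loaded into a Python dict (for example from YAML); `validate_step` is a hypothetical name, and it checks only presence and the step_type enum, not the shape of nested fields.

```python
from typing import List

# Field names and enum values as listed in this section.
REQUIRED_STEP_FIELDS = (
    "id", "name", "actor", "step_type", "inputs",
    "outputs", "validation", "failure", "next",
)
ALLOWED_STEP_TYPES = frozenset({"script", "ai_routing", "human_approval_gate"})


def validate_step(step: dict) -> List[str]:
    """Return a list of human-readable problems; an empty list means the step passes."""
    errors = [f"missing field: {field}" for field in REQUIRED_STEP_FIELDS if field not in step]
    step_type = step.get("step_type")
    if step_type is not None and step_type not in ALLOWED_STEP_TYPES:
        errors.append(f"step_type {step_type!r} is not allowed")
    return errors
```

Returning a list rather than raising on the first problem lets a single run report every gap in a step.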

User Flow Outputs

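Stage 7 (Draft User Flow), which this workflow treats as part of PRD work, must produce two files, stored in the task directory as handoffs/20-user-flow.md and handoffs/21-user-flow.yaml: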

user-flow.md
user-flow.yaml

user-flow.md

This is the human-readable product flow.

It should answer:

who the user is
where the user enters
what the user sees
what the user does
how the system responds
what successful completion looks like

user-flow.yaml

This is the structured representation used for validation and later planning.

Recommended shape:

task_id: TASK-2026-03-15-example
flow_name: example-flow
steps:
  - id: entry
    name: User enters workflow
    actor: user
    step_type: script
    goal: Capture initial input
    inputs:
      - raw_request
    outputs:
      - normalized_request
    validation:
      type: schema
      schema: normalized_request_v1
    failure:
      on_validation_error: stop_and_revise
    next:
      - classify_task

Execution Workflow Output

Implementation planning should also produce execution-workflow.yaml.

Purpose:

describe how scripts, LLM steps, and human gates connect during execution

This file is separate from user-flow.yaml.

Why:

user-flow.yaml expresses product and user logic
execution-workflow.yaml expresses delivery logic

Do not collapse them into one file.

Human Approval Gates

A human gate should produce a short approval artifact.

Example:

---
task_id: TASK-2026-03-15-example
author: codex
role: planner
gate: prd_user_flow
status: pending
next_actor: human
---

Suggested body sections:

Decision
Notes
Constraints

After the human decides, update the same file to reflect the final approval state and human ownership.

Required human gates:

research approval (Stage 4)
design approval covering PRD, user flow, and prototype brief (Stage 9)
delivery approval (Stage 19)

Required escalation gates:

touching secrets or private keys
destructive operations
unexpected scope growth
unresolved contradiction after bounded retries

Terminal Execution Rules

Any terminal agent using this workflow must obey these rules:

Read this document (development-workflow.md) before acting.
Read all prerequisite task documents before acting.
Advance at most one canonical stage per workflow cycle.
When entering a stage, write state first, then write artifacts, then log completion or waiting.
At human gates, stop instead of speculating past approval.
Do not guess missing context.
If required inputs are missing, stop and record the blocker.
Prefer scripts over LLMs.
Produce a new handoff for every meaningful transition.
Update status.md and system/state.json after each stage.
Do not skip validation.
Do not silently expand scope.
Stop on human-gate conditions and wait for approval.
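
The ordering rule "write state first, then write artifacts, then log completion or waiting" can be sketched as a single function. Everything here is an assumption for illustration: the `run_stage` name, the `in_progress` and `completed` status values, and the fields of each run-log.jsonl record are not defined by this document. Only `waiting` is named in the Stage 9 artifact rules.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Callable


def run_stage(task_dir: Path, stage: int, actor: str,
              produce_artifact: Callable[[Path], str], waiting: bool = False) -> dict:
    """Run one stage in the required order: state, then artifact, then log."""
    system_dir = task_dir / "system"
    system_dir.mkdir(parents=True, exist_ok=True)
    state_path = system_dir / "state.json"

    # 1. State first, so an interrupted stage still leaves a visible marker.
    state = json.loads(state_path.read_text(encoding="utf-8")) if state_path.exists() else {}
    state.update({"stage": stage, "current_actor": actor, "status": "in_progress"})
    state_path.write_text(json.dumps(state, indent=2), encoding="utf-8")

    # 2. Artifacts second; the callback writes the handoff and returns its path.
    handoffs_dir = task_dir / "handoffs"
    handoffs_dir.mkdir(parents=True, exist_ok=True)
    artifact = produce_artifact(handoffs_dir)

    # 3. Completion or waiting last, in both state.json and the run log.
    state.update({"status": "waiting" if waiting else "completed", "last_artifact": artifact})
    state_path.write_text(json.dumps(state, indent=2), encoding="utf-8")
    record = {"ts": datetime.now(timezone.utc).isoformat(),
              "stage": stage, "status": state["status"], "artifact": artifact}
    with (system_dir / "run-log.jsonl").open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")
    return state
```

Each call handles exactly one stage, which keeps it in line with the rule to advance at most one canonical stage per workflow cycle.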

LLM Harness Requirements

Every LLM step must have:

a prompt contract
an input contract
an output contract
a validation contract
a failure contract

Minimum validation expectations:

required file exists
frontmatter parses
required sections exist
YAML parses
enum fields use allowed values
review outputs contain explicit decisions

No LLM step should emit freeform text alone when the output must drive later execution.

Review and Revision Protocol

Review Output Requirements

A review artifact must explicitly state:

findings
severity or priority if relevant
required changes
optional improvements
gate decision

Allowed gate decisions:

proceed
fix_before_proceeding
stop_and_rethink
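
Because review outputs must contain an explicit decision from a closed set, the allowed values map naturally onto an enum. This is a minimal sketch; `GateDecision` and `parse_gate_decision` are hypothetical names.

```python
from enum import Enum


class GateDecision(str, Enum):
    """The three gate decisions allowed by this workflow."""
    PROCEED = "proceed"
    FIX_BEFORE_PROCEEDING = "fix_before_proceeding"
    STOP_AND_RETHINK = "stop_and_rethink"


def parse_gate_decision(raw: str) -> GateDecision:
    """Convert a raw string from a review artifact, rejecting anything off-list."""
    try:
        return GateDecision(raw.strip())
    except ValueError:
        allowed = ", ".join(decision.value for decision in GateDecision)
        raise ValueError(f"gate decision {raw!r} is not one of: {allowed}") from None
```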

Revision Output Requirements

A Claude Code revision artifact must explicitly state:

what changed
what files changed
what was intentionally left unchanged
what tests or verification steps ran
any blocker still present

Integration and Cleanup

Before closing a task, confirm:

required tests have run
required build checks have run when relevant
repo state is understandable
merge readiness is documented
temporary work artifacts are cleaned up or intentionally preserved
follow-up items are written down

Stop Conditions and Escalation

Stop the workflow when any of these is true:

approved PRD is missing
approved user flow is missing
required inputs are missing
a high-risk action requires human approval
a validation failure cannot be resolved within bounded retry
repository verification is blocked
the configured review-round limit is reached
a phase gate fails

For the simplified small-task flow, the default review limit is two Codex review rounds followed by one Claude Code final revision.
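
A few of these stop conditions are mechanical enough to check in a script. The sketch below covers only four of them and assumes hypothetical inputs (boolean approval flags, a list of missing input names, and a count of completed review rounds); how those values are derived from the task documents is left open here.

```python
from typing import List, Optional

# Two Codex review rounds is the stated default for the simplified small-task flow.
DEFAULT_REVIEW_ROUND_LIMIT = 2


def stop_reason(prd_approved: bool, user_flow_approved: bool,
                missing_inputs: List[str], review_rounds_done: int,
                review_round_limit: int = DEFAULT_REVIEW_ROUND_LIMIT) -> Optional[str]:
    """Return the first applicable stop reason, or None when work may continue."""
    if not prd_approved:
        return "approved PRD is missing"
    if not user_flow_approved:
        return "approved user flow is missing"
    if missing_inputs:
        return "required inputs are missing: " + ", ".join(missing_inputs)
    if review_rounds_done >= review_round_limit:
        return "the configured review-round limit is reached"
    return None
```

A non-None result is the kind of value that could be recorded under stop_reason in system/state.json.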

Why This Workflow Works

This workflow reduces common failure modes:

coding before requirements are stable
planning before the baseline is understood
reviewing only at the very end
losing state in chat-only interaction
letting agents improvise architecture or execution protocol

It creates a controlled loop:

define
verify
plan
execute
review
gate
integrate
repeat

Appendix A: Current Mapping

Current default mapping:

Codex: planner, PRD writer, plan writer, reviewer, process controller
Claude Code: implementation executor and revision executor
Gemini: unused in the default v1 flow

Appendix B: Standard Filenames

Canonical filenames for task execution:

status.md
decision-log.md
handoffs/10-prd.md
handoffs/15-prd-reality-review.md
handoffs/20-user-flow.md
handoffs/21-user-flow.yaml
handoffs/25-human-approval.md
handoffs/30-implementation-plan.md
handoffs/32-execution-workflow.yaml
handoffs/35-plan-review.md
handoffs/40-execution-prompt.md
handoffs/95-integration-checklist.md
handoffs/99-next-cycle.md
system/state.json

Appendix C: Short Reusable Form

The shortest reusable form of this workflow is:

human states intent
Codex classifies the task and estimates size
Codex writes the PRD
Codex reviews the PRD against reality
Codex writes the user flow
human approves PRD and user flow
Codex writes the implementation plan
Codex reviews the plan
Codex writes the execution prompt
Claude Code implements in batches
Codex reviews each batch and gates phase transitions
Claude Code performs final revision
human and Codex integrate, clean up, and define the next cycle
