Skip to content

v0.8.63: Token-budget governor for Workflows / high-fanout Agent runs #3319

@Hmbown

Description

@Hmbown

Problem

The concurrency cap's real protective value is pacing token spend, not just counting Agents. A wide or recursive Workflow fan-out can burn a large token budget very quickly — in a live test, 20 trivial one-word Agents still consumed ~174k tokens in ~9s. Count caps are a poor proxy for spend, and there is currently no spend ceiling that paces or halts a fan-out when a token budget is exhausted. This is the safety lever that makes raising fan-out width (companion admission/queue issue) sane.

Current state / evidence

  • Sub-agent limits today are count-based (max_concurrent, max_depth) and time-based (api_timeout_secs, heartbeat_timeout_secs) — see the knob list in v0.8.63: Expose editable sub-agent recursion and concurrency controls #3304. None of them governs aggregate token spend across a fan-out.
  • Per-Agent usage is already observable: worker records capture usage provenance (docs/SUBAGENTS.md worker-record schema), so aggregate accounting across a run is feasible without new plumbing at the leaf.

Proposed direction

  • Add an optional aggregate token budget for a parent / Workflow run (config key + per-invocation override).
  • Track aggregate spend across the whole run, including nested Agents (shared pool across root + descendants), mirroring the Workflow budget model (total / spent / remaining).
  • When remaining budget falls below a spawn's threshold, pause or block further spawns with a clear, actionable message rather than silently overspending.
  • Expose spent vs remaining in status surfaces.

Acceptance criteria

  • An optional token budget can be set for a parent/Workflow run (config + per-invocation override).
  • Aggregate spend is tracked across the run, including nested Agents (shared accounting across depth).
  • When remaining budget is below threshold, further spawns are blocked/paused with a clear message instead of overspending.
  • Status surfaces show spent vs remaining.
  • With no budget set, behavior is unchanged (no ceiling).
  • Tests cover budget exhaustion blocking spawns and shared accounting across recursion depth.

Non-goals

Metadata

Metadata

Assignees

No one assigned

    Labels

    documentationImprovements or additions to documentationenhancementNew feature or requestreliabilityReliability, flaky behavior, retries, fallbacks, and robustnesssubagentsSub-agent orchestration, lifecycle, and completion handlingv0.8.63Targeting v0.8.63workflow-runtimeWorkflow IR, executor, control flow, and replay runtime

    Projects

    Status
    Backlog

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions