Fail fast on unsupported features in the async execution lane #332

@dgenio

Summary

Make the async execution lane reject — loudly and early — any flow that uses features it does not support, instead of executing with silently different semantics. Then close the parity gaps over time.

Why this matters

A flow that branches differently (or not at all) depending on which entry point ran it undermines the project's core determinism promise. Users who switch from execute_flow to execute_flow_async should get either identical behavior or an immediate, descriptive error — never a silent semantic change.

Current evidence

  • _assert_async_lane_supported() (executor.py ~lines 1502–1534) rejects some unsupported constructs (e.g., composed flow_name steps). The async DAG path (~lines 1655–1845), however, contains no branch-selection logic: a DAGFlow whose steps declare branches / default_next runs without conditional routing in the async lane, while the sync lane (~lines 4065+) applies it.
  • Decision callbacks and step cache/checkpoint resume are also documented as sync-only (AGENTS.md, key entry points), but how consistently the async lane enforces that varies by feature.
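
To make the divergence concrete, here is a toy model of the two behaviors. It is not ChainWeaver code: the STEPS dict, run_with_routing, and run_without_routing are hypothetical stand-ins, and the assumption that a lane without branch selection simply runs every step in declaration order is an illustration of "no conditional routing", not a verified description of the async DAG path.

```python
from typing import Dict, List, Optional

# Toy flow (hypothetical shape): "check" should route to exactly one of
# "approve" or "reject" depending on the context.
STEPS: Dict[str, dict] = {
    "check": {
        "branches": [(lambda ctx: ctx["score"] >= 50, "approve")],
        "default_next": "reject",
    },
    "approve": {},
    "reject": {},
}


def run_with_routing(ctx: dict) -> List[str]:
    """Follow the first matching branch, else default_next."""
    visited: List[str] = []
    current: Optional[str] = "check"
    while current is not None:
        visited.append(current)
        step = STEPS[current]
        # First branch whose condition holds wins; otherwise fall back to
        # default_next (None for terminal steps, which ends the loop).
        current = next(
            (target for cond, target in step.get("branches", []) if cond(ctx)),
            step.get("default_next"),
        )
    return visited


def run_without_routing(ctx: dict) -> List[str]:
    """Ignore branches and default_next; run every step in declaration order."""
    return list(STEPS)


print(run_with_routing({"score": 80}))     # ['check', 'approve']
print(run_without_routing({"score": 80}))  # ['check', 'approve', 'reject']
```

The same flow and the same input produce different step sequences, which is the kind of silent semantic change this issue proposes to turn into an explicit error.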

Proposed implementation

  1. Inventory every executor feature and classify it: supported in async, unsupported, or silently degraded. Candidates to verify: conditional branches, default_next, decision_candidates, step cache, checkpointer, replay/resume, sub-flow composition, streaming.
  2. Extend _assert_async_lane_supported() (or its successor after the sync/async consolidation) to raise a typed ChainWeaverError subclass (e.g., a new AsyncLaneUnsupportedError, or an existing error type if one fits). The error should list the specific unsupported constructs found in the flow and be raised before any step runs.
  3. Document the support matrix in AGENTS.md and the docs site (a simple table: feature × sync/async).
  4. File follow-up work to implement true parity for the highest-value gaps (branching first), tracked separately.
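
A minimal sketch of what step 2 could look like, under stated assumptions. Step and DAGFlow below are simplified stand-in dataclasses, not the project's real models; the ChainWeaverError class is a placeholder for the real base error; find_unsupported_async_constructs is a hypothetical helper; and only three constructs (branches, default_next, flow_name) are checked, since the full list depends on the inventory from step 1. The shape is what matters: scan the whole flow, collect every offending construct, and raise once before execution starts.

```python
from dataclasses import dataclass, field
from typing import List, Optional


class ChainWeaverError(Exception):
    """Placeholder for the project's base error (assumed, not the real class)."""


class AsyncLaneUnsupportedError(ChainWeaverError):
    """Hypothetical typed error listing every unsupported construct found."""

    def __init__(self, flow_name: str, problems: List[str]) -> None:
        self.flow_name = flow_name
        self.problems = problems
        details = "; ".join(problems)
        super().__init__(
            f"Flow {flow_name!r} uses features unsupported in the async lane: {details}"
        )


@dataclass
class Step:
    # Simplified stand-in; the real step model has more fields and
    # branches are plain strings here only for illustration.
    name: str
    branches: List[str] = field(default_factory=list)
    default_next: Optional[str] = None
    flow_name: Optional[str] = None  # reference to a composed sub-flow


@dataclass
class DAGFlow:
    name: str
    steps: List[Step]


def find_unsupported_async_constructs(flow: DAGFlow) -> List[str]:
    """Return one message per unsupported construct, in step order."""
    problems: List[str] = []
    for step in flow.steps:
        if step.branches:
            problems.append(f"step {step.name!r} declares branches")
        if step.default_next is not None:
            problems.append(f"step {step.name!r} declares default_next")
        if step.flow_name is not None:
            problems.append(f"step {step.name!r} composes flow {step.flow_name!r}")
    return problems


def assert_async_lane_supported(flow: DAGFlow) -> None:
    """Raise before execution if the flow uses any unsupported construct."""
    problems = find_unsupported_async_constructs(flow)
    if problems:
        raise AsyncLaneUnsupportedError(flow.name, problems)
```

Collecting all problems before raising, rather than failing on the first one, lets a user fix a flow in one pass. Keeping the problems list as an attribute on the error also gives tests something more stable to assert on than the message text.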

Acceptance criteria

  • Executing a branching DAGFlow via execute_flow_async raises a descriptive typed error before the first step runs (until branching parity is implemented).
  • Every unsupported feature in the matrix has a corresponding early-rejection test.
  • The sync/async feature support matrix is published in the docs.

Test plan

  • New tests in tests/test_executor_async.py asserting early rejection for each unsupported construct.
  • Existing async tests pass unchanged.
  • Full validation commands pass.
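
The early-rejection tests could take roughly the following shape. This is a self-contained sketch, not the project's test code: execute_flow_async here is a toy stand-in with an invented signature (a dict-based flow plus a run_step callback), and AsyncLaneUnsupportedError is the hypothetical error name from the proposal. The offending step is deliberately placed second, so an implementation that validates lazily would run "start" first and fail the check.

```python
import asyncio
from typing import Callable, List


class AsyncLaneUnsupportedError(Exception):
    """Hypothetical typed error; the real one would subclass ChainWeaverError."""


async def execute_flow_async(flow: dict, run_step: Callable[[str], None]) -> None:
    """Toy stand-in for the async entry point: validate first, then run steps."""
    unsupported = [
        f"step {step['name']!r} uses {key}"
        for step in flow["steps"]
        for key in ("branches", "default_next", "flow_name")
        if step.get(key)
    ]
    if unsupported:
        raise AsyncLaneUnsupportedError("; ".join(unsupported))
    for step in flow["steps"]:
        run_step(step["name"])


def rejected_before_first_step(step: dict) -> bool:
    """True if the flow raises the typed error and no step was invoked."""
    executed: List[str] = []
    flow = {"name": "demo", "steps": [{"name": "start"}, step]}
    try:
        asyncio.run(execute_flow_async(flow, executed.append))
    except AsyncLaneUnsupportedError:
        return executed == []
    return False


def test_unsupported_constructs_are_rejected_early() -> None:
    cases = [
        {"name": "route", "branches": ["left", "right"]},
        {"name": "route", "default_next": "end"},
        {"name": "sub", "flow_name": "child_flow"},
    ]
    for step in cases:
        assert rejected_before_first_step(step), step


def test_plain_flow_still_runs() -> None:
    executed: List[str] = []
    flow = {"name": "demo", "steps": [{"name": "a"}, {"name": "b"}]}
    asyncio.run(execute_flow_async(flow, executed.append))
    assert executed == ["a", "b"]
```

Recording executed steps through a callback is what lets the test distinguish "raised before the first step" from "raised eventually", which is the distinction the acceptance criteria depend on.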

Migration notes

Behavior change: async executions of branching DAG flows that previously "ran" (without conditional routing) will now raise. This is a correctness fix; release notes should call it out and point users at the sync lane or at the parity roadmap. Pre-1.0 SemVer allows this in a minor release per docs/versioning-policy.md.

Risks and tradeoffs

  • Some users may depend on the current (incorrect) async behavior; failing fast surfaces that immediately, which is the intent.
  • Slight upfront validation cost per async execution — negligible relative to tool invocation.

Suggested labels

reliability, breaking-change
