Enforce Auto Review dedupe, freshness, supersede, and cancellation policy #329

@shiny-code-bot

Summary

Turn durable run state into enforced policy that saves tokens and preserves correctness.

Scope

  • Compute stable review scope/diff/prompt-policy keys.
  • Adopt/reuse active matching runs before launch.
  • Classify current, superseded, obsolete, inactive, lost, failed, skipped, and cancelled runs.
  • Cancel only for explicit user stop, duplicate, obsolete/superseded scope, dead/lost process, hard budget exhaustion, or policy-backed inactivity.
  • Never cancel solely because elapsed runtime is high.
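The cancellation rules above can be sketched as a single decision function. This is a minimal illustration, not the harness's actual code: `CancelReason`, `RunEvidence`, `cancel_reason`, and every field name are hypothetical, and the order in which reasons are checked is an assumption. The point it shows is that elapsed runtime is carried along for diagnostics but never read when deciding whether to cancel.

```rust
use std::time::Duration;

/// Hypothetical cancellation reasons, mirroring the list in Scope.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CancelReason {
    UserStop,
    Duplicate,
    ObsoleteOrSuperseded,
    ProcessLost,
    BudgetExhausted,
    PolicyInactivity,
}

/// Hypothetical evidence collected about one Auto Review run.
#[derive(Debug, Clone)]
pub struct RunEvidence {
    pub user_requested_stop: bool,
    pub duplicate_of_active_run: bool,
    pub scope_superseded: bool,
    pub process_alive: bool,
    pub hard_budget_exhausted: bool,
    /// Time since the run last showed any activity.
    pub idle_for: Duration,
    /// Inactivity limit from policy; `None` means no inactivity policy applies.
    pub inactivity_limit: Option<Duration>,
    /// Tracked for cost and diagnostics only; never read by `cancel_reason`.
    pub elapsed: Duration,
}

/// Returns an explicit reason when the run should be cancelled, else `None`.
/// Elapsed runtime is deliberately not consulted.
pub fn cancel_reason(e: &RunEvidence) -> Option<CancelReason> {
    if e.user_requested_stop {
        return Some(CancelReason::UserStop);
    }
    if !e.process_alive {
        return Some(CancelReason::ProcessLost);
    }
    if e.duplicate_of_active_run {
        return Some(CancelReason::Duplicate);
    }
    if e.scope_superseded {
        return Some(CancelReason::ObsoleteOrSuperseded);
    }
    if e.hard_budget_exhausted {
        return Some(CancelReason::BudgetExhausted);
    }
    match e.inactivity_limit {
        Some(limit) if e.idle_for >= limit => Some(CancelReason::PolicyInactivity),
        _ => None,
    }
}
```

Under these assumptions, a review that has run for hours but is still active and within its inactivity limit yields `None`, while the same run with a dead process yields `Some(CancelReason::ProcessLost)`.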

Acceptance Criteria

  • Duplicate review launches for the same useful scope are skipped, reused, or adopted before creating another worktree/agent.
  • Diff fingerprints are populated for Auto Review runs and used with snapshot/head/epoch evidence to recognize equivalent review scope even when HEAD changed.
  • Superseded/obsolete classification uses snapshot/head/epoch evidence, not elapsed runtime alone.
  • Runtime is tracked for cost and diagnostics but never by itself marks a healthy active review stale.
  • Cancellation records an explicit reason and releases relevant locks/worktree state.
  • Follow-up review loops short-circuit on equivalent diffs, including cases where a failed or reverted fix changes commit ids without changing the effective reviewed diff.
  • Tests cover duplicate avoidance, superseded classification, inactivity/lost handling, non-cancellation of healthy long-running reviews, and non-mutating idle checks.
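The follow-up short-circuit criterion can be illustrated with a small ledger keyed by diff fingerprint rather than commit id. This is a sketch under stated assumptions: `ReviewedDiffs`, `FollowUp`, and `check_and_record` are hypothetical names, and the fingerprint is treated as an opaque string computed elsewhere.

```rust
use std::collections::HashMap;

/// Hypothetical outcome of a follow-up review check.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum FollowUp {
    /// No equivalent diff has been reviewed yet; run the review.
    Review,
    /// An equivalent diff was already reviewed at the given commit.
    SkipEquivalent { first_reviewed_at: String },
}

/// Hypothetical ledger mapping a diff fingerprint to the commit id
/// at which that diff was first reviewed.
#[derive(Debug, Default)]
pub struct ReviewedDiffs {
    seen: HashMap<String, String>,
}

impl ReviewedDiffs {
    /// Skips when the fingerprint is already known, even if the commit id
    /// differs; otherwise records the fingerprint and asks for a review.
    pub fn check_and_record(&mut self, fingerprint: &str, commit_id: &str) -> FollowUp {
        if let Some(first) = self.seen.get(fingerprint) {
            return FollowUp::SkipEquivalent {
                first_reviewed_at: first.clone(),
            };
        }
        self.seen
            .insert(fingerprint.to_string(), commit_id.to_string());
        FollowUp::Review
    }
}
```

With this shape, a fix that is applied and then reverted produces a new commit id but the same fingerprint, so the second check returns `SkipEquivalent` instead of launching another review.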

Relationships

Parent: #324
Depends on: #325, #326
Related: #50

Finish Line

Every Code adopts or reuses matching live Auto Reviews before launching new ones, classifies superseded/obsolete runs from snapshot and activity evidence, and cancels only through explicit harness-owned policy.

Current Status

State: In PR review/checks.

PR #337 implements the first dedupe policy slice: Auto Review now persists a stable base+diff SHA-256 fingerprint plus changed-path metadata, skips launching duplicate reviews once an equivalent review is actually running, reuses completed equivalent reviews, and preserves duplicate-skipped ledger rows when the synthetic completion event is recorded.
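A rough sketch of what a base+diff fingerprint with changed-path metadata might look like follows. It is not the code in PR #337: `ReviewScopeKey`, `scope_key`, and `fnv1a64` are hypothetical names, and because the Rust standard library has no SHA-256, a 64-bit FNV-1a hash is swapped in purely as a non-cryptographic stand-in to keep the example self-contained. The relevant property is that HEAD is not an input, so two commits with the same base and the same effective diff produce the same key.

```rust
/// FNV-1a 64-bit hash over several byte chunks.
/// A non-cryptographic stand-in for SHA-256, used only for illustration.
fn fnv1a64(chunks: &[&[u8]]) -> u64 {
    const OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
    const PRIME: u64 = 0x0000_0100_0000_01b3;
    let mut hash = OFFSET_BASIS;
    for chunk in chunks {
        for &byte in chunk.iter() {
            hash ^= u64::from(byte);
            hash = hash.wrapping_mul(PRIME);
        }
    }
    hash
}

/// Hypothetical review scope key: a fingerprint plus changed-path metadata.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ReviewScopeKey {
    /// Hex digest over the base identifier and the diff text.
    pub fingerprint: String,
    /// Sorted, de-duplicated changed paths; metadata only, not hashed.
    pub changed_paths: Vec<String>,
}

/// Builds a key from the base identifier and diff text.
/// The base length is hashed first so the base/diff boundary is unambiguous.
pub fn scope_key(base: &str, diff: &str, changed_paths: &[&str]) -> ReviewScopeKey {
    let base_len = (base.len() as u64).to_le_bytes();
    let hash = fnv1a64(&[&base_len[..], base.as_bytes(), diff.as_bytes()]);

    let mut paths: Vec<String> = changed_paths.iter().map(|p| p.to_string()).collect();
    paths.sort();
    paths.dedup();

    ReviewScopeKey {
        fingerprint: format!("{:016x}", hash),
        changed_paths: paths,
    }
}
```

Comparing two `ReviewScopeKey` values with `==` then answers whether a new launch covers the same scope as an existing run, regardless of the order in which changed paths were collected.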

A read-only reviewer caught two P1 adoption hazards before merge: duplicate in-flight skips were being surfaced as clean completions, and Pending rows could be mutually adopted before either session launched a model run. The PR now fixes both: only Reviewing/Resolving rows are adoptable, and duplicate-skip completions no longer carry a snapshot marker that would mark the target as reviewed.
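The two reviewer fixes can be expressed as a pair of small rules. The sketch below is illustrative only: the state names Pending, Reviewing, and Resolving come from the description above, while `Completed`, `RunState`, `is_adoptable`, `CompletionEvent`, and `completion_for` are hypothetical names chosen for this example.

```rust
/// Run states; `Completed` is an assumed extra state for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RunState {
    Pending,
    Reviewing,
    Resolving,
    Completed,
}

/// Only rows that have actually started a model run may be adopted.
/// Pending rows are excluded so two sessions cannot adopt each other
/// before either one has launched.
pub fn is_adoptable(state: RunState) -> bool {
    matches!(state, RunState::Reviewing | RunState::Resolving)
}

/// Hypothetical synthetic completion event recorded in the ledger.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct CompletionEvent {
    pub duplicate_skipped: bool,
    /// Present only when this run really reviewed the snapshot.
    pub snapshot_marker: Option<String>,
}

/// Builds a completion event. A duplicate skip carries no snapshot marker,
/// so it cannot mark the target as reviewed.
pub fn completion_for(snapshot_id: &str, duplicate_skipped: bool) -> CompletionEvent {
    CompletionEvent {
        duplicate_skipped,
        snapshot_marker: if duplicate_skipped {
            None
        } else {
            Some(snapshot_id.to_string())
        },
    }
}
```

Keeping the marker optional makes the "skipped, not reviewed" case visible in the type, rather than relying on callers to remember not to treat a skip as a clean completion.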

Validation completed locally after the reviewer fixes:

  • cargo test -p code-core duplicate_lookup --lib
  • cargo test -p code-tui duplicate_skipped_background_review_does_not_mark_snapshot_reviewed --features test-helpers
  • cargo test -p code-tui auto_review_diff_fingerprint --features test-helpers
  • ./build-fast.sh

Next action: wait for GitHub checks and review on PR #337, merge when clean, then rebuild the code binary on PATH and dogfood it to confirm that equivalent Auto Reviews now skip without falsely reporting a clean completion.

Blocked by: PR #337 checks/review.

Last verified: 2026-06-02 after pushing commit dae7ad3bac to PR #337.

Labels: plan (Durable planning issue), plan:active (Current active plan)
