fix(routing): stop vision_team reviewer↔analyst infinite loop#20

Merged
DONGRYEOLLEE1 merged 1 commit into main from fix/vision-team-reviewer-loop on May 25, 2026

@DONGRYEOLLEE1
Owner

Summary

A Playwright UI smoke test on `/chat` showed that vision_team routing worked correctly (head → vision_team → vision_analyst) but the response never terminated: the reviewer kept critiquing "text not transcribed" / "labels unreadable", and the team supervisor kept re-dispatching vision_analyst with no safeguard ever firing. Two pieces were missing; both are fixed here.

Root causes

  1. vision_team had no dispatch ceiling — `apps/backend/workflow/teams/vision.py` built its graph with `with_validator=True` but did not pass `max_team_dispatches`. Every other team already wires this from `settings`. `enforce_dispatch_limit` was therefore never armed for the vision layer.
  2. REVIEWER_PROMPT had no vision-team policy: only Data Science had stopping rules. The reviewer had nothing telling it that regions OCR cannot resolve, requests that never asked for OCR, repeated critiques, or illegible chart ticks are not grounds for another retry.
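
To make the first root cause concrete, here is a minimal sketch of a reviewer-driven retry loop with and without a dispatch ceiling. It is not the repository's `enforce_dispatch_limit`; `TeamState`, `run_team`, and the callback signatures are hypothetical names chosen for illustration, and only the `max_team_dispatches` parameter name comes from this PR.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class TeamState:
    """Hypothetical team state; the real state schema is not shown in this PR."""
    dispatches: int = 0
    transcript: List[str] = field(default_factory=list)


def run_team(
    worker: Callable[[int], str],
    reviewer: Callable[[str], bool],
    max_team_dispatches: Optional[int] = None,
) -> TeamState:
    """Dispatch the worker until the reviewer accepts or the ceiling is hit.

    With max_team_dispatches=None nothing bounds the loop, which mirrors the
    unarmed safeguard described above.
    """
    state = TeamState()
    while True:
        # The ceiling is checked before every dispatch, so it also covers
        # retries requested by the reviewer.
        if max_team_dispatches is not None and state.dispatches >= max_team_dispatches:
            state.transcript.append("forced finish: dispatch ceiling reached")
            return state
        state.dispatches += 1
        answer = worker(state.dispatches)
        state.transcript.append(answer)
        if reviewer(answer):
            return state
```

With a reviewer that never accepts, `run_team(worker, reviewer, max_team_dispatches=3)` returns after three dispatches, while omitting the ceiling would loop forever.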

Fix

  • `core/config.py`: `VISION_TEAM_MAX_DISPATCHES = 3` (single-worker team — one good pass + at most one reviewer-driven retry).
  • `workflow/teams/vision.py`: `build(with_validator=True, max_team_dispatches=settings.VISION_TEAM_MAX_DISPATCHES)`.
  • `prompt-kit/prompts.py` `REVIEWER_PROMPT` v1.2 → v1.3 with a new `# VISION TEAM — STOPPING RULES` block:
    • "확인 불가" ("cannot be confirmed") / "판독 불가" ("illegible") markers from vision_analyst are legitimate complete answers
    • "describe / 해석해줘" ("interpret this") requests do NOT imply OCR-grade transcription (only "그대로" ("as is"), "verbatim", etc. do)
    • Repeating the same critique twice is a loop, not progress
    • Chart TYPE identification + qualitative insight on visible magnitudes satisfies a "chart interpretation" request
    • Hard ceiling: after TWO vision_analyst attempts mark VALID regardless
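
The stopping rules above live in prompt text and are applied by the reviewer model, not by code. Purely as an illustration of the intended policy, three of them can be written as a deterministic check. Every name below (`reviewer_should_accept`, `requires_transcription`, the constants) is hypothetical and does not exist in the repository; only the quoted marker strings and the two-attempt ceiling come from this PR.

```python
from typing import List

# Markers quoted in this PR; vision_analyst emits them for regions it cannot read.
UNREADABLE_MARKERS = ("확인 불가", "판독 불가")
# Phrases this PR lists as signalling a verbatim-transcription request.
VERBATIM_HINTS = ("그대로", "verbatim")
# "after TWO vision_analyst attempts mark VALID regardless"
MAX_ANALYST_ATTEMPTS = 2


def requires_transcription(user_request: str) -> bool:
    """Only explicit verbatim wording implies OCR-grade transcription."""
    lowered = user_request.lower()
    return any(hint in lowered for hint in VERBATIM_HINTS)


def reviewer_should_accept(answer: str, critiques: List[str], attempts: int) -> bool:
    """Return True when one of the sketched stopping rules forces VALID."""
    # Hard ceiling on vision_analyst attempts.
    if attempts >= MAX_ANALYST_ATTEMPTS:
        return True
    # Issuing the same critique twice in a row is a loop, not progress.
    if len(critiques) >= 2 and critiques[-1] == critiques[-2]:
        return True
    # An explicit "unreadable" marker counts as a complete answer.
    return any(marker in answer for marker in UNREADABLE_MARKERS)
```

The chart-interpretation rule is left out because judging "qualitative insight" needs the model itself; the sketch only covers the rules that reduce to string and counter checks.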

Regression locks

  • `test_team_subgraphs.py` — extended the parametrized `test_team_graphs_use_configured_dispatch_limits` to cover vision_team; pins `settings.VISION_TEAM_MAX_DISPATCHES` → builder wiring.
  • `test_routing_prompts.py` — new `test_reviewer_prompt_contains_vision_stopping_rules` locks the five escape hatches.
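
A wiring lock of the kind described in the first bullet could look roughly like the sketch below. It assumes stand-ins for the real objects: `Settings`, `TeamGraph`, `build`, and `build_vision_team` are hypothetical and only imitate the shape implied by `build(with_validator=True, max_team_dispatches=settings.VISION_TEAM_MAX_DISPATCHES)`. The test is written as a plain function so it runs with or without pytest.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical stand-in for the project's settings object."""
    VISION_TEAM_MAX_DISPATCHES: int = 3


@dataclass(frozen=True)
class TeamGraph:
    """Hypothetical stand-in for a built team graph; it only records its config."""
    with_validator: bool
    max_team_dispatches: Optional[int]


def build(with_validator: bool = False, max_team_dispatches: Optional[int] = None) -> TeamGraph:
    """Imitates the builder call shown in this PR by returning what it was given."""
    return TeamGraph(with_validator=with_validator, max_team_dispatches=max_team_dispatches)


def build_vision_team(settings: Settings) -> TeamGraph:
    """Wires the configured ceiling into the builder, as the fix does."""
    return build(
        with_validator=True,
        max_team_dispatches=settings.VISION_TEAM_MAX_DISPATCHES,
    )


def test_vision_team_uses_configured_dispatch_limit() -> None:
    # A non-default value shows the builder reads settings, not a constant.
    graph = build_vision_team(Settings(VISION_TEAM_MAX_DISPATCHES=5))
    assert graph.with_validator is True
    assert graph.max_team_dispatches == 5
```

Asserting against a non-default value is a deliberate choice: a test that only checks for `3` would still pass if the builder hard-coded the number instead of reading it from settings.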

Test plan

  • `pytest tests -q` → 193 passed (191 baseline + 2 new regression locks)
  • Playwright re-run of the 3 vision scenarios (V-A / V-B / V-C) against merged main to confirm all three now terminate with sensible answers.

🤖 Generated with Claude Code

Playwright UI smoke (V-B/V-C scenarios on /chat) showed vision_team
routing was correct but the response never terminated: reviewer kept
critiquing "text not transcribed" or "chart labels unreadable", the team
supervisor kept re-dispatching vision_analyst, and no safeguard ever
fired. Two pieces were missing; both are fixed in this PR:

1. `apps/backend/workflow/teams/vision.py` was building its graph with
   `with_validator=True` but did not pass `max_team_dispatches`, so
   `enforce_dispatch_limit` was never armed for the vision layer.
   Every other team already wires this from settings — vision was the
   odd one out. Added `settings.VISION_TEAM_MAX_DISPATCHES = 3`
   (single-worker team, two attempts is plenty).
2. `REVIEWER_PROMPT` (v1.2 → v1.3) had stopping rules only for the Data
   Science team. The reviewer had no policy telling it that:
   - "확인 불가" ("cannot be confirmed") / "판독 불가" ("illegible")
     markers from vision_analyst are legitimate complete answers (the
     model literally cannot resolve sub-pixel text);
   - "describe / 해석해줘" ("interpret this") requests do NOT imply
     OCR-grade transcription (only "그대로" ("as is"), "verbatim", etc.
     do);
   - repeating the same critique twice is a loop, not progress;
   - identifying chart TYPE + giving qualitative insight on visible
     magnitudes is enough for a chart-interpretation request;
   - after two vision_analyst attempts mark VALID regardless.

- tests/test_team_subgraphs.py: extended the parametrized
  `test_team_graphs_use_configured_dispatch_limits` to cover vision_team
  — pins the wiring from `settings.VISION_TEAM_MAX_DISPATCHES` into the
  builder.
- tests/test_routing_prompts.py: new
  `test_reviewer_prompt_contains_vision_stopping_rules` locks the five
  escape hatches so prompt drift can't silently bring the loop back.

Validation:
- pytest tests -q → 193 passed (191 baseline + 2 new regression locks).

Plans / context:
- The V-B / V-C reproduction is in this session's Playwright run; a
  follow-up will rerun the three scenarios against the merged main.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@vercel

vercel Bot commented May 25, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| orchagent | Ready | Preview, Comment | May 25, 2026 9:42am |
| project-vdajw | Ready | Preview, Comment | May 25, 2026 9:42am |

@DONGRYEOLLEE1 DONGRYEOLLEE1 merged commit 50d6ed7 into main May 25, 2026
5 checks passed
@DONGRYEOLLEE1 DONGRYEOLLEE1 deleted the fix/vision-team-reviewer-loop branch May 25, 2026 09:44