refactor(routing): LLM-driven routing cleanup — prompt-driven first routes by DONGRYEOLLEE1 · Pull Request #18 · DONGRYEOLLEE1/orchagent

DONGRYEOLLEE1 · 2026-05-22T08:27:04Z

Summary

Two-step cleanup sweep that finishes the LLM-driven routing policy
(CLAUDE.md §"Supervisor → Sub-agent Handoff Policy" P1–P5). All routing
decisions now flow through RouterDecision + 4 P3 safeguards only; every
"first worker for domain X" intent lives in prompt-kit and is locked by
a regression test.

Commit 1 — `00568bf` — remove rule-based heuristics

Delete _build_simple_research_plan keyword-dictionary heuristic in planner.
Delete dead _orchagent_identity_response (handled by SYSTEM_SUPERVISOR_PROMPT # IDENTITY).
Replace the heuristic-locking planner test with an LLM-driven regression test.
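
The replacement regression test can be pictured roughly as below. This is a minimal sketch under stated assumptions, not the repository's code: `RecordingPlannerLLM` and `TaskPlan` are names taken from the commit message, while `build_plan`, the `invoke` method, and the `steps` field are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TaskPlan:
    """Stand-in for the planner's structured output (the field is hypothetical)."""
    steps: list[str] = field(default_factory=list)


class RecordingPlannerLLM:
    """Fake LLM that records whether the planner consulted it."""

    def __init__(self) -> None:
        self.called = False

    def invoke(self, user_text: str) -> TaskPlan:
        self.called = True
        return TaskPlan(steps=[f"research: {user_text}"])


def build_plan(llm: RecordingPlannerLLM, user_text: str) -> TaskPlan:
    """Hypothetical planner entry point: no keyword shortcut, always ask the LLM."""
    return llm.invoke(user_text)


def test_planner_always_calls_llm() -> None:
    llm = RecordingPlannerLLM()
    plan = build_plan(llm, "latest news on LLM routing")
    # A keyword heuristic would have produced a plan without touching the LLM.
    assert llm.called is True
    assert plan.steps == ["research: latest news on LLM routing"]
```

The point of asserting on `called` rather than on the plan contents alone is that a reintroduced heuristic could return a plausible plan while silently bypassing the model.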

Commit 2 — `26138c2` — prompt-driven first routes + remove pre-LLM shortcuts

Drop the pre-LLM dispatch-limit shortcut in team_supervisor.py (and its dead helper). Dispatch ceiling is now a post-decision P3 safeguard only.
Drop the inline coding_team + repo_binding override in head_supervisor.py. The intent is expressed in prompt-kit instead so the router LLM owns the decision.
Add # REQUIRED FIRST ROUTES block to SYSTEM_SUPERVISOR_PROMPT (v2.7) — pins first worker for data / vision / research / coding / writing / FINISH.
Add # WRITING TEAM HANDOFF + # VISION TEAM HANDOFF blocks to TEAM_SUPERVISOR_PROMPT (v1.5); pin vision_analyst as the actual Vision Team first worker.
New test_routing_prompts.py locks the prompt contract so future drift fails CI.
New test_supervisor.py::test_team_dispatch_limit_runs_after_llm_decision locks the post-LLM safeguard ordering.
test_router_safeguards.py::test_public_safeguard_surface_is_limited_to_policy_functions locks the safeguard surface to exactly the 4 P3 functions.
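
The "LLM first, safeguard second" ordering that the dispatch-limit test locks can be sketched as follows. Only `RouterDecision`, `decide_route()`, `max_team_dispatches`, and the `safeguard:` reason prefix come from this PR; the class fields, the `CountingRouterLLM` fake, `enforce_dispatch_limit`, and all signatures are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class RouterDecision:
    """Stand-in for the router's structured output (fields are hypothetical)."""
    goto: str
    reason: str = ""


class CountingRouterLLM:
    """Fake router LLM that counts how often it is consulted."""

    def __init__(self, goto: str) -> None:
        self.goto = goto
        self.calls = 0

    def decide(self) -> RouterDecision:
        self.calls += 1
        return RouterDecision(goto=self.goto, reason="llm")


def enforce_dispatch_limit(
    decision: RouterDecision, dispatches: int, max_team_dispatches: int
) -> RouterDecision:
    """Hypothetical post-decision safeguard: force FINISH once the ceiling is hit."""
    if decision.goto != "FINISH" and dispatches >= max_team_dispatches:
        return RouterDecision(goto="FINISH", reason="safeguard: dispatch limit reached")
    return decision


def decide_route(
    llm: CountingRouterLLM, dispatches: int, max_team_dispatches: int
) -> RouterDecision:
    """The LLM decides first; the safeguard may only rewrite that decision afterwards."""
    decision = llm.decide()
    return enforce_dispatch_limit(decision, dispatches, max_team_dispatches)


def test_dispatch_limit_runs_after_llm_decision() -> None:
    llm = CountingRouterLLM(goto="vision_analyst")
    decision = decide_route(llm, dispatches=0, max_team_dispatches=0)
    assert llm.calls == 1  # no pre-LLM shortcut: the model is still consulted
    assert decision.goto == "FINISH"
    assert decision.reason.startswith("safeguard:")
```

With a ceiling of 0 the model is still called once, which is the extra LLM call per saturated turn that the second commit message accepts in exchange for a single, consistent safeguard path.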

Test plan

cd apps/backend && PYTHONPATH=. uv run pytest tests -q → 190 passed
grep -rE "_should_force_|_APPROVAL_PATTERNS|_build_simple_research_plan|_orchagent_identity_response|reject_coding_team_without_repo_binding|_force_finish_due_to_dispatch_limit" packages/agent-core/src packages/prompt-kit/src apps/backend/workflow → 0 hits in code body
Playwright UI scenarios (CSV / image / latest news / greeting) — deferred to follow-up; Codex sandbox blocked browser_navigate and local dev-server access.
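
If the grep audit were to run as a test rather than by hand, one possible shape is the sketch below. The identifier list is copied from the grep pattern in this test plan; the function name `find_retired_identifiers` is hypothetical. Unlike the "code body" qualifier above, this sketch flags every matching line, including comments and docstrings.

```python
import re
from pathlib import Path

# Identifiers retired by this PR, copied from the grep pattern in the test plan.
RETIRED = re.compile(
    r"_should_force_|_APPROVAL_PATTERNS|_build_simple_research_plan|"
    r"_orchagent_identity_response|reject_coding_team_without_repo_binding|"
    r"_force_finish_due_to_dispatch_limit"
)


def find_retired_identifiers(roots: list[Path]) -> list[tuple[Path, int, str]]:
    """Return (file, line number, stripped line) for every line matching RETIRED."""
    hits: list[tuple[Path, int, str]] = []
    for root in roots:
        for path in sorted(root.rglob("*.py")):
            text = path.read_text(encoding="utf-8", errors="ignore")
            for lineno, line in enumerate(text.splitlines(), start=1):
                if RETIRED.search(line):
                    hits.append((path, lineno, line.strip()))
    return hits
```

A CI check could call it with the three directories from the grep command and fail when the returned list is non-empty.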

Known follow-ups (separate commits, not in this PR)

CLAUDE.md §"도메인별 첫 분기 의무" ("Required first route per domain") still lists vision_team workers as image_inspector/image_editor; the current Vision Team only exposes vision_analyst. The doc update is a separate commit.

Plans: plans/ENFORCED_ROUTING_TO_LLM_DRIVEN_PLAN.md, plans/llm-routing-fix.md.

🤖 Generated with Claude Code

…g safeguard

Sweep packages/agent-core for any remaining rule-based routing patterns and move the LLM-driven policy (CLAUDE.md §"Supervisor → Sub-agent Handoff Policy" P1-P5) to its full conclusion.

- planner.py: delete `_build_simple_research_plan` keyword-dictionary heuristic (and the dead `_extract_latest_user_text` helper that fed it). All plan generation now goes through the LLM `TaskPlan` structured output; `PLANNER_PROMPT` already covers the lightweight research case.
- head_supervisor.py + supervisor.py: delete the dead `_orchagent_identity_response` keyword fallback and its companions (`_extract_message_text`, `_latest_user_request_text`); identity Qs are handled by `SYSTEM_SUPERVISOR_PROMPT` `# IDENTITY` block, not by code.
- safeguards.py: add `reject_coding_team_without_repo_binding`, which extracts the previously-inline coding_team/repo_binding block from head_supervisor into the canonical P3 chain. Now surfaces a `safeguard:` reason on the SSE `route` event (P4 visibility).
- head_supervisor.py: invoke the new safeguard BEFORE HITL so users are not asked to approve a dispatch the runtime cannot execute.
- tests: replace the heuristic-locking planner test with an LLM-driven regression test (`RecordingPlannerLLM.called` must be True). Add three safeguard unit tests (pass-through, force-FINISH, non-coding-team no-op) under the new P3 contract.

Plan: plans/ENFORCED_ROUTING_TO_LLM_DRIVEN_PLAN.md (all phases checked).

Validation:
- pytest tests -q → 188 passed (185 baseline + 3 new safeguard cases).
- grep -rE "_should_force_|_APPROVAL_PATTERNS|_build_simple_research_plan|_orchagent_identity_response" packages/agent-core/src → 0 hits in code body.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Follow-up sweep on top of 00568bf to push the LLM-driven routing policy (CLAUDE.md §"Supervisor → Sub-agent Handoff Policy" P1-P5) further. Move the remaining "code decides before the LLM does" branches over to either a prompt-kit guidance line or a post-decision safeguard.

- safeguards.py: drop `reject_coding_team_without_repo_binding`. The intent is now expressed as a `# REQUIRED FIRST ROUTES` guideline to the router LLM so the model itself avoids `coding_team` without a bound repo. Public surface is back to the canonical 4 safeguards.
- supervisors/head_supervisor.py: drop the inline repo-binding override introduced in 00568bf. The router LLM owns the decision; the safeguard chain still catches invalid gotos and team-redirect loops.
- supervisors/team_supervisor.py: drop the pre-LLM dispatch-limit shortcut (and the dead `_force_finish_due_to_dispatch_limit` helper). Dispatch limit is now applied only as a post-decision P3 safeguard via `decide_route()`, costing one extra LLM call per saturated turn in exchange for full P3 consistency.
- prompt-kit/prompts.py:
  * SYSTEM_SUPERVISOR_PROMPT v2.7: new `# REQUIRED FIRST ROUTES` block pins the first worker for all six domains (data / vision / research / coding / writing / FINISH) so the LLM has the contract in prompt.
  * TEAM_SUPERVISOR_PROMPT v1.5: new `# WRITING TEAM HANDOFF` and `# VISION TEAM HANDOFF` sections; pins `vision_analyst` as the real Vision Team first worker (matches current member list).
  * Minor wording (`keyword` → `term`) in unrelated title/suggestion prompts to keep the policy-grep audit noise-free.
- router_schema.py: clean stale `_should_force_approval` references from the docstrings.
- tests/test_router_safeguards.py: pin the public safeguard surface to exactly the 4 P3 policy functions (regression guard so a 5th can't sneak back in).
- tests/test_supervisor.py: add coverage that `max_team_dispatches=0` still runs the LLM once and then routes via the safeguard, not via a pre-LLM branch.
- tests/test_routing_prompts.py (new): pin the `# REQUIRED FIRST ROUTES` block and per-team handoff guidance so prompt drift fails CI.

Plan: plans/llm-routing-fix.md (Phase 1-2 done; Phase 3 Playwright checks deferred to a follow-up because the sandbox blocked `browser_navigate` and local dev-server access).

Validation:
- pytest tests -q → 190 passed.
- grep -E "_should_force|_APPROVAL_PATTERNS|reject_coding_team_without_repo_binding|_force_finish_due_to_dispatch_limit" packages/agent-core/src packages/prompt-kit/src apps/backend/workflow → 0 hits.

Known follow-ups (not in this commit):
- CLAUDE.md §"도메인별 첫 분기 의무" ("Required first route per domain") still lists vision_team workers as `image_inspector`/`image_editor`; current Vision Team only exposes `vision_analyst`. Doc update is a separate commit.
- Playwright UI scenarios (CSV / image / latest news / greeting) need to be run against a live dev stack to fully retire Phase 3.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
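
A prompt-contract test of the kind described for tests/test_routing_prompts.py might look like the sketch below. The `# REQUIRED FIRST ROUTES` heading and the six domain names come from the commit message; `EXAMPLE_PROMPT` is an invented excerpt (the real `SYSTEM_SUPERVISOR_PROMPT` wording is not shown in this PR), the team names other than `coding_team` and `vision_team` are assumptions, and `extract_block` is a hypothetical helper.

```python
# Invented excerpt for illustration only; not the real SYSTEM_SUPERVISOR_PROMPT.
EXAMPLE_PROMPT = """\
# IDENTITY
You are the head supervisor.

# REQUIRED FIRST ROUTES
- data request -> data_team
- vision request -> vision_team
- research request -> research_team
- coding request with a bound repo -> coding_team
- writing request -> writing_team
- greeting or small talk -> FINISH

# OUTPUT
Return a RouterDecision.
"""

REQUIRED_DOMAINS = ("data", "vision", "research", "coding", "writing", "FINISH")


def extract_block(prompt: str, heading: str) -> str:
    """Return the text under `heading` up to the next '# ' heading or end of prompt."""
    lines = prompt.splitlines()
    try:
        start = lines.index(heading) + 1
    except ValueError:
        return ""  # heading missing: the caller treats this as a contract failure
    block: list[str] = []
    for line in lines[start:]:
        if line.startswith("# "):
            break
        block.append(line)
    return "\n".join(block).strip()


def test_required_first_routes_cover_all_domains() -> None:
    block = extract_block(EXAMPLE_PROMPT, "# REQUIRED FIRST ROUTES")
    assert block, "REQUIRED FIRST ROUTES block is missing"
    missing = [domain for domain in REQUIRED_DOMAINS if domain not in block]
    assert missing == []
```

Scoping the assertions to the extracted block, rather than to the whole prompt, keeps the test from passing just because a domain name happens to appear in an unrelated section.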

vercel · 2026-05-22T08:27:09Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
orchagent	Ready	Preview, Comment	May 22, 2026 8:27am
project-vdajw	Ready	Preview, Comment	May 22, 2026 8:27am

DONGRYEOLLEE1 and others added 2 commits May 22, 2026 15:39

DONGRYEOLLEE1 merged commit de7c11d into main May 22, 2026
5 checks passed

DONGRYEOLLEE1 deleted the refactor/llm-driven-routing-cleanup branch May 22, 2026 08:30

DONGRYEOLLEE1 mentioned this pull request May 25, 2026

refactor(vision): finish vision_analyst tools — EXIF correction + metadata + doc fix #19

Merged

3 tasks

DONGRYEOLLEE1 merged 2 commits into main from refactor/llm-driven-routing-cleanup
