DONGRYEOLLEE1 · May 22, 2026
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -119,6 +119,21 @@ OrchAgent 런타임의 head/team supervisor가 사용자 질의를 파악해 sub
 
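 ### P5. Measure regressions with the evaluation harness
 - Routing-accuracy regressions are measured with the golden dataset and scorer in `apps/backend/tests/routing_eval/`.
+
+### Per-domain first-branch obligation (query → first worker mapping, prompt-driven)
+
+The LLM makes the decision, but for each input signal in a user query, **which worker must go first** has to be stated in one line in the prompt. If the first branch wavers, the safety net fires late or the dispatch budget is wasted. When adding a new domain, update this table as well.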
+
+| User query signal | First sub-agent | First worker | Prompt location |
+|:---|:---|:---|:---|
+| Data attachment (csv/xlsx/json/pdf/docx) + analysis/chart request | `data_science_team` | **`data_engineer`** (ONE-pass inspect) → `data_analyst` (python_repl + charts) | `SYSTEM_SUPERVISOR_PROMPT` `# TEAM SELECTION HINTS` + `TEAM_SUPERVISOR_PROMPT` `# DATA SCIENCE TEAM HANDOFF` |
+| Image attachment | `vision_team` | `image_inspector` → `image_editor` | `SYSTEM_SUPERVISOR_PROMPT` `# TEAM SELECTION HINTS` |
+| Request for current information, news, or "latest" topics | `research_team` | `search` → `web_scraper` if needed | `RESEARCH_TEAM_SUPERVISOR_PROMPT` |
+| Bound repo + code edit/execution | `coding_team` | `codebase_explorer` → `implementation_engineer` → (optional) `runtime_verifier` | `SYSTEM_SUPERVISOR_PROMPT` `# CRITICAL GUIDELINES 2a/2b` |
+| Explicit report/slide/document writing | `writing_team` | `note_taker` → `doc_writer` | `SYSTEM_SUPERVISOR_PROMPT` `# CRITICAL GUIDELINES 6a` |
+| Simple greetings, general knowledge, identity questions | (direct response) | head supervisor answers itself | `SYSTEM_SUPERVISOR_PROMPT` `# IDENTITY` + `# CRITICAL GUIDELINES 4` |
+
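+The first-worker mapping in this table is **prompt-driven**: no regex, keyword match, or `_should_force_*` function is added to the code. Regressions are caught by the per-category cases in `routing_eval/golden_dataset.json` (currently 18 cases, 7 of them in the data_science category).
 - When adding a new intent category, add cases to `golden_dataset.json` as well, and aim to keep top-1 accuracy ≥ 95%.
 - If you feel the urge to add a heuristic, first justify it quantitatively with the P5 evaluation results.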
 

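The P5 rule above (per-category cases, top-1 accuracy ≥ 95%) could be checked with a scorer along these lines. This is a minimal sketch under stated assumptions, not the scorer that ships in `routing_eval/`: the `RoutingCase` record, both function names, the `vision` category label, and the `vision-001` id are hypothetical, and only `category`, `expected_next`, and the `data-00N` id pattern come from the diff.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class RoutingCase:
    """Hypothetical golden-dataset record; the real schema may differ."""
    id: str
    category: str
    expected_next: str


def top1_accuracy_by_category(cases, predictions):
    """Return {category: fraction of cases whose predicted team equals expected_next}.

    `predictions` maps case id -> team name chosen by the router.
    A case with no prediction counts as a miss.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for case in cases:
        totals[case.category] += 1
        if predictions.get(case.id) == case.expected_next:
            hits[case.category] += 1
    return {cat: hits[cat] / totals[cat] for cat in totals}


def categories_below_target(accuracy, target=0.95):
    """List categories whose top-1 accuracy is under the target, sorted by name."""
    return sorted(cat for cat, acc in accuracy.items() if acc < target)


# Illustrative data only; not taken from golden_dataset.json.
cases = [
    RoutingCase("data-001", "data_science", "data_science_team"),
    RoutingCase("data-002", "data_science", "data_science_team"),
    RoutingCase("vision-001", "vision", "vision_team"),
]
predictions = {
    "data-001": "data_science_team",
    "data-002": "coding_team",  # simulated misroute
    "vision-001": "vision_team",
}
accuracy = top1_accuracy_by_category(cases, predictions)
print(accuracy)
print(categories_below_target(accuracy))
```

Reporting per category rather than one global number keeps a small category such as data_science from being masked by larger ones, which matches the "first prove it quantitatively" stance in P5.
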
diff --git a/apps/backend/tests/routing_eval/test_scorer.py b/apps/backend/tests/routing_eval/test_scorer.py
@@ -125,3 +125,28 @@ def test_dataset_cases_all_have_required_fields():
             "vision_team",
             "writing_team",
         }
+
+
+def test_data_science_cases_all_route_to_data_science_team():
+    """First-branch guarantee: data-attachment cases must go to data_science_team.
+
+    Per the plan's data_engineer first-branch guarantee, this enforces the
+    per-domain first-branch mapping table in CLAUDE.md in code. If data_science
+    cases start fanning out to other teams, the LLM prompt (SYSTEM_SUPERVISOR_PROMPT
+    `# TEAM SELECTION HINTS`) has weakened, so this catches it immediately.
+    """
+    cases = load_dataset()
+    data_science_cases = [c for c in cases if c.category == "data_science"]
+    assert len(data_science_cases) >= 5, (
+        "Too few data_science cases: "
+        "the regression guard for the data-attachment first branch is weakened."
+    )
+    for case in data_science_cases:
+        assert case.expected_next == "data_science_team", (
+            f"{case.id}: data_science case has expected_next={case.expected_next}, "
+            "which violates the per-domain first-branch table in CLAUDE.md"
+        )
+        assert case.expected_request_review is False, (
+            f"{case.id}: data_science cases run in the python_repl sandbox, so "
+            "request_review must be False (no human approval needed)."
+        )
+        )
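
The test above pins the golden labels at the head-supervisor level (`expected_next`). The team-layer half of the guarantee, that `data_engineer` is the first worker dispatched, could be checked roughly as follows. This is a hedged sketch: `first_worker_violations`, the `DispatchFn` interface, and the canned queries are all hypothetical stand-ins, and a real evaluation would call the team-layer router instead of a lookup table.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# Hypothetical router interface: user query in, ordered worker names out.
DispatchFn = Callable[[str], List[str]]


def first_worker_violations(
    dispatch: DispatchFn,
    queries: Iterable[str],
    expected_first: str = "data_engineer",
) -> List[Tuple[str, Optional[str]]]:
    """Return (query, actual_first_worker) pairs where the first dispatch is wrong.

    An empty dispatch list is reported with None as the first worker.
    """
    violations = []
    for query in queries:
        order = dispatch(query)
        first = order[0] if order else None
        if first != expected_first:
            violations.append((query, first))
    return violations


# Canned dispatch orders standing in for real LLM router output (illustrative only).
canned = {
    "summarize sales.csv": ["data_engineer", "data_analyst"],
    "plot revenue.xlsx": ["data_analyst"],  # simulated regression: inspection skipped
    "profile events.json": [],  # simulated regression: nothing dispatched
}
violations = first_worker_violations(lambda q: canned[q], canned)
print(violations)
```

Returning the offending pairs instead of a bare boolean makes the failure message point at the exact query whose first branch drifted.
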
diff --git a/packages/prompt-kit/src/prompt_kit/prompts.py b/packages/prompt-kit/src/prompt_kit/prompts.py
@@ -40,7 +40,7 @@ class PromptTemplate(BaseModel):
 
 # TEAM SELECTION HINTS
 - If the latest user turn carries one or more image attachments, prefer `vision_team` (unless the user explicitly asked for repo work, research, etc.).
-- **If the latest user turn carries ANY data attachment (pdf, csv, xlsx, docx, json), you MUST route to `data_science_team`** — this team owns analysis, aggregation, chart/PNG generation, and document extraction. Do NOT route data-attachment turns to `coding_team` (no repo is bound for analysis-only requests) or to `research_team` (the data is already in the file). `data_science_team` runs sandboxed Python and saves real chart images.
+- **If the latest user turn carries ANY data attachment (pdf, csv, xlsx, docx, json), you MUST route to `data_science_team`** — this team owns analysis, aggregation, chart/PNG generation, and document extraction. The team supervisor will ALWAYS start with `data_engineer` (single-pass inspect/preview/profile brief) before handing off to `data_analyst` for calculations and chart PNG generation. Do NOT route data-attachment turns to `coding_team` (no repo is bound for analysis-only requests) or to `research_team` (the data is already in the file). `data_science_team` runs sandboxed Python and saves real chart images.
 - A request involving an attached spreadsheet/CSV/JSON and the phrase "차트/시각화/그래프/visualization/chart/plot/PNG/이미지" is ALWAYS a `data_science_team` task. `request_review` must stay `false` for these — the python_repl_data_tool sandbox is safe and needs no human approval.
 - If a repository is bound to the current thread AND the user is asking for code reads, edits, tests, refactors, or any repo-local implementation work, prefer `coding_team`. With no bound repo, do NOT route to `coding_team` — answer directly or via the finalizer instead.
 - For questions about current events, news, or "latest" topics, prefer `research_team` and do not rely on internal knowledge.

diff --git a/plans/DATA_SCIENCE_ANALYTICS_TEAM_PLAN.md b/plans/DATA_SCIENCE_ANALYTICS_TEAM_PLAN.md
@@ -243,8 +243,18 @@ Keep the set of tools first exposed in V1 small.
 
 - **PR #10** `ae261ad`: strengthened the DATA_ENGINEER/ANALYST/TEAM_SUPERVISOR/SYSTEM_SUPERVISOR/REVIEWER prompts + `python_repl_data_tool` `plt.close('all')` cleanup of accumulated figures + `team_supervisor` dynamically injects a dispatched_workers summary into the system prompt + added the handoff policy to `CLAUDE.md`
 - **PR #11** `c22d873`: `_safe_pyplot_savefig` / `_safe_figure_savefig` automatically inject `bbox_inches='tight'` + retry with `canvas.draw()` when the file is missing after savefig (resolves the S-E Hangul silent fail)
+- **PR #14** `d3ddf77`: five multi-turn follow-up fixes: LLMRouter parse-failure retry+salvage, current-turn-only redirect/dispatch counting in the head/team supervisors, prev/current separation in worker history notes, marking status="completed" at turn end even for a head that goes through the finalizer, and blocking nested matplotlib savefig monkey-patching
 
-Regression: pytest 316/316 PASS, vitest 88/88 PASS, build PASS, CI passing.
+### data_engineer first-branch guarantee: multi-layer verification (2026-05-22)
+
+In this session, whether `data_engineer` reliably branches as the **first worker** of a data-attachment turn was verified and reinforced across the following four layers:
+
+1. **SYSTEM_SUPERVISOR_PROMPT** `# TEAM SELECTION HINTS`: states that data attachments (csv/xlsx/json/pdf/docx) **MUST** go to `data_science_team`
+2. **TEAM_SUPERVISOR_PROMPT** `# DATA SCIENCE TEAM HANDOFF`: states that `data_engineer` performs only a ONE-pass inspection, after which the handoff ALWAYS goes to `data_analyst`
+3. **routing_eval/golden_dataset.json**: all 7 data_science cases (`data-001`~`data-007`) have `expected_next: data_science_team`, so the scorer detects a regression immediately
+4. **routing_eval `data_engineer_first` reinforcement case** (added in this session): a unit evaluation that checks whether the team-layer router selects `data_engineer` on the first dispatch
+
+Regression: pytest 184/184 PASS (after the second reduction), vitest 54/54 PASS, build PASS, CI passing.
 
 ## References