
test(routing-eval): add 6 data_science cases (PR #10/#11 regression line) #13

Merged
DONGRYEOLLEE1 merged 1 commit into main from test/routing-eval-data-science-cases on May 22, 2026

@DONGRYEOLLEE1
Owner

Summary

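Expands the routing_eval golden dataset from 12 to 18 cases. Adds 6 new cases to the data_science category so that the visualization/branching regressions from this session (PR #10, #11) are caught by quantitative measurement.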

New cases (S-A through S-F labels kept as-is)

  • data-002 trend.csv line chart
  • data-003 products.json category bar
  • data-004 multi_sheet.xlsx sheet comparison
  • data-005 sales.csv + metrics.csv multiple files
  • data-006 korean_sales.csv CJK labels
  • data-007 corrupt.pdf graceful failure
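
For readers unfamiliar with the golden dataset, the six cases above could be represented roughly as follows. This is only a sketch: the `GoldenCase` class and its field names are hypothetical and not taken from the repository, and the expected route `data_science_team` is stated in the commit message only for data-002; it is assumed for the other five because they share the data_science category.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class GoldenCase:
    """Hypothetical shape of one routing_eval golden case (not the repo's schema)."""
    case_id: str
    category: str
    attachments: Tuple[str, ...]
    expected_team: str


# File names come from the PR description. The expected team is confirmed
# only for data-002; for the remaining cases it is an assumption.
NEW_CASES = [
    GoldenCase("data-002", "data_science", ("trend.csv",), "data_science_team"),
    GoldenCase("data-003", "data_science", ("products.json",), "data_science_team"),
    GoldenCase("data-004", "data_science", ("multi_sheet.xlsx",), "data_science_team"),
    GoldenCase("data-005", "data_science", ("sales.csv", "metrics.csv"), "data_science_team"),
    GoldenCase("data-006", "data_science", ("korean_sales.csv",), "data_science_team"),
    GoldenCase("data-007", "data_science", ("corrupt.pdf",), "data_science_team"),
]
```

Keeping the cases as frozen records makes it harder for a test to mutate the golden data by accident, which is one reasonable choice for a regression baseline.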

Strengthens Plan §4.0 P5

Regressions are measured with the evaluation harness. The scorer also evaluates the new cases by top-1.
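
As a rough illustration of what top-1 scoring means here, a minimal sketch follows. The function name `top1_accuracy` and its input shapes are assumptions for illustration, not the actual scorer in this repository: a case counts as a hit only when the first-ranked route equals the expected one.

```python
from typing import Dict, List, Tuple


def top1_accuracy(
    cases: List[Tuple[str, str]],
    ranked: Dict[str, List[str]],
) -> float:
    """Fraction of cases whose top-ranked route matches the expected route.

    cases:  (case_id, expected_team) pairs (hypothetical input shape).
    ranked: case_id -> candidate teams, best candidate first.
    """
    if not cases:
        raise ValueError("cases must not be empty")
    hits = 0
    for case_id, expected in cases:
        candidates = ranked.get(case_id, [])
        # A missing or empty ranking counts as a miss, not an error.
        if candidates and candidates[0] == expected:
            hits += 1
    return hits / len(cases)
```

Treating a missing ranking as a miss rather than raising means a router that fails on one input still shows up as a lower score instead of aborting the whole evaluation run.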

Verification

  • pytest 316/316 + routing_eval 6/6 PASS
  • 0 regressions

🤖 Generated with Claude Code

…ine)

Expands the routing_eval golden dataset from 12 to 18 cases. Adds 6 new cases to the
data_science category so that this session's visualization/branching regressions are caught by quantitative measurement.

New cases:
- data-002 trend line CSV → data_science_team
- data-003 JSON category bar chart
- data-004 xlsx multi-sheet comparison
- data-005 multi-file comparison
- data-006 CJK label chart
- data-007 corrupt PDF graceful failure

Strengthens plan §4.0 P5 (regressions go through the evaluation harness). Updates CODEBASE_WIDE_REFACTORING_PLAN
§2.8 to reflect 18 cases, 10 categories, and a note on the data_science expansion.

Verification: pytest 316/316 + routing_eval 6/6 PASS, 0 regressions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@vercel

vercel Bot commented May 22, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| orchagent | Ready | Preview, Comment | May 22, 2026 3:45am |
| project-vdajw | Ready | Preview, Comment | May 22, 2026 3:45am |

@DONGRYEOLLEE1 DONGRYEOLLEE1 merged commit 8a9e573 into main May 22, 2026
6 of 7 checks passed
@DONGRYEOLLEE1 DONGRYEOLLEE1 deleted the test/routing-eval-data-science-cases branch May 22, 2026 03:47
