Objective
Test the end-to-end evaluation execution workflow, including running evaluations, viewing results, versioning, and re-running.
Tasks
Basic Execution
Results & Metrics
Versioning
Advanced Features
Technical Context
Files to verify:
- UI: packages/ui/src/views/evaluations/index.jsx
- UI: packages/ui/src/views/evaluations/CreateEvaluationDialog.jsx
- UI: packages/ui/src/views/evaluations/EvaluationResult.jsx
- UI: packages/ui/src/views/evaluations/EvaluationResultSideDrawer.jsx
- Service: packages/server/src/services/evaluations/index.ts
- Entity: packages/server/src/database/entities/Evaluation.ts
- Entity: packages/server/src/database/entities/EvaluationRun.ts
Acceptance Criteria