Summary
This release adds forceful query termination, comprehensive cost tracking across all pipeline stages, and several bug fixes.
What's Changed
Added
- Query Termination: Query.close() method for forceful query termination (#340)
- Comprehensive Cost Tracking: Stage-level cost breakdown in EvaluationOutput (#326)
- Total Cost Aggregation: All stage costs aggregated into total_cost_usd metric (#325)
- Evaluation Stage Costs: LLM costs tracked for both sync and batch modes (#324)
- Generation Stage Costs: LLM costs tracked during scenario generation (#323)
- Cost Utility: calculateCostFromUsage utility for SDK message responses (#322)
- CLI Documentation: Added CLI reference and improved help output (#315)
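As a rough illustration of the total cost aggregation described above, the sketch below sums per-stage costs into a single total. The `StageCosts` interface, its field names, and the `totalCostUsd` helper are hypothetical and chosen only for this example; the actual shape of EvaluationOutput and the way total_cost_usd is computed may differ.

```typescript
// Hypothetical per-stage cost breakdown. These field names are assumptions
// for illustration, not the real EvaluationOutput schema.
interface StageCosts {
  generation_usd: number;
  execution_usd: number;
  evaluation_usd: number;
}

// Sum every stage cost into one total, mirroring the idea behind
// the total_cost_usd metric.
function totalCostUsd(stages: StageCosts): number {
  return stages.generation_usd + stages.execution_usd + stages.evaluation_usd;
}

const exampleCosts: StageCosts = {
  generation_usd: 0.25,
  execution_usd: 0.5,
  evaluation_usd: 0.125,
};
const exampleTotal = totalCostUsd(exampleCosts);
```

Keeping the breakdown and the total in one place makes it easier to spot a stage whose cost was left out of the sum.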
Changed
- Updated Anthropic tooling versions
- Removed redundant top-level cost fields from EvaluationOutput (#328)
- Upgraded Claude Code Action workflows to Opus 4.5
Fixed
- YAML null values normalized to undefined before config validation (#339)
- Execution cost estimation formula corrected (#332)
- Plugin load costs included in evaluation metrics total
- Plugin load API costs tracked in execution metrics (#331)
- All stage costs tracked in E2E tests
- Config files aligned with schema model defaults
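For context on the YAML fix (#339): a YAML key with an empty value parses to null, and schema validators often accept undefined for an optional field but reject null. A minimal sketch of the general technique follows, assuming a recursive walk over the parsed config. The `normalizeNulls` name is hypothetical and this is not the project's actual implementation.

```typescript
// Hypothetical helper: recursively replace null with undefined so that
// optional fields fall back to their schema defaults during validation.
function normalizeNulls(value: unknown): unknown {
  if (value === null) return undefined;
  if (Array.isArray(value)) return value.map(normalizeNulls);
  if (typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value as Record<string, unknown>)) {
      out[key] = normalizeNulls(inner);
    }
    return out;
  }
  // Strings, numbers, booleans, and undefined pass through unchanged.
  return value;
}

// Stand-in for a parsed YAML document in which some keys were left empty.
const parsedConfig = { model: null, limits: { retries: null, timeout: 30 }, tags: [null, "a"] };
const normalizedConfig = normalizeNulls(parsedConfig) as any;
```

Running the normalization before validation means the validator only ever sees undefined for missing values, whether the key was omitted or left empty.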
Full Changelog: v0.3.0...v0.4.0