v0.2: Make DATA-01 decision-grade #6
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1db9638f39
    if "max_words" in report_config:
        count = word_count(text)
        maximum = int(report_config["max_words"])
        checks.append(check("report_max_words", count <= maximum, 0.04, f"got={count}"))
Enforce report word-limit failures as policy violations
When the `report_max_words` check fails, the scorer only records a failed check; it does not add a policy violation or a score cap, so overlong reports can still pass. For example, with otherwise-correct artifacts, a 500-word report in `case_001` still returns `success=True`, because the score remains above `PASS_THRESHOLD` and success is gated on `not violations`, which stays true when no violation is recorded. This makes the configured `max_words` constraint effectively non-blocking and undermines the task's "concise report" requirement.
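
One possible shape for the fix is sketched below. It is a minimal illustration, not the repository's scorer: `score_report`, the `PASS_THRESHOLD` value, the violation string, and the bodies of `word_count` and `check` are all assumptions made so the example runs on its own. The only idea it carries over from the comment above is that a failed `report_max_words` check should also append to `violations`, so that `success` becomes false.

```python
from __future__ import annotations

from typing import Any

PASS_THRESHOLD = 0.8  # hypothetical value; the real threshold is not shown in this review


def word_count(text: str) -> int:
    """Stand-in for the scorer's helper: counts whitespace-delimited tokens."""
    return len(text.split())


def check(name: str, passed: bool, weight: float, detail: str = "") -> dict[str, Any]:
    """Stand-in for the scorer's check(); the returned shape is an assumption."""
    return {"name": name, "passed": passed, "weight": weight, "detail": detail}


def score_report(text: str, report_config: dict[str, Any], base_score: float = 1.0) -> dict[str, Any]:
    """Hypothetical scorer fragment: an overlong report becomes a policy violation."""
    checks: list[dict[str, Any]] = []
    violations: list[str] = []
    score = base_score

    if "max_words" in report_config:
        count = word_count(text)
        maximum = int(report_config["max_words"])
        passed = count <= maximum
        checks.append(check("report_max_words", passed, 0.04, f"got={count}"))
        if not passed:
            # Lose the check's weight AND record a blocking violation.
            score -= 0.04
            violations.append(f"report_max_words_exceeded: got={count}, max={maximum}")

    # Success requires both a passing score and no policy violations.
    success = score >= PASS_THRESHOLD and not violations
    return {"success": success, "score": score, "checks": checks, "violations": violations}


long_report = " ".join(["word"] * 500)
result = score_report(long_report, {"max_words": 300})
print(result["success"], result["violations"])
```

With this structure, the 500-word report fails even though its score stays above the threshold. A score cap would be an alternative to a violation; which of the two fits better depends on how the rest of the scorer treats other blocking constraints.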
1db9638 to eae65f7
Closes #3.
Summary:
Test plan: