fix(groomer): grammar-enforce label allowlist and proposedTitle bounds#542
Conversation
The json_schema was looser than validateGroomerOutput, so a grammar-constrained 4B could still emit output the validator rejects — both seen in prod: labelsToAdd '[type/refactor]' (invented label) and proposedTitle '' (emitted empty instead of omitting). Constrain label items to the exported ALLOWED_GROOMER_LABELS enum and bound proposedTitle (10-200 chars or null) / proposedBody (<10k or null) in the grammar itself.
There was a problem hiding this comment.
AI Automated Review
Full PR review.
Analysis engine: MiniMax-M2.7@https://litellm.jory.dev/v1 (anthropic) — routed smart (risk match: db_or_migration_changes)
PR Review: fix(groomer): grammar-enforce label allowlist and proposedTitle bounds
Summary
This PR tightens the JSON schema emitted by buildGroomerResponseSchema so that small LLMs cannot produce output that fails downstream validation. It adds enum constraints to labelsToAdd/labelsToRemove and length bounds to proposedTitle/proposedBody.
Required Checks
-
review migration for data loss risk: ✅ NOT APPLICABLE — This PR contains no database migration. The changed files are all TypeScript sources in
src/lib/groomer/(llm.ts,llm.test.ts,schema.ts). There is noprisma/schema.prismachange, no migration file, and no Prisma operations. Thedb_or_migration_changesclassification appears to be a misclassification; the PR is a pure code/schema change. -
test migration on a copy of production schema: ✅ NOT APPLICABLE — No migration exists to test. All CI checks passed (Build, Typecheck, Lint, Tests: 194 passed).
Change-by-Change Findings
| File | Change | Assessment |
|---|---|---|
src/lib/groomer/schema.ts |
Exports ALLOWED_GROOMER_LABELS (a readonly array of status + priority + type labels). Refactors isAllowedLabel to use .includes() on that array. |
Clean refactor; centralizes the allowlist in one place. |
src/lib/groomer/llm.ts |
Enum-constrains labelsToAdd/labelsToRemove items to ALLOWED_GROOMER_LABELS; adds anyOf bounds to proposedTitle (null or 10–200 chars) and proposedBody (null or ≤9999 chars). |
Addresses both prod failures described in the PR body: the "type/refactor" invented label and the empty-string proposedTitle. |
src/lib/groomer/llm.test.ts |
Adds two new test cases verifying enum constraint and title/body bounds. | Coverage is sufficient; verifies the exact failure mode seen in prod. |
Standards Compliance
- Code Standards (AGENTS.md): ✅ No agent-specific names in docs; ✅ Prisma schema not touched; ✅ no commit of secrets. The changes follow the existing pattern of schema-constrained LLM output introduced in PR PR 516.
- Error handling: ✅ Uses existing
validateGroomerOutput+ 400→json_object fallback pattern mentioned in the PR body. - The PR follows the repository's established convention of using
json_schema(withanyOf/enum) for LLM output validation.
Linked Issue Fit
No linked issue was found in the PR body. The PR is self-contained and directly addresses two described production failures without needing external issue context.
Evidence Provider Findings
No evidence providers were configured; none to evaluate.
Tool Harness Findings
No tool harness calls were issued; corpus reviewed directly.
Unknowns / Needs Verification
- The
db_or_migration_changesclassification is inconsistent with the diff: no migration files exist in this PR. The classifier may have triggered on the word "schema" in file names (schema.ts,buildGroomerResponseSchema). This does not block approval but is worth noting. - The
maxLength: 9999bound onproposedBodywas not explicitly called out in the PR body. The validator's actual upper bound is not shown in the diff, but the test confirms the expected value. Assuming this matchesvalidateGroomerOutput(the PR states "grammar was looser thanvalidateGroomerOutput; now the model physically can't emit either"), this is consistent.
Recommendation
Approve. The changes are internally consistent, all CI checks pass, tests cover the new constraints, and the approach directly fixes the two production failures described in the PR body. No database or migration impact exists.
Summary
buildGroomerResponseSchemanow enum-constrainslabelsToAdd/labelsToRemoveitems to the validator's allowlist (newly exportedALLOWED_GROOMER_LABELS) and boundsproposedTitle(nullor 10–200 chars) /proposedBody(nullor <10k).Fixes the two prod groom failures on the vision 4B:
labelsToAdd contains disallowed label: type/refactor(invented label) andproposedTitle must be between 10 and 200 characters, got 0(empty string instead of omitting). The grammar was looser thanvalidateGroomerOutput; now the model physically can't emit either. llama.cpp supports enum/anyOf/min-maxLength in json_schema; if a backend rejects the schema, the existing 400→json_object fallback still applies.Verification
vitest run src/lib/groomer/→ 194 passed (+2);tsc+eslintclean.