feat(groomer): schema-constrained LLM output (json_schema) #516
The groomer used response_format {type:"json_object"} — valid JSON, but not
schema-conformant, so a model could emit wrong/missing fields. Switch to
{type:"json_schema"} with a JSON Schema for the groomer output, so a
llama.cpp backend (via litellm) grammar-constrains decoding to the exact
shape. This is what lets a small model (e.g. the 4B vision model) produce
reliable structure.
- buildGroomerResponseSchema() constrains structure + fixed enums, and builds
lane.id as a dynamic enum from the configured lanes (getLaneIds) so the
model can't emit a hallucinated lane.
- Falls back to json_object on a 400 (backend without json_schema support), so
grooming never breaks. validateGroomerOutput still runs as the safety net and
handles enum alias canonicalization — belt and suspenders.
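The safety-net step described above could look roughly like the following. This is only a sketch under stated assumptions: the real `validateGroomerOutput` is not shown in this PR text, so the alias table, the helper names `canonicalizeConfidence` and `checkLaneSketch`, and the error messages are all hypothetical.

```typescript
// Canonical confidence values, as listed in the review of this PR.
const CONFIDENCE_LEVELS = ["high", "medium", "low"] as const;
type Confidence = (typeof CONFIDENCE_LEVELS)[number];

// Hypothetical alias table: the aliases the real validator accepts are not
// documented here, so these entries are illustrative only.
const CONFIDENCE_ALIASES = new Map<string, Confidence>([
  ["hi", "high"],
  ["med", "medium"],
  ["lo", "low"],
]);

// Map a raw model value onto a canonical enum member, or undefined if unknown.
function canonicalizeConfidence(raw: unknown): Confidence | undefined {
  if (typeof raw !== "string") return undefined;
  const key = raw.trim().toLowerCase();
  if ((CONFIDENCE_LEVELS as readonly string[]).includes(key)) {
    return key as Confidence;
  }
  return CONFIDENCE_ALIASES.get(key);
}

// Post-decode check for the lane object: rejects lanes that are not configured,
// which matters when the backend fell back to unconstrained json_object output.
function checkLaneSketch(
  raw: unknown,
  laneIds: string[],
): { id: string; confidence?: Confidence } {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("lane must be an object");
  }
  const lane = raw as Record<string, unknown>;
  if (typeof lane.id !== "string" || !laneIds.includes(lane.id)) {
    throw new Error(`unknown lane: ${String(lane.id)}`);
  }
  const confidence = canonicalizeConfidence(lane.confidence);
  return confidence ? { id: lane.id, confidence } : { id: lane.id };
}
```

Keeping this check after decoding means the schema and the validator fail independently: a backend that ignores `json_schema` still cannot push an unknown lane past the groomer.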
Related: #498 (dispatch audit). Enables running the groomer on a small local model.
Grooming is triage/classification, not code — a small model is plenty. Point DISPATCH_GROOMER_MODEL at `vision` (Qwen3.5-4B / llama-vision via litellm), freeing the bigger model for the coder/reviewer. dispatch's json_schema response_format (misospace/dispatch#516) grammar-constrains the 4B's output so structure stays reliable.
AI Automated Review
Full PR review.
Analysis engine: MiniMax-M2.7@https://litellm.jory.dev/v1 (anthropic) — escalated (fast_low_confidence)
Recommendation: Approve
This PR cleanly upgrades the groomer's LLM output from an unconstrained {type:"json_object"} to a schema-constrained {type:"json_schema"} format with a dynamic lane enum, while maintaining backward compatibility through a graceful fallback.
Change-by-Change Findings
src/lib/groomer/llm.ts
buildGroomerResponseSchema() (new function, lines 67–131)
- Builds a strict JSON Schema with `additionalProperties: false` to enforce exact shape
- `lane.id` is a dynamic enum populated from `getLaneIds()`, ensuring the model can only emit configured lanes; this eliminates hallucinated lane IDs
- `confidence` is an enum of `["high","medium","low"]` shared across `lane.confidence` and the top-level field
- Required fields are `labelsToAdd`, `labelsToRemove`, `lane`, consistent with the existing `GroomerOutput` schema contract
- Belt-and-suspenders approach with `validateGroomerOutput` still running post-decode
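A minimal sketch of how a response format with a dynamic lane enum might be assembled, based only on the properties listed above. The function name `buildResponseFormatSketch`, the schema name `groomer_output`, and the item types of the label arrays are assumptions; the PR's actual `buildGroomerResponseSchema()` body is not reproduced in this review.

```typescript
const CONFIDENCE = ["high", "medium", "low"] as const;

interface JsonSchemaResponseFormat {
  type: "json_schema";
  json_schema: { name: string; schema: Record<string, unknown> };
}

// laneIds would come from getLaneIds() in the real code; here it is a parameter
// so the sketch stays self-contained.
function buildResponseFormatSketch(laneIds: string[]): JsonSchemaResponseFormat {
  // An empty enum would make every output invalid, so fail early instead.
  if (laneIds.length === 0) {
    throw new Error("no lanes configured");
  }
  return {
    type: "json_schema",
    json_schema: {
      name: "groomer_output", // assumed name, not taken from the PR
      schema: {
        type: "object",
        additionalProperties: false,
        required: ["labelsToAdd", "labelsToRemove", "lane"],
        properties: {
          labelsToAdd: { type: "array", items: { type: "string" } },
          labelsToRemove: { type: "array", items: { type: "string" } },
          confidence: { type: "string", enum: [...CONFIDENCE] },
          lane: {
            type: "object",
            additionalProperties: false,
            required: ["id"],
            properties: {
              // Dynamic enum: only configured lanes are representable.
              id: { type: "string", enum: [...laneIds] },
              confidence: { type: "string", enum: [...CONFIDENCE] },
            },
          },
        },
      },
    },
  };
}
```

With the default lanes mentioned in this review, `buildResponseFormatSketch(["default", "backlog"])` yields a schema under which a grammar-constrained decoder cannot produce any other `lane.id`.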
postChatCompletion() (new helper, lines 133–149)
- Extracted from `callGroomerLLM` to enable the retry pattern without duplicating the fetch call
- Correctly passes `AbortSignal` for timeout cancellation in both the initial call and fallback
- Temperature fixed at `0.1`, unchanged from previous behavior
callGroomerLLM() (modified, lines 151–169)
- Attempts `{type:"json_schema", json_schema: {...}}` first, as intended
- On `response.status === 400` it retries with `{type:"json_object"}`, the documented graceful fallback
- The `controller.signal` is correctly threaded into both `postChatCompletion` calls, so timeout cancellation is preserved
- The `timeoutId` is still cleared in the `finally` block (visible in the diff's continuation)
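The retry flow described in these bullets can be sketched as follows. This is an illustration under assumptions, not the PR's code: `callWithSchemaFallback`, the `FetchLike` type, the injected `fetchImpl` parameter, and the 30-second default timeout are hypothetical, and the model name `vision` is borrowed from the related commit message.

```typescript
type ResponseFormat =
  | { type: "json_object" }
  | { type: "json_schema"; json_schema: { name: string; schema: object } };

// Narrow fetch-like signature so the sketch can be exercised with a fake.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string; signal: AbortSignal },
) => Promise<{ status: number; ok: boolean; text(): Promise<string> }>;

async function callWithSchemaFallback(
  fetchImpl: FetchLike,
  url: string,
  messages: { role: string; content: string }[],
  schemaFormat: ResponseFormat,
  timeoutMs = 30_000, // assumed default
): Promise<string> {
  const controller = new AbortController();
  const timeoutId = setTimeout(() => controller.abort(), timeoutMs);

  // Single helper for both attempts, so the same signal covers the retry too.
  const post = (response_format: ResponseFormat) =>
    fetchImpl(url, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model: "vision", temperature: 0.1, messages, response_format }),
      signal: controller.signal,
    });

  try {
    let response = await post(schemaFormat);
    // A 400 is treated as "backend does not support json_schema": retry once.
    if (response.status === 400) {
      response = await post({ type: "json_object" });
    }
    if (!response.ok) {
      throw new Error(`LLM request failed (${response.status}): ${await response.text()}`);
    }
    return await response.text();
  } finally {
    // Always clear the timer, whether the call succeeded, failed, or aborted.
    clearTimeout(timeoutId);
  }
}
```

Because both attempts share one `AbortController`, the timeout bounds the total time spent across the initial call and the fallback rather than restarting for the retry.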
src/lib/groomer/llm.test.ts
Existing test assertion (line 34): changed from `expect(capturedBody.response_format?.type).toBe("json_object")` to `json_schema`, which correctly reflects the new default behavior.
buildGroomerResponseSchema test (lines 191–201):
- Verifies required fields, `additionalProperties: false`, and that the `lane.id` enum matches `getLaneIds()`
- Checks `laneId.enum.length > 0` to guard against empty lane configs
callGroomerLLM response_format tests (lines 203–230):
- Test 1 verifies the first call sends `json_schema` with the correct schema name and dynamic lane enum
- Test 2 verifies the 400→`json_object` fallback: mock returns 400 first, then 200, and asserts both call bodies and a successful parsed result
Standards Compliance
| Convention | Status |
|---|---|
| Uses `getLaneIds()` from `@/lib/lane-config` for lane configuration | ✅ Verified via git_grep hits |
| No new secrets or environment variables | ✅ No secrets in diff |
| Error handling with `error instanceof Error` pattern | ✅ `if (!response.ok)` with text extraction |
| API returns appropriate HTTP status codes | ✅ 400 triggers fallback, not error |
| TypeScript types via Prisma/schema | ✅ Uses existing `GroomerOutput` type |
| No hardcoded agent names in generic docs | ✅ N/A (library code) |
Evidence Provider Findings
No evidence providers configured. No findings.
Tool Harness Findings
| Tool | Finding | Resolution |
|---|---|---|
| `getLaneIds` usage scan | Used correctly in `llm.ts` (imported, called at line 76, result used in schema) and test (imported, used in assertions) | ✅ Verified |
| `lane-config.ts` read | Shows `getLaneIds()` at line 150 returning an array of configured lane IDs; default config has `default` and `backlog` | ✅ Consistent with test expectations |
| `llm.ts` full read | Confirms `buildGroomerResponseSchema` is exported and `callGroomerLLM` uses `postChatCompletion` with signal threading | ✅ Verified |
Linked Issue Fit
No linked issue references found in the PR body. The PR references #498 as "Related", but it is not linked as an acceptance-criteria source, so this review cannot assess requirement fulfillment against #498.
CI Check Results
All checks passed for commit b1eacda6:
- Docker Build ✅
- Typecheck ✅
- Build ✅
- Tests ✅
- Lint ✅
Unknowns / Needs Verification
- Live validation (post-merge): The PR body states that manual verification is needed after merge: confirm litellm forwards `json_schema` through to llama.cpp. This is expected and documented; no blocker.
- #498 acceptance criteria: Referenced as "Related" but not linked as a formal requirement source. No formal acceptance criteria to verify.
Findings
No findings to report.
Summary
- Switches `response_format` from `{type:"json_object"}` (valid JSON, unconstrained shape) to `{type:"json_schema"}` with a JSON Schema for the groomer output. On the self-hosted llama.cpp backend (via litellm, which forwards `response_format`), this grammar-constrains decoding to the exact shape, which is the key to reliable output from a small model.
- `lane.id` is a dynamic enum built from the configured lanes (`getLaneIds()`), so the model can only emit a real lane.
- On a `400` (backend without `json_schema` support) it retries with `{type:"json_object"}`, so grooming never breaks. `validateGroomerOutput` still runs as the safety net (and handles enum alias canonicalization): belt and suspenders.
Why
Enables running the groomer on a small local model (e.g. the 4B `vision` model) reliably: the schema/grammar enforces structure that a 4B otherwise gets wrong, while the existing coercion layer handles the rest.
Verification
`vitest run src/lib/groomer/` → 190 passing (3 new: schema shape + dynamic lane enum, json_schema sent first, 400→json_object fallback). `tsc --noEmit` → 0; `eslint` → clean.
Live validation to do after merge
Confirm litellm forwards `json_schema` through to llama.cpp: dry-run a groom against the `vision` model and check the output is schema-valid. (Safe regardless: the fallback + validator mean worst case is today's behavior.)
Related: #498