feat(groomer): schema-constrained LLM output (json_schema) by joryirving · Pull Request #516 · misospace/dispatch

joryirving · 2026-07-01T21:23:03Z

Summary

Upgrade the groomer's response_format from {type:"json_object"} (valid JSON, unconstrained shape) to {type:"json_schema"} with a JSON Schema for the groomer output. On the self-hosted llama.cpp backend (via litellm, which forwards response_format), this grammar-constrains decoding to the exact shape — the key to reliable output from a small model.
lane.id is a dynamic enum built from the configured lanes (getLaneIds()), so the model can only emit a real lane.
Graceful fallback: on a 400 (backend without json_schema support) it retries with {type:"json_object"}, so grooming never breaks. validateGroomerOutput still runs as the safety net (and handles enum alias canonicalization) — belt and suspenders.
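To illustrate the safety-net role described above: on the json_object fallback path nothing constrains the shape, so the validator has to catch bad values itself. The sketch below is not the PR's validateGroomerOutput. The alias table, the helper names `canonicalizeConfidence` and `validateLane`, and the error messages are assumptions made for the example; only the three confidence values and the idea of a configured lane list come from this PR.

```typescript
type Confidence = "high" | "medium" | "low";

// Hypothetical alias table. The aliases the real validator accepts are not shown in the PR.
// A Map is used so that keys such as "constructor" cannot resolve to inherited properties.
const CONFIDENCE_ALIASES = new Map<string, Confidence>([
  ["high", "high"],
  ["hi", "high"],
  ["medium", "medium"],
  ["med", "medium"],
  ["low", "low"],
  ["lo", "low"],
]);

// Normalizes case and whitespace, then maps known aliases to the canonical enum value.
function canonicalizeConfidence(raw: unknown): Confidence | undefined {
  if (typeof raw !== "string") return undefined;
  return CONFIDENCE_ALIASES.get(raw.trim().toLowerCase());
}

// Rejects lanes that are not configured and canonicalizes the optional confidence.
function validateLane(
  raw: unknown,
  laneIds: readonly string[],
): { id: string; confidence?: Confidence } {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("lane must be an object");
  }
  const { id, confidence } = raw as { id?: unknown; confidence?: unknown };
  if (typeof id !== "string" || !laneIds.includes(id)) {
    throw new Error(`unknown lane id: ${String(id)}`);
  }
  const canonical = canonicalizeConfidence(confidence);
  return canonical ? { id, confidence: canonical } : { id };
}

// Usage: an unconstrained model reply with a shouted confidence value.
console.log(validateLane({ id: "default", confidence: " HIGH " }, ["default", "backlog"]));
// → { id: "default", confidence: "high" }
```

With schema-constrained decoding this code should rarely have anything to correct, which is the point of keeping it: it only does real work when the backend ignores or rejects the schema.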

Why

Enables running the groomer on a small local model (e.g. the 4B vision model) reliably: the schema/grammar enforces structure that a 4B otherwise gets wrong, while the existing coercion layer handles the rest.

Verification

vitest run src/lib/groomer/ → 190 passing (3 new: schema shape + dynamic lane enum, json_schema sent first, 400→json_object fallback).
tsc --noEmit → 0; eslint → clean.

Live validation to do after merge

Confirm litellm forwards json_schema through to llama.cpp: dry-run a groom against the vision model and check the output is schema-valid. (Safe regardless — the fallback + validator mean worst case is today's behavior.)

Related: #498

The groomer used response_format {type:"json_object"}: valid JSON, but not schema-conformant, so a model could emit wrong/missing fields. Switch to {type:"json_schema"} with a JSON Schema for the groomer output, so a llama.cpp backend (via litellm) grammar-constrains decoding to the exact shape. This is what lets a small model (e.g. the 4B vision model) produce reliable structure.

- buildGroomerResponseSchema() constrains structure + fixed enums, and builds lane.id as a dynamic enum from the configured lanes (getLaneIds) so the model can't emit a hallucinated lane.
- Falls back to json_object on a 400 (backend without json_schema support), so grooming never breaks. validateGroomerOutput still runs as the safety net and handles enum alias canonicalization (belt and suspenders).

Related: #498 (dispatch audit). Enables running the groomer on a small local model.

Grooming is triage/classification, not code — a small model is plenty. Point DISPATCH_GROOMER_MODEL at `vision` (Qwen3.5-4B / llama-vision via litellm), freeing the bigger model for the coder/reviewer. dispatch's json_schema response_format (misospace/dispatch#516) grammar-constrains the 4B's output so structure stays reliable.

its-saffron

AI Automated Review

Full PR review.

Analysis engine: MiniMax-M2.7@https://litellm.jory.dev/v1 (anthropic) — escalated (fast_low_confidence)

Recommendation: Approve

This PR cleanly upgrades the groomer's LLM output from an unconstrained {type:"json_object"} to a schema-constrained {type:"json_schema"} format with a dynamic lane enum, while maintaining backward compatibility through a graceful fallback.

Change-by-Change Findings

`src/lib/groomer/llm.ts`

buildGroomerResponseSchema() (new function, lines 67–131)

Builds a strict JSON Schema with additionalProperties: false to enforce exact shape
lane.id is a dynamic enum populated from getLaneIds(), ensuring the model can only emit configured lanes — eliminates hallucinated lane IDs
confidence is an enum of ["high","medium","low"] shared across lane.confidence and the top-level field
Required fields are labelsToAdd, labelsToRemove, lane — consistent with the existing GroomerOutput schema contract
Belt-and-suspenders approach with validateGroomerOutput still running post-decode
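The findings above can be pictured with a minimal sketch of such a schema builder. This is a reconstruction from the review's description, not the code in the diff: the item types of labelsToAdd and labelsToRemove, the required list inside lane, the schema name `groomer_output`, the empty-lane check, and the local `getLaneIds` stand-in are all assumptions.

```typescript
// Stand-in for getLaneIds() from "@/lib/lane-config". The review reports that the
// default config contains the lanes "default" and "backlog".
function getLaneIds(): string[] {
  return ["default", "backlog"];
}

const CONFIDENCE_LEVELS = ["high", "medium", "low"];

export function buildGroomerResponseSchema(laneIds: string[] = getLaneIds()) {
  // Assumption: fail fast on an empty lane list, since an empty enum matches nothing.
  if (laneIds.length === 0) {
    throw new Error("no lanes configured");
  }
  return {
    name: "groomer_output", // assumed schema name
    schema: {
      type: "object",
      additionalProperties: false,
      required: ["labelsToAdd", "labelsToRemove", "lane"],
      properties: {
        // Assumption: labels are plain strings.
        labelsToAdd: { type: "array", items: { type: "string" } },
        labelsToRemove: { type: "array", items: { type: "string" } },
        lane: {
          type: "object",
          additionalProperties: false,
          required: ["id"], // assumption
          properties: {
            // Dynamic enum: the model can only pick a lane that is configured.
            id: { type: "string", enum: [...laneIds] },
            confidence: { type: "string", enum: [...CONFIDENCE_LEVELS] },
          },
        },
        // The same three-value enum is reused for the top-level field.
        confidence: { type: "string", enum: [...CONFIDENCE_LEVELS] },
      },
    },
  };
}

// Usage: wrap the schema in the response_format object sent with the chat request.
const responseFormat = { type: "json_schema", json_schema: buildGroomerResponseSchema() };
console.log(JSON.stringify(responseFormat.json_schema.schema.properties.lane.properties.id));
// → {"type":"string","enum":["default","backlog"]}
```

Building the enum at call time rather than hard-coding it means a lane added to the config becomes a legal output without touching the schema code.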

postChatCompletion() (new helper, lines 133–149)

Extracted from callGroomerLLM to enable the retry pattern without duplicating the fetch call
Correctly passes AbortSignal for timeout cancellation in both the initial call and fallback
Temperature fixed at 0.1 — unchanged from previous behavior

callGroomerLLM() (modified, lines 151–169)

Attempts {type:"json_schema", json_schema: {...}} first, as intended
On response.status === 400 it retries with {type:"json_object"} — the documented graceful fallback
The controller.signal is correctly threaded into both postChatCompletion calls, so timeout cancellation is preserved
The timeoutId is still cleared in the finally block (visible in the diff's continuation)
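The retry flow described for postChatCompletion and callGroomerLLM might look roughly like the following. It is a sketch under stated assumptions rather than the PR's code: the parameter lists, the injected `post` function (used here so the example runs without a network), the 30 second default timeout, and the single-message prompt are all hypothetical. The temperature of 0.1, the retry on status 400, the shared abort signal, and the clearTimeout in finally follow the review.

```typescript
// Minimal response shape so the sketch does not depend on fetch typings.
interface HttpResponse {
  ok: boolean;
  status: number;
  text(): Promise<string>;
}

type HttpPost = (
  url: string,
  init: { method: "POST"; headers: Record<string, string>; body: string; signal: AbortSignal },
) => Promise<HttpResponse>;

type ResponseFormat =
  | { type: "json_object" }
  | { type: "json_schema"; json_schema: { name: string; schema: object } };

// One request attempt. Extracting it lets the caller retry without duplicating the call.
function postChatCompletion(
  post: HttpPost,
  url: string,
  model: string,
  prompt: string,
  responseFormat: ResponseFormat,
  signal: AbortSignal,
): Promise<HttpResponse> {
  return post(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model,
      messages: [{ role: "user", content: prompt }],
      temperature: 0.1,
      response_format: responseFormat,
    }),
    signal,
  });
}

export async function callGroomerLLM(
  post: HttpPost,
  url: string,
  model: string,
  prompt: string,
  jsonSchema: { name: string; schema: object },
  timeoutMs = 30_000, // assumed default
): Promise<string> {
  const controller = new AbortController();
  const timeoutId = setTimeout(() => controller.abort(), timeoutMs);
  try {
    // First attempt: schema-constrained output.
    let response = await postChatCompletion(
      post, url, model, prompt,
      { type: "json_schema", json_schema: jsonSchema },
      controller.signal,
    );
    // A 400 is treated as "backend does not support json_schema": retry unconstrained.
    if (response.status === 400) {
      response = await postChatCompletion(
        post, url, model, prompt, { type: "json_object" }, controller.signal,
      );
    }
    if (!response.ok) {
      throw new Error(`groomer LLM request failed (${response.status}): ${await response.text()}`);
    }
    return await response.text();
  } finally {
    // Runs on success, failure, and abort, so the timer never outlives the call.
    clearTimeout(timeoutId);
  }
}
```

Because both attempts share one AbortController, the timeout bounds the whole call including the retry, not each attempt separately. One trade-off of keying the fallback on status 400 alone is that an unrelated bad request also triggers a second attempt, which then fails through the normal `!response.ok` path.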

`src/lib/groomer/llm.test.ts`

Existing test assertion (line 34): Changed from expect(capturedBody.response_format?.type).toBe("json_object") → json_schema — correctly reflects the new default behavior.

buildGroomerResponseSchema test (lines 191–201):

Verifies required fields, additionalProperties: false, and lane.id enum matches getLaneIds()
Checks laneId.enum.length > 0 to guard against empty lane configs

callGroomerLLM response_format tests (lines 203–230):

Test 1 verifies the first call sends json_schema with the correct schema name and dynamic lane enum
Test 2 verifies the 400→json_object fallback: mock returns 400 first, then 200, and asserts both call bodies and a successful parsed result

Standards Compliance

| Convention | Status |
| --- | --- |
| Uses `getLaneIds()` from `@/lib/lane-config` for lane configuration | ✅ Verified via `git_grep` hits |
| No new secrets or environment variables | ✅ No secrets in diff |
| Error handling with `error instanceof Error` pattern | ✅ `if (!response.ok)` with text extraction |
| API returns appropriate HTTP status codes | ✅ 400 triggers fallback, not error |
| TypeScript types via Prisma/schema | ✅ Uses existing `GroomerOutput` type |
| No hardcoded agent names in generic docs | ✅ N/A (library code) |

Evidence Provider Findings

No evidence providers configured. No findings.

Tool Harness Findings

| Tool | Finding | Resolution |
| --- | --- | --- |
| `getLaneIds` usage scan | Used correctly in `llm.ts` (imported, called at line 76, result used in schema) and test (imported, used in assertions) | ✅ Verified |
| `lane-config.ts` read | Shows `getLaneIds()` at line 150 returning an array of configured lane IDs; default config has `default` and `backlog` | ✅ Consistent with test expectations |
| `llm.ts` full read | Confirms `buildGroomerResponseSchema` is exported and `callGroomerLLM` uses `postChatCompletion` with signal threading | ✅ Verified |

Linked Issue Fit

No linked issue references found in the PR body. The PR references PR 498 as "Related" but it is not linked as an acceptance-criteria source. This review cannot assess requirement fulfillment against PR 498.

CI Check Results

All checks passed for commit b1eacda6:

Docker Build ✅
Typecheck ✅
Build ✅
Tests ✅
Lint ✅

Unknowns / Needs Verification

Live validation (post-merge): The PR body states a manual verification is needed after merge — confirm litellm forwards json_schema through to llama.cpp. This is expected and documented; no blocker.
PR 498 acceptance criteria: Referenced as "Related" but not linked as a formal requirement source. No formal acceptance criteria to verify.

Findings

No findings to report.

joryirving mentioned this pull request Jul 1, 2026

feat(llm): groom with the 4B vision model joryirving/home-ops#7968

Merged

joryirving enabled auto-merge (squash) July 1, 2026 21:38

Merge branch 'main' into feat/groomer-structured-output

b1eacda

joryirving merged commit 8b50c25 into main Jul 1, 2026
6 checks passed

joryirving deleted the feat/groomer-structured-output branch July 1, 2026 21:43

its-saffron Bot approved these changes Jul 1, 2026

its-saffron Bot mentioned this pull request Jul 2, 2026

chore(release): bump version to 0.5.16 #538

Merged

joryirving merged 2 commits into main from feat/groomer-structured-output
