Skip to content

v0.37.5.0 fix(markdown): YAML-aware NESTED_QUOTES validator (stops flagging valid YAML)#1229

Merged
garrytan merged 5 commits into
masterfrom
garrytan/warsaw-v4
May 21, 2026
Merged

v0.37.5.0 fix(markdown): YAML-aware NESTED_QUOTES validator (stops flagging valid YAML)#1229
garrytan merged 5 commits into
masterfrom
garrytan/warsaw-v4

Conversation

@garrytan
Copy link
Copy Markdown
Owner

@garrytan garrytan commented May 20, 2026

Summary

gbrain doctor stops flagging your tags as broken when they're not.

The validator at src/core/markdown.ts:219 was a syntactic count-of-quotes heuristic that flagged any frontmatter line with 3+ unescaped ". Valid YAML like tags: ["yc", "w2025"] got flagged — one user with a 105K-page brain saw 6,981 NESTED_QUOTES errors, the single largest contributor to frontmatter_integrity health.

The fix keeps the count-of-quotes fast path, then disambiguates suspicious lines with js-yaml.safeLoad(value). Only flag lines that genuinely fail to parse. Genuinely broken titles like title: "Foo "bar" baz" still trigger.

No data rewrite needed — the 6,981 errors disappear on next gbrain doctor because the existing data was already valid YAML; the validator was wrong about it.

Commits

  • fix(markdown): YAML-aware NESTED_QUOTES validator — the validator change + 5 new test cases + js-yaml promoted from transitive to direct dep
  • chore: bump version and changelog (v0.40.0.1) — VERSION + CHANGELOG + TODOS.md follow-up entry
  • chore: retarget version slot v0.40.0.1 -> v0.37.5.0 — same content, different queue slot

How this got here

@garrytan-agents opened #1217 with a one-line fix on the emitter side (switch tag serialization from double-quoted to single-quoted YAML). Outside-voice review during planning (codex) caught that the validator is the actual bug — even with a perfect emitter, the count heuristic flags valid single-quoted scalars like title: 'a: "b" "c"' (4+ unescaped " literal). Closed #1217 in favor of this validator-side fix. Thanks to @garrytan-agents for the 6,981-error signal that exposed the underlying class.

Test Coverage

5 new YAML-aware regression cases in test/markdown-validation.test.ts:

  • flow sequence with quoted tags does NOT trigger (6,981-error regression guard)
  • single-quoted scalar with literal inner double quotes does NOT trigger
  • escaped-as-'' quotes inside flow seq do NOT trigger
  • genuinely broken nested quotes STILL trigger
  • unclosed bracket STILL surfaces NESTED_QUOTES or YAML_PARSE (never silent)

All paths in the new fallback covered:

  • count < 3 fast path (existing tests)
  • count >= 3 && yaml.safeLoad succeeds (3 new tests)
  • count >= 3 && yaml.safeLoad throws (2 new tests)

Targeted: 24 pass / 0 fail on test/markdown-validation.test.ts
Downstream: 156 pass / 0 fail on markdown.test.ts, markdown-validation.test.ts, frontmatter-inference.test.ts, import-file.test.ts, sync-failures.test.ts
Full fast loop: 8040 pass / 0 fail / 0 skip in 394s
Typecheck: clean

Pre-Landing Review

Codex outside-voice review (during planning, see plan file) identified 11 findings; the load-bearing finding (F2) pivoted the approach from "fix emitter" to "fix validator" — that's what this PR ships. All 11 findings either addressed in the design or made moot by the new direction:

  • F1 (js-yaml v3 API mismatch) — avoided; we don't pass v4-only options
  • F2 (validator is dumb) — fixed: that's the whole PR
  • F3 (title regression risk) — avoided; emitter untouched
  • F4 (parseMarkdown signature wrong) — tests use correct (content, undefined, {validate:true})
  • F5 (downstream test misread) — irrelevant; emitter untouched
  • F6 (tests too weak) — new tests use adversarial inputs with 4+ unescaped "
  • F7-F10 (minor) — noted, no impact

No further findings on the diff itself. Validator change is additive (early-continue on the count<3 fast path stays); js-yaml dep is in lockfile already (just promoted to direct).

Plan Completion

All 6 tasks from the plan completed:

  • T1 — Close PR fix(frontmatter): use single quotes for tags to prevent NESTED_QUOTES errors #1217 with replacement pointer (state=CLOSED)
  • T2 — js-yaml + @types/js-yaml direct deps added
  • T3 — Validator at src/core/markdown.ts:219-238 made YAML-aware
  • T4 — 5 adversarial test cases added
  • T5 — Full fast loop green
  • T6 — VERSION bump + CHANGELOG + this PR
  • T7 — Filed as P3 follow-up in TODOS.md: unify emitter quoting with brain-writer.ts:184's single-quote-with-''-escape style (cosmetic only post-fix)

TODOS

Added section "v0.37.5.0 NESTED_QUOTES validator follow-up" with one P3 item (T7 from the plan — cosmetic emitter consistency, not a bug).

Documentation

CLAUDE.md "Key files" section already documents the NESTED_QUOTES validator location and frontmatter_integrity check — no drift introduced by this change. The validator's comment block was rewritten to explain the new logic.

Test plan

  • bun run typecheck clean
  • bun test test/markdown-validation.test.ts — 24/24 pass
  • Full fast loop — 8040/0
  • Post-merge: confirm gbrain doctor --json | jq '.checks[] | select(.name=="frontmatter_integrity") | .breakdown.NESTED_QUOTES' drops toward zero on a previously-affected brain

🤖 Generated with Claude Code

garrytan and others added 3 commits May 20, 2026 07:43
The validator at src/core/markdown.ts:219-238 was a syntactic
count-of-quotes heuristic that flagged any frontmatter line with 3+
unescaped " characters. That heuristic is too dumb: valid YAML flow
sequences like `tags: ["yc", "w2025"]` and single-quoted scalars like
`title: 'a: "b" "c"'` both have 3+ unescaped " by design.

Fix: keep the count fast path, then disambiguate with js-yaml.safeLoad
on the value. Only flag lines that genuinely fail to parse. The
full-frontmatter YAML_PARSE check (check 6) still catches structural
failures.

Closes the 6,981-error class on Garry's 105K-page brain in one ~10
LOC change — existing data on disk was already valid YAML; the
validator was wrong about it. No `gbrain frontmatter generate --fix`
sweep needed.

js-yaml@3.14.2 promoted from transitive (via gray-matter) to direct
dependency. @types/js-yaml@3.12.10 added to devDependencies.

5 new YAML-aware test cases in test/markdown-validation.test.ts:
- flow sequence with quoted tags does NOT trigger (6,981 regression guard)
- single-quoted scalar with literal inner double quotes does NOT trigger
- escaped-as-'' quotes inside flow seq do NOT trigger
- genuinely broken nested quotes STILL trigger
- unclosed bracket STILL surfaces NESTED_QUOTES or YAML_PARSE

Closes PR #1217 — outside-voice (codex) review caught that the bug
was the validator, not the emitter. Original 6,981-error signal from
@garrytan-agents.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Same content, different slot in the version queue. v0.40.0.1 was the
queue allocator's default safe slot (bumped past PR #1128's claimed
0.40.0.0). v0.37.5.0 is a PATCH above #1228's claimed 0.37.4.0 and
sits closer to current master (0.37.1.0) in CHANGELOG ordering.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@garrytan garrytan changed the title v0.40.0.1 fix(markdown): YAML-aware NESTED_QUOTES validator (stops flagging valid YAML) v0.37.5.0 fix(markdown): YAML-aware NESTED_QUOTES validator (stops flagging valid YAML) May 21, 2026
garrytan added 2 commits May 20, 2026 19:33
# Conflicts:
#	CHANGELOG.md
#	TODOS.md
#	VERSION
#	package.json
# Conflicts:
#	CHANGELOG.md
#	TODOS.md
#	VERSION
#	package.json
@garrytan garrytan merged commit 1580c6d into master May 21, 2026
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant