Skip to content

fix(serializer): preserve canonical bullet and rule markers on round-trip#48

Open
ngmeyer wants to merge 1 commit into
EveryInc:mainfrom
ngmeyer:fix/markdown-serializer-fidelity
Open

fix(serializer): preserve canonical bullet and rule markers on round-trip#48
ngmeyer wants to merge 1 commit into
EveryInc:mainfrom
ngmeyer:fix/markdown-serializer-fidelity

Conversation

@ngmeyer

@ngmeyer ngmeyer commented May 17, 2026

Copy link
Copy Markdown

Summary

The headless and browser markdown serializers both call remark-stringify with only a handlers option, accepting all defaults. Those defaults emit * for bullets and *** for thematic breaks. Any document whose source uses the - / --- convention — the dominant style in CommonMark and GFM ecosystems (prettier, markdownlint, GitHub) — drifts cosmetically on every parse → serialize cycle.

For HITL workflows where Proof is the review surface and a local markdown file is the source of truth, this means every sync rewrites the canonical style across the entire doc, burying real edits in noise and breaking any CI markdown linter on the post-sync output.

Change

Two config sites, identical fix:

  • server/milkdown-headless.ts — passed to remark-stringify in createSerializer(). This is the path that produces the markdown returned from GET /api/agent/:slug/state.
  • src/editor/index.ts — applied via remarkStringifyOptionsCtx on the browser editor, so client-side serialization (e.g., for outgoing edits / snapshots) matches.
-    .use(remarkStringify, {
+    .use(remarkStringify, {
+      bullet: '-',
+      rule: '-',
       handlers: { proofMark: proofMarkHandler },
     });

Test coverage

  • New: src/tests/milkdown-headless-serializer-style-fidelity.test.ts asserts that - bullets and --- thematic breaks survive serialize(parse(source)), and verifies the existing stability property (serialize(parse(x)) === serialize(parse(serialize(parse(x))))) is preserved.
  • Existing milkdown-headless-serializer-roundtrip.test.ts still passes — no regression in stability.
  • Full npm test suite (74 tests including share-route, agent presence, comment/suggestion/rewrite ops, rate limiting) passes with zero failures.

How to verify locally

npx tsx src/tests/milkdown-headless-serializer-style-fidelity.test.ts
npx tsx src/tests/milkdown-headless-serializer-roundtrip.test.ts
npm test

Or end-to-end against a running server:

npm run serve &
SLUG=$(curl -s -X POST http://localhost:4000/share/markdown \
  -H 'Content-Type: application/json' \
  -d '{"title":"t","markdown":"- a\n- b\n\n---\n\n- c\n"}' | jq -r .slug)
TOKEN=$(curl -s -X POST http://localhost:4000/share/markdown \
  -H 'Content-Type: application/json' \
  -d '{"title":"t","markdown":"- a"}' | jq -r .accessToken)
# Read back — bullets should be '-' not '*', rule should be '---' not '***'.
curl -s "http://localhost:4000/api/agent/$SLUG/state" \
  -H "Authorization: Bearer $TOKEN" | jq -r .markdown

I verified this end-to-end on my fork before submitting: canonical - / --- source survived both pure upload→read and upload→suggestion.add status:"accepted"→read.

Known limitation (out of scope, separate PR)

Tight lists round-trip as loose — input - a\n- b\n returns as - a\n\n- b\n. The mechanism is different: it's the mdast spread property on list / listItem nodes being set during ProseMirror ↔ AST conversion, not a remark-stringify option. Fixing it would require touching the Milkdown transformer / schema's toMarkdown paths. I think it's worth a separate PR rather than expanding scope here — happy to take a swing at it next if there's interest.

Why now

I'm building a small Projects-layer skill on top of ce-proof for managing collections of Proof docs across multiple repos, and the HITL round-trip dogfood test I ran on a real plan surfaced this drift. The fix is small and clean enough that it felt worth contributing back rather than carrying as a fork patch.

…trip

Without explicit options, remark-stringify defaults to `*` for bullets
and `***` for thematic breaks, producing cosmetic markdown drift on
every parse -> serialize cycle. This breaks workflows that round-trip
real documents (e.g., HITL review where the local file is the source
of truth and Proof is the review surface) -- every sync rewrites
canonical style across the doc, burying real edits in noise.

Set { bullet: '-', rule: '-' } on both serializer sites:
- server/milkdown-headless.ts (headless serializer used by /state)
- src/editor/index.ts (browser editor)

Add a fidelity test alongside the existing roundtrip parity test
to assert the property and guard against regression.

Known limitation, not addressed here: tight lists round-trip as loose
(blank lines inserted between items). That's an mdast `spread` property
issue in the ProseMirror <-> AST conversion, not a remark-stringify
option. Separate PR.
maphilipps added a commit to maphilipps/proof-sdk that referenced this pull request May 24, 2026
- isAllowedImageType: MIME-Whitelist PNG/JPEG/WEBP (kein SVG, XSS-Risiko)
- buildAssetSrc: normalisiert Asset-Pfad fuer AC-konforme Markdown-Speicherung
- classifyUploadError: HTTP-Status → typisierter Fehler (415/413/401/Unknown)
- createAssetUploader: Milkdown Uploader-Factory fuer @milkdown/plugin-upload
  Ruft POST /documents/{slug}/assets/upload mit Raw-Bytes auf
  Gibt image-Nodes mit relativem assets/{uuid}.ext zurueck
- adproof-plugins.ts: upload-Plugin in adProofAffordancePlugins eingebunden
  createAdProofAffordanceConfig akzeptiert { slug, getToken } fuer Upload-Wiring
- package.json: @milkdown/plugin-upload ^7.21.1 explizit als Dependency

Closes: EveryInc#48 (pure helpers + plugin wiring; Drag-Drop/Slash-UI via EveryInc#50 HITL)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant