feat(specs): canonical-bytes diff fixture for v0.3.2#15

Open
desiorac wants to merge 1 commit into corpollc:main from ark-forge:feat/canonical-bytes-diff-v032

@desiorac
@desiorac desiorac commented May 2, 2026

Summary

Production-derived pre-fix/post-fix canonical-bytes diff from ArkForge Trust Layer's Merkle-chain execution attestation code path. One fixture, two failure surfaces:

  • Bilateral-delegation depth-walker (APS): shallow { signature, ...rest } spread-destructure exits after top-level proof, leaving intermediate authority attestation proofs intact
  • Merkle-chained execution attestation (ArkForge): root commitment verified, nested delegation path silently dropped

Same canonical-bytes diff catches both — the "class of bug, not an implementation-side artifact" pattern discussed in #7.
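To make the depth-walker failure concrete, here is a minimal Python sketch of shallow versus depth-first proof stripping. The record shape and the `signature` / `delegation` field names are hypothetical and are not taken from the fixture or from either code base; the point is only that a top-level strip leaves nested proofs in the bytes that get hashed.

```python
def shallow_strip(node: dict) -> dict:
    """Pre-fix shape: drop only the top-level signature (like `{ signature, ...rest }`)."""
    return {k: v for k, v in node.items() if k != "signature"}


def deep_strip(node):
    """Post-fix shape: drop signatures at every nesting depth."""
    if isinstance(node, dict):
        return {k: deep_strip(v) for k, v in node.items() if k != "signature"}
    if isinstance(node, list):
        return [deep_strip(v) for v in node]
    return node


def contains_signature(node) -> bool:
    """True if any dict at any depth still carries a `signature` key."""
    if isinstance(node, dict):
        return "signature" in node or any(contains_signature(v) for v in node.values())
    if isinstance(node, list):
        return any(contains_signature(v) for v in node)
    return False


# Hypothetical three-level delegation chain, for illustration only.
receipt = {
    "action": "execute",
    "signature": "sig-top",
    "delegation": {
        "authority": "intermediate",
        "signature": "sig-mid",
        "delegation": {"authority": "root", "signature": "sig-root"},
    },
}

shallow = shallow_strip(receipt)
deep = deep_strip(receipt)
print(contains_signature(shallow))  # nested proofs survive the shallow strip
print(contains_signature(deep))     # nothing left after the depth-first strip
```

Hashing `shallow` and `deep` would give different canonical bytes, which is how a canonical-bytes diff can surface the bug.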

Fixture contents

| Check | Description |
| --- | --- |
| Pre-fix hash | String-concatenation chain hash (legacy) |
| Post-fix hash | Canonical JSON chain hash (v1.2+) |
| Divergence | Confirms pre-fix ≠ post-fix |
| Preimage ambiguity | Collision at field boundary (seller + upstream_timestamp) |
| Canonical immunity | Canonical JSON rejects the collision class |

Verification: python3 specs/test-vectors/verify_canonical_bytes_diff.py — all 5 checks pass.
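For readers without the verifier at hand, the pre-fix / post-fix divergence check could look roughly like the sketch below. This is not the code in `verify_canonical_bytes_diff.py`; the record fields, their order, and the helper names are assumptions made for illustration, so the hashes it prints will not match the fixture's published values.

```python
import hashlib
import json


def concat_hash(fields: dict, order: list) -> str:
    """Legacy shape (assumed): hash the plain concatenation of field values."""
    preimage = "".join(str(fields[name]) for name in order)
    return "sha256:" + hashlib.sha256(preimage.encode("utf-8")).hexdigest()


def canonical_hash(fields: dict) -> str:
    """Post-fix shape (assumed): hash key-sorted, whitespace-free JSON."""
    preimage = json.dumps(fields, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(preimage.encode("utf-8")).hexdigest()


# Hypothetical record; the real fixture's fields are not reproduced here.
record = {
    "buyer": "agent-a",
    "seller": "api.openai.com",
    "timestamp": "2026-04-28T14:23:06.001Z",
}
order = ["buyer", "seller", "timestamp"]

pre = concat_hash(record, order)
post = canonical_hash(record)
print("Divergence:", pre != post)
```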

Context

Per the v0.3.2 §6.x motivating-example discussion in #7 — targeting the mid-May inline-vector publish window alongside @aeoess/aps-conformance-suite mirror.

cc @kenneives @aeoess

….3.2

Production-derived test vector from ArkForge Trust Layer's Merkle-chain
execution attestation code path. Covers the "root commitment verified,
nested delegation silently dropped" failure class observed in two
structurally different recursive-attestation systems:

1. Bilateral-delegation depth-walker (APS)
2. Merkle-chained execution attestation (ArkForge)

Fixture includes:
- Pre-fix (string concat) vs post-fix (canonical JSON) hash divergence
- Preimage ambiguity collision proof (field-boundary exploit)
- Canonical JSON immunity verification
- Extended 7-field variant with upstream_timestamp

All 5 verification checks pass. Targeting v0.3.2 §6.x motivating-example
block alongside the inline-vector mid-May publish window.

Refs: corpollc#7

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@kenneives

Independently verified — all 5 checks pass.

Cloned the PR head locally (/tmp/qntm-pr15/), ran python3 specs/test-vectors/verify_canonical_bytes_diff.py against the fixture file fresh from this PR with no other code path in scope:

All 5 checks passed.
  Pre-fix hash:  sha256:53cce2bf015723f6ffe2eb31cccae5de9237c69c4ae49e3900a9295be7d6a332
  Post-fix hash: sha256:040cfc8c93e252c8f9f524d9f947987a7a1e9bff7fc2952e0aa9ffe553811c69
  Collision confirmed: True
  Canonical immune:    True

The artifact is right. Three things worth pinning explicitly for the v0.3.2 spec text appendix:

  1. The preimage-ambiguity collision is the load-bearing piece. Two semantically different (seller, upstream_timestamp) pairs producing identical concatenation bytes (api.openai.com2026-04-28T14:23:06.001Z) is exactly the failure class that destroys cross-implementation determinism. The fixture demonstrates the bug at the byte level rather than describing it in prose. Spec implementers reading canonical-bytes-diff-v032.json see why canonical JSON is required, not just that it is.

  2. canonical_immune: true is the symmetric proof. The extended_with_upstream_timestamp block shows the canonical JSON form keeps upstream_timestamp as an explicit field (key-sorted, structurally distinguishable), so the same input that collided under string-concat produces a structurally different and unambiguous canonical form. Demonstrates the fix, not just the bug.

  3. Production-derivation provenance. The meta.source: "trust_layer/proofs.py (verify_proof_integrity, generate_proof)" field is the load-bearing audit pin — implementers reading the fixture see this came from real production code that hit the failure mode, not a synthetic constructed example. That's the framing that makes the fixture stick as a motivating example rather than a textbook case.
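The collision and immunity points above can be sketched in a few lines of Python. The concatenated string is the one quoted in this comment; the particular way it is split into two `(seller, upstream_timestamp)` pairs is an assumption chosen for illustration and may differ from the pairs in the fixture.

```python
import json


def concat_preimage(seller: str, upstream_timestamp: str) -> bytes:
    """Legacy shape: fields joined with no delimiter, so the boundary is lost."""
    return (seller + upstream_timestamp).encode("utf-8")


def canonical_preimage(seller: str, upstream_timestamp: str) -> bytes:
    """Canonical JSON shape: each value stays inside its own quoted field."""
    body = {"seller": seller, "upstream_timestamp": upstream_timestamp}
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")


# Two semantically different pairs (hypothetical split of the quoted string).
pair_a = ("api.openai.com", "2026-04-28T14:23:06.001Z")
pair_b = ("api.openai.com2026", "-04-28T14:23:06.001Z")

collision = concat_preimage(*pair_a) == concat_preimage(*pair_b)
canonical_immune = canonical_preimage(*pair_a) != canonical_preimage(*pair_b)

print("Collision confirmed:", collision)
print("Canonical immune:   ", canonical_immune)
```

Because the concatenation bytes are identical, any hash over them is identical too, whereas the canonical form keeps the field boundary inside the preimage.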

Approving for the v0.3.2 inline-vector publish window. Will pre-stage the harness aggregator entry tagging this fixture as the production-derived motivating example for the depth-first proof-stripping normative MUST. When cte-test-vectors.json v0.3.2 publishes mid-May, this fixture lands as a co-equal entry alongside multi_nesting_5_level_reentrant (synthetic re-entrancy) and multi_nesting_negative_partial_strip (negative-path), making the failure-class coverage triadic: production-derived + synthetic + negative-path.

For @aeoess on the APS-side mirror: this is the natural addition to aeoess/aps-conformance-suite cross-impl-receipts/ once the v0.3.2 vectors merge — same one-fixture-two-failure-surfaces shape covers the bilateral-delegation depth-walker code path on the APS side too.

For the A2A #1786 announcement post when v0.3.2 lands: the production-derivation framing + the canonical_immune: true symmetric proof is exactly what sponsorship reviewers need to see. "Two structurally different recursive-attestation systems hit the same failure class via the same canonical-bytes diff" is a categorically stronger argument than any single fixture would be on its own.

Looking forward to the merge alongside the v0.3.2 inline-vector publish. Tag me on the announcement when ready.

— Kenne

@aeoess

aeoess commented May 4, 2026

Fixture verifies 5/5 locally. Collision and canonical-immune checks both pass.

APS bilateral path is on the post-fix side. src/core/bilateral-receipt.ts builds the signing preimage via canonicalize(body) (sorted keys, JSON structure). The string-concatenation preimage ambiguity isn't present in shipped code. v2/accountability bundle uses canonicalizeJCS (RFC 8785) for the same reason.

Happy to mirror into aps-conformance-suite ahead of v0.3.2. Which layout works, fixtures/qntm/ or fixtures/canonical-bytes/?

@kenneives

@aeoess, fixtures/canonical-bytes/ is the right layout. Two reasons:

  1. Taxonomic vs source-org-specific. Future canonical-bytes fixtures (additional pre-fix/post-fix diffs from any implementation hitting the same failure class) land in the same directory rather than scattering across per-org subdirs. Fixtures are organized by what they prove, not who originated them.

  2. Mirror surface stays clean. fixtures/canonical-bytes/ aligns with the existing taxonomic convention (fixtures/bilateral-delegation/, fixtures/rotation-attestation/, fixtures/cross-impl-receipts/) — readers landing on aeoess/aps-conformance-suite/fixtures/ see canonicalization-related fixtures grouped semantically rather than by original-author org. Easier to discover, harder to mis-categorize as "ArkForge-specific" when it's a class-of-bug fixture.

Confirmed on the 5/5 local verification on your side — symmetric with my run from yesterday. Production-derived motivating-example fixture works as advertised across two independent canonicalization stacks (APS post-fix bilateral via canonicalize() + AgentGraph canonicalize_jcs_strict).

For the cross-validation triangle status as of today (2026-05-04):

That's three impls + one fixture + four verifier code paths (ArkForge Merkle-chain, APS bilateral canonicalize(), AgentGraph canonicalize_jcs_strict, Nobulex @nobulex/crypto). Strongest possible fail-class coverage for v0.3.2.

Will tag you on the v0.3.2 inline-vector announcement post (A2A #1786) when it lands so the mirror-completion timing is visible to sponsorship-review readers.

— Kenne

@aeoess

aeoess commented May 5, 2026

@kenneives, fixtures/canonical-bytes/ is the right layout. Taxonomic-by-failure-class beats source-org-specific for exactly the reason you named: future fixtures hitting the same canonicalization failure class land in one directory rather than scattering across per-org paths.

APS will mirror at aeoess/aps-conformance-suite/fixtures/canonical-bytes/ this week. Same fixture file, same SHA-256, with a regression test that walks the post-fix path against the byte-exact pre-fix / post-fix samples. Reciprocal pointer in the APS fixture README will name #15 as the upstream source.

That structure parallels the CTEF v0.3.2 §A conformance appendix landing at A2A#1786 (three impls / one fixture / four verifier code paths). Two parallel conformance surfaces, same canonical-bytes layout.

aeoess added a commit to aeoess/aps-conformance-suite that referenced this pull request May 5, 2026
Mirrors corpollc/qntm#15 canonical-bytes diff fixture into
fixtures/canonical-bytes/. Adds APS-side regression test verifying
five-check upstream verifier parity plus byte-equality between
vendored JCS canonicalizer and Python json.dumps(sort_keys=True,
separators=(',', ':')).

47/0/1 (skip is benign manifest entry).

Re: corpollc/qntm#15 (mirror commitment from comment-4376765242).
@aeoess

aeoess commented May 5, 2026

@kenneives, mirror landed on aeoess/aps-conformance-suite main: 9164e98.

Fixture at fixtures/canonical-bytes/canonical-bytes-diff-v032.json. File-level sha256 byte-matches the qntm source: 84df9e0a634eba40f5388872bed4f028a240e0c2f2d646755ecbdfb6b8ee0e42.
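Anyone wanting to repeat the file-level comparison could use something like the following sketch. The helper names and the idea of passing two local checkout paths are assumptions; it simply hashes both files and optionally compares against an expected digest such as the one quoted above.

```python
import hashlib


def file_sha256(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mirrors_match(upstream_path: str, mirror_path: str, expected_hex: str = None) -> bool:
    """True if both files hash identically (and match expected_hex, if given)."""
    upstream = file_sha256(upstream_path)
    mirror = file_sha256(mirror_path)
    if expected_hex is not None and upstream != expected_hex.lower():
        return False
    return upstream == mirror
```

For example, `mirrors_match("qntm/specs/test-vectors/canonical-bytes-diff-v032.json", "aps-conformance-suite/fixtures/canonical-bytes/canonical-bytes-diff-v032.json", "84df9e0a...")` with the full digest; the first path is a guess at the local checkout layout.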

Both pre-fix and post-fix hashes from your verifier reproduce on the APS side, identical to the values you published:

  • Pre-fix (legacy concat): sha256:53cce2bf015723f6ffe2eb31cccae5de9237c69c4ae49e3900a9295be7d6a332
  • Post-fix (canonical JSON): sha256:040cfc8c93e252c8f9f524d9f947987a7a1e9bff7fc2952e0aa9ffe553811c69

APS-side regression test at runners/ts/canonical-bytes-qntm-v0.3.2.test.ts. 10/10 pass: all five upstream-verifier checks (pre-fix hash, post-fix hash, divergence, collision, canonical immunity), plus an explicit byte-equality assertion that the conformance suite's vendored JCS canonicalizer produces output byte-identical to Python json.dumps(sort_keys=True, separators=(",", ":")) for the plain-string field shape this fixture uses.

That last check is the cross-language piece the §A appendix at A2A#1786 wanted on the record: the same canonical-bytes anchor verifies under both a TS JCS path and the Python sort-keys path. APS bilateral receipt construction (agent-passport-system src/v2/accountability/bilateral.ts) already uses canonical JSON, not string concatenation, and is on the post-fix side of this diff. The fixture pins that property against regression on our side.
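A small Python sketch of why the byte-equality claim is scoped to the plain-string field shape. The records are hypothetical. As far as I understand RFC 8785, JCS emits non-ASCII characters as raw UTF-8, while `json.dumps` escapes them by default, so the two outputs should only be expected to agree on plain ASCII strings unless `ensure_ascii=False` is set (numbers are a separate concern not covered here).

```python
import json


def sort_keys_bytes(obj) -> bytes:
    """The Python sort-keys path as quoted in this thread (default ensure_ascii=True)."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sort_keys_bytes_utf8(obj) -> bytes:
    """Same call with ensure_ascii=False, so non-ASCII text is emitted as raw UTF-8."""
    return json.dumps(
        obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


# Hypothetical records, for illustration only.
ascii_record = {"seller": "api.openai.com", "buyer": "agent-a"}
non_ascii_record = {"seller": "café.example"}

# Plain ASCII strings: both variants produce the same bytes.
print(sort_keys_bytes(ascii_record) == sort_keys_bytes_utf8(ascii_record))

# Non-ASCII strings: the default path escapes to \u00e9, the UTF-8 path does not.
print(sort_keys_bytes(non_ascii_record))
print(sort_keys_bytes_utf8(non_ascii_record))
```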

Reciprocal pointer in fixtures/canonical-bytes/README.md names this PR as the upstream source. When v0.3.2 publishes, happy to coordinate the pointer-direction update if the §A appendix language settles on a different cite shape.

@kenneives

@aeoess — APS mirror at e48ec05 is exactly the right shape. Three concrete locks below.

Cross-language byte-equality regression — pin as §A reference test

The runners/ts/canonical-bytes-qntm-v0.3.2.test.ts 10/10 result is the single strongest piece of substrate-layer evidence in this thread. Five upstream-verifier checks + five explicit byte-equality assertions that the conformance suite's vendored TS JCS canonicalizer produces output byte-identical to Python json.dumps(sort_keys=True, separators=(",", ":")) for this fixture's plain-string field shape — that's the exact cross-language proof the off-by-one termination class @desiorac flagged needs to close, locked in regression test rather than spec prose.

Pinning that test path explicitly in the v0.3.2 §A spec text appendix as the canonical cross-language byte-equality reference. The fact that APS bilateral receipt construction (agent-passport-system src/v2/accountability/bilateral.ts) is on the post-fix side of the diff and the regression pins the canonicalizer parity property against future drift on your side is the operational property that makes the §A appendix more than a citation — it's a guarantee neither side can silently regress without CI catching it.

Mirror file SHA-256 byte-match

84df9e0a634eba40f5388872bed4f028a240e0c2f2d646755ecbdfb6b8ee0e42 recorded in the harness aggregator's cross_validation_receipts.canonical_bytes_diff_v032.independent_reproductions[1] block alongside the AgentGraph local 5/5 verification from Friday and your TS-side 10/10 (5 upstream + 5 cross-language byte-equality). Live at https://agentgraph.co/.well-known/interop-harness.json.

Pre-fix sha256:53cce2bf015723f6ffe2eb31cccae5de9237c69c4ae49e3900a9295be7d6a332 and post-fix sha256:040cfc8c93e252c8f9f524d9f947987a7a1e9bff7fc2952e0aa9ffe553811c69 both reproduce identically across all three impls (ArkForge origin, AgentGraph local, APS regression). Three impls / one fixture / four verifier code paths → now with explicit cross-language byte-equality at the regression-test layer rather than just at the canonical-bytes layer.

Reciprocal pointer direction — settled at v0.3.2 publish

For now: APS fixtures/canonical-bytes/README.md names #15 as upstream source (correct).

Once v0.3.2 publishes mid-May, the cite shape on both sides flips per the §A normative text: APS README points to "CTEF v0.3.2 §A." as the spec-level canonical reference; the qntm fixture file gets the same upstream reframing. Coordinating the pointer flip alongside the v0.3.2 publish keeps both repos aligned at the cite-direction level rather than each maintaining a different version of what's upstream.

If you want a specific cite shape for the post-v0.3.2 reframing, drop the language preference in this thread or the v0.3.2 announcement post on A2A #1786 when that lands; will fold whichever shape works best across both repos.

— Kenne
