store: harden v13→v14 backfill + broaden migration integrity corpus (#113) #135
Open
mbertschler wants to merge 1 commit into
…en migration corpus

12b: refuse the v13→v14 contents seed when two files rows share a blake3 with differing size_bytes. That shape is corruption (or a stat/hash TOCTOU) that the seed would otherwise silently coalesce to the earliest observation's size. A BLAKE3 digest is over the bytes, so this shape is impossible from honest indexing; turning it into a loud pre-migration failure lets the operator recover from the pre-migration snapshot. The existing v2→v3 fixture seeded one hash at two differing sizes for convenience; give the two distinct files distinct hashes so the data is physically valid.

12c: add migration-corpus fixtures for previously untested legacy shapes: same-hash-different-size refusal (with a same-size negative control), an orphaned files→runs FK caught by the v13→v14 foreign_key_check, a duplicate live row caught when the rebuild recreates uniq_files_live_per_path, a populated v16 remote_objects table driven through the v16→v17 rebuild, and status_changed_run_id backfill assertions on the migrated v13 rows.

12d: add an index test pinning hashFile's stat-after-hash contract (size pairs with the hashed bytes) and one showing an append between index runs supersedes to a row whose size matches its hash.

Refs #113
Index/migration integrity hardening (issue #113)
Closes #113
Schema- and migration-level defenses for the append-only guarantee. All four
sub-items of #113 are covered; two had already landed on `main` via earlier
work in this stack, so this PR adds the remaining guard and the missing test
corpus around all four.
12a: `contents` immutability triggers (already on `main`)

The `contents_no_update` / `contents_no_delete` `BEFORE UPDATE` / `BEFORE DELETE`
ABORT triggers (and their fresh-baseline + `schema.sql` snapshot) already
shipped as migration v20→v21 (`contentsImmutableTriggers()`), with
`assertContentsTriggersAbort` asserting both abort. No new migration was
added, since a v24 duplicating v21 would be redundant. `SchemaVersion` stays 23.

12b: v13→v14 backfill consistency guard (NEW)
`createAndSeedContentsV14` now runs `refuseSameHashDifferentSizeV14` before
the contents seed: if any blake3 in the old `files` table carries more than one
`size_bytes`, the migration refuses loudly instead of silently coalescing to
the earliest observation's size. A BLAKE3 digest is over the bytes, so this
shape is only reachable via prior corruption or a stat/hash TOCTOU. The guard
fires only on that genuinely-corrupt shape; valid same-hash-same-size
duplicates still migrate to one `contents` row (negative control test included).

12c: migration test corpus (NEW)
- Orphaned `files`→`runs` FK → caught by the v13→v14 `foreign_key_check`
- Duplicate live row violating `uniq_files_live_per_path` → caught when the rebuild recreates that partial unique index
- Populated v16 `remote_objects` driven through the v16→v17 table rebuild (the v18 fixture seeded the table but started after that rebuild)
- `status_changed_run_id` backfill values asserted on the migrated v13 rows (previously unasserted anywhere)
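
For reviewers who want the 12b rule in isolation, here is a minimal in-memory sketch of the same-hash-different-size check. It is not the code in this PR: `fileRow`, `conflictingHashes`, and `refuseSameHashDifferentSize` are hypothetical names, and the SQL in the comment is only an assumption about how such a check could be phrased against the legacy `files` table.

```go
package main

import (
	"fmt"
	"sort"
)

// fileRow is a hypothetical stand-in for one row of the legacy files table.
type fileRow struct {
	Path      string
	Blake3    string
	SizeBytes int64
}

// conflictingHashes returns, in sorted order, every hash that was observed
// with more than one distinct size. An assumed SQL equivalent would be:
//
//	SELECT blake3 FROM files GROUP BY blake3
//	HAVING COUNT(DISTINCT size_bytes) > 1
func conflictingHashes(rows []fileRow) []string {
	first := make(map[string]int64) // first size seen per hash
	bad := make(map[string]bool)    // hashes with at least two sizes
	for _, r := range rows {
		size, seen := first[r.Blake3]
		if !seen {
			first[r.Blake3] = r.SizeBytes
			continue
		}
		if size != r.SizeBytes {
			bad[r.Blake3] = true
		}
	}
	out := make([]string, 0, len(bad))
	for h := range bad {
		out = append(out, h)
	}
	sort.Strings(out)
	return out
}

// refuseSameHashDifferentSize fails loudly instead of letting a later seed
// step coalesce conflicting sizes into a single row.
func refuseSameHashDifferentSize(rows []fileRow) error {
	bad := conflictingHashes(rows)
	if len(bad) == 0 {
		return nil
	}
	return fmt.Errorf("refusing contents seed: %d hash(es) with differing size_bytes (first: %s)",
		len(bad), bad[0])
}

func main() {
	// Negative control: same hash, same size is a valid duplicate.
	valid := []fileRow{{"a", "h1", 10}, {"b", "h1", 10}, {"c", "h2", 20}}
	// Corrupt shape: h2 appears at two sizes.
	corrupt := []fileRow{{"a", "h1", 10}, {"c", "h2", 20}, {"d", "h2", 21}}

	fmt.Println(refuseSameHashDifferentSize(valid))
	fmt.Println(refuseSameHashDifferentSize(corrupt))
}
```

The first call returns a nil error and the second returns the refusal, which matches the pairing of a refusal fixture with a same-size negative control described in 12b.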
12d: indexer stat-after-hash (already on `main`, test added)

`hashFile` already stats the open handle after hashing (commit "index: pin
row size/mtime to the hashed-handle stat"). Added two index tests: one pinning
the `(digest, size)` consistency contract, one showing an append between index
runs supersedes to a row whose size matches its hash.
Notes / decisions for review
- The 12b guard is added inside the existing v13→v14 step (additive, no version bump, which is the issue's prescribed approach). It only changes behavior for corrupt input; all existing migration tests pass.
- `TestMigrateV2ToV3` seeded one hash at two different sizes for convenience. That is exactly the corrupt shape 12b now rejects, so the two distinct files got distinct hashes. The test's purpose (v2→v3 run synthesis) is unchanged.
- `store/schema.sql` is unchanged and `TestSchemaSnapshot` passes with no drift.
Invariants
Append-only/immutability is strengthened, never weakened: the guard turns a
silent loss-of-size into a loud refusal; no history is deleted or overwritten.
No regression to the #104/#131 work below.
Gates: `go vet ./...`, `go test ./...`, `golangci-lint run` (0 issues), and `TestSchemaSnapshot` are all green locally.