[low] Index/migration integrity: contents immutability trigger, v14 size guard, stat-after-hash #113

Description

@mbertschler

Summary

This issue collects schema-level and migration-level defenses that the append-only guarantee implies but that today either rely on code discipline or leave pathological legacy data unguarded.

12a — contents immutability is code-only post-v14 (LOW, schema defense)

The v14 reshape dropped the files_blake3_immutable trigger and added nothing equivalent for contents. UNIQUE(blake3) prevents duplicate hashes but not an in-place UPDATE of blake3/size_bytes/origin_* by a future bug. Fix: BEFORE UPDATE/BEFORE DELETE ON contents triggers raising ABORT (one additive migration), restoring the schema-level guarantee the README implies. (Auditor B confirmed there is no UPDATE/DELETE path on contents today — this is defense in depth.)
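A minimal sketch of what the additive migration could contain, assuming the index is a SQLite database (the issue's "raising ABORT" matches SQLite's `RAISE(ABORT, ...)`). The function name `immutabilityTriggers` and the trigger names it generates are hypothetical, not taken from the repository.

```go
package main

import "fmt"

// immutabilityTriggers returns the DDL for two triggers that reject any
// UPDATE or DELETE on the given table. The trigger names
// (<table>_no_update, <table>_no_delete) are hypothetical.
// Assumption: SQLite dialect, where RAISE(ABORT, msg) fails the statement
// and undoes the changes that statement made.
func immutabilityTriggers(table string) []string {
	const tmpl = `CREATE TRIGGER IF NOT EXISTS %[1]s_no_%[2]s
BEFORE %[3]s ON %[1]s
BEGIN
    SELECT RAISE(ABORT, '%[1]s rows are immutable');
END;`
	return []string{
		fmt.Sprintf(tmpl, table, "update", "UPDATE"),
		fmt.Sprintf(tmpl, table, "delete", "DELETE"),
	}
}
```

A migration step would execute each statement returned by `immutabilityTriggers("contents")`. Because the triggers only reject writes that the code never issues today, the migration should not change existing behavior.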

12b — v13→v14 contents backfill lacks a consistency guard (LOW)

store/migrations.go — when two v13 rows share a blake3 with different size_bytes (only reachable via prior corruption or the indexer stat/hash TOCTOU), the migration silently coalesces to the earliest observation's size. No row is dropped and no constraint is violated, but the conflict is swallowed. Fix: a one-statement guard before the contents seed (GROUP BY blake3 HAVING COUNT(DISTINCT size_bytes) > 1 → refuse) turns corruption into a loud pre-migration failure; add a same-hash-different-size fixture asserting the refusal.
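The guard described above can be sketched as follows. The query text follows the issue's `GROUP BY blake3 HAVING COUNT(DISTINCT size_bytes) > 1` shape; the v13 table name `files` is an assumption (inferred from the `files_blake3_immutable` trigger name), and `legacyRow` and `conflictingHashes` are hypothetical helpers that model the same check in memory so a fixture test can assert the expected refusal.

```go
package main

import "sort"

// sizeConflictQuery is the one-statement guard to run before the contents
// seed. Assumption: the v13 rows live in a table named `files`.
const sizeConflictQuery = `
SELECT blake3
FROM files
GROUP BY blake3
HAVING COUNT(DISTINCT size_bytes) > 1`

// legacyRow is a hypothetical, reduced view of a v13 row.
type legacyRow struct {
	Blake3    string
	SizeBytes int64
}

// conflictingHashes mirrors sizeConflictQuery in memory: it returns, in
// sorted order, every hash that appears with more than one distinct size.
// A non-empty result means the migration should refuse to proceed.
func conflictingHashes(rows []legacyRow) []string {
	sizes := make(map[string]map[int64]struct{})
	for _, r := range rows {
		if sizes[r.Blake3] == nil {
			sizes[r.Blake3] = make(map[int64]struct{})
		}
		sizes[r.Blake3][r.SizeBytes] = struct{}{}
	}
	var out []string
	for h, s := range sizes {
		if len(s) > 1 {
			out = append(out, h)
		}
	}
	sort.Strings(out)
	return out
}
```

In the real migration the guard would run `sizeConflictQuery` inside the migration transaction and return an error naming the offending hashes if any row comes back, so the failure happens before any v14 table is written.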

12c — Migration test corpus gaps (LOW)

The migration tests don't exercise: same-hash-different-size v13 rows (12b); a v17 rebuild with populated remote_objects; status_changed_run_id backfill values on legacy rows (no assertions anywhere); orphaned-FK and duplicate-live-row shapes (handled by loud rollback, but untested). Add fixtures so these legacy shapes are covered.

12d — Indexer stat-after-hash (see also the upload-integrity issue)

Build the contents row from a Stat of the open handle after hashing, so a file appended-to mid-index cannot mint an immutable row whose size disagrees with its hash. (Cross-referenced from the content-addressed upload-integrity issue; fixing it here closes the migration-corruption source.)

Source: adversarial audit of offload-v1 (auditor B: MEDIUM-7, LOW-10, and the untested-shapes list).

Labels: bug
