test(tethys): cascade-correctness regression fences (rivets-wsix) #75
dwalleck wants to merge 5 commits
Audit-trail commit for rivets-wsix. Following the gilfoyle workflow, the
diagnostic dir holds the probe scripts, oracle, design, and plan that
preceded the regression-fence work landing in subsequent slices.
Key finding from prove-it-prototype: ZERO new bugs. The schema's
ON DELETE CASCADE chain (refs.in_symbol_id, attributes.symbol_id,
arch_*.parent FKs) plus per-file `DELETE FROM symbols WHERE file_id`
quietly handles re-index correctness for most tables — not the
`clear_all_X` pattern that lcb6 established. wsix's mental model
("UPSERT-only ⇒ needs clear_all_X") was a special case, not the
general principle. See `.rivets-wsix/what-i-learned.md` for the
per-table inventory.
Empirical probes (sqlite3 against a real tempdir workspace) contradicted
my initial code-reading hypothesis that `refs` would accumulate without
an explicit clear. The cascade via in_symbol_id catches it. Documented
in `.rivets-wsix/probe_refs_bug.sh` and `.rivets-wsix/probe_null_in_symbol.sh`.
Subsequent slices ship three integration tests that lock in the audited
cascade-correctness (claims 1-3 in `.rivets-wsix/design.md`) so future
schema changes can't silently regress the bug class wsix was looking for.
Also includes the `.rivets/issues.jsonl` status update marking wsix
in_progress.
Lock in design claim C1: removing a function-body call from a file's source produces exactly the expected row removals in `refs` after re-index, via the `refs.in_symbol_id REFERENCES symbols(id) ON DELETE CASCADE` chain triggered by the per-file `DELETE FROM symbols WHERE file_id` in `files.rs::upsert_file`.

The wsix audit (see `.rivets-wsix/what-i-learned.md`) found that this cascade chain is the actual safety mechanism for refs re-index correctness, not the `clear_all_X` pattern lcb6 established. This test pins that mechanism.

Stress fixture: starting `entry()` calls `helper::a()`, `helper::b()`, and `helper::c()`. Mutation removes the MIDDLE call. Defeats a hypothetical head-only / tail-only cascade bug: the assertions check specific surviving/removed refs by name, not just total count.

TDD-inversion: commenting out files.rs:145 `DELETE FROM symbols WHERE file_id` produces refs_post=5 instead of 2 (3 stale + 2 new). The test's `assert_eq!(refs_post, 2)` fails with descriptive output. Confirms the fence is non-vacuous per the v1.0.3 falsifier-non-vacuity self-review rule.

Per-gate verification:
- `cargo nextest run -p tethys --test reindex_cascade` passes (~110ms)
- `cargo clippy -p tethys --tests --all-features -- -D warnings` clean
- `cargo fmt --check` clean
- TDD-inversion fails the test as predicted
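The cascade described above can be reproduced outside tethys. What follows is a minimal sketch, not the Rust test itself: it uses Python's stdlib `sqlite3` in the spirit of the audit's sqlite3 probes, and it assumes a heavily simplified two-table schema. The file ids, the `make_db` / `index_lib` / `count_refs_to` / `run_scenario` names, and the column set are all hypothetical; only `refs.in_symbol_id ... ON DELETE CASCADE` and the per-file `DELETE FROM symbols WHERE file_id` come from the commit message.

```python
import sqlite3

LIB_FILE, HELPER_FILE = 1, 2  # hypothetical file ids


def make_db():
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys (and therefore cascades) when this is on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE symbols (
            id INTEGER PRIMARY KEY,
            file_id INTEGER NOT NULL,
            name TEXT NOT NULL
        );
        CREATE TABLE refs (
            id INTEGER PRIMARY KEY,
            symbol_id INTEGER NOT NULL,  -- callee (simplified: no FK in this sketch)
            in_symbol_id INTEGER REFERENCES symbols(id) ON DELETE CASCADE  -- caller
        );
    """)
    # The helper file defines a, b, c and is never re-indexed in this sketch.
    conn.executemany(
        "INSERT INTO symbols (file_id, name) VALUES (?, ?)",
        [(HELPER_FILE, n) for n in ("a", "b", "c")],
    )
    return conn


def index_lib(conn, calls, delete_first):
    """Index the lib file, whose only symbol `entry` calls each name in `calls`."""
    if delete_first:
        # The per-file delete; the cascade removes refs made inside old symbols.
        conn.execute("DELETE FROM symbols WHERE file_id = ?", (LIB_FILE,))
    entry_id = conn.execute(
        "INSERT INTO symbols (file_id, name) VALUES (?, 'entry')", (LIB_FILE,)
    ).lastrowid
    for callee in calls:
        (callee_id,) = conn.execute(
            "SELECT id FROM symbols WHERE file_id = ? AND name = ?",
            (HELPER_FILE, callee),
        ).fetchone()
        conn.execute(
            "INSERT INTO refs (symbol_id, in_symbol_id) VALUES (?, ?)",
            (callee_id, entry_id),
        )


def count_refs_to(conn, names):
    """Count refs whose callee is one of `names` (joins on the callee column)."""
    placeholders = ",".join("?" for _ in names)
    return conn.execute(
        "SELECT COUNT(*) FROM refs r JOIN symbols s ON s.id = r.symbol_id "
        f"WHERE s.name IN ({placeholders})",
        list(names),
    ).fetchone()[0]


def run_scenario(delete_first):
    conn = make_db()
    index_lib(conn, ["a", "b", "c"], delete_first)  # initial source
    index_lib(conn, ["a", "c"], delete_first)       # middle call removed
    return {
        "total": conn.execute("SELECT COUNT(*) FROM refs").fetchone()[0],
        "a": count_refs_to(conn, ["a"]),
        "b": count_refs_to(conn, ["b"]),
        "c": count_refs_to(conn, ["c"]),
    }


print(run_scenario(True))   # {'total': 2, 'a': 1, 'b': 0, 'c': 1}
print(run_scenario(False))  # {'total': 5, 'a': 2, 'b': 1, 'c': 2}
```

With the per-file delete skipped, the sketch reproduces the "3 stale + 2 new" count from the TDD-inversion. The per-name counts show why checking names matters: a total alone would not reveal which ref went missing.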
Lock in design claim C2: removing an attributed symbol from source cascade-deletes both the symbol AND its `attributes` rows via `attributes.symbol_id REFERENCES symbols(id) ON DELETE CASCADE`.

Stress fixture: two attributed functions `target` and `keep`. Remove `target` from source. Defeats the "cascade too aggressive" bug class: if a regression cascade-cleared all of the file's attributes instead of just the removed symbol's, the `keep_attrs_post == keep_attrs_pre` assertion would catch it.

TDD-inversion: relaxing the cascade FK to `ON DELETE NO ACTION` in schema.rs causes the symbol DELETE to fail entirely (FK constraint violation), and the test's `target_sym_post == 0` assertion fires. The fence catches the schema drift at one assertion level or another.

Also adds two small helper fns (`count_attrs_for_symbol`, `count_symbols_by_name`) to keep individual test functions under clippy's 100-line limit.

Per-gate verification:
- `cargo nextest run -p tethys --test reindex_cascade` 2/2 pass
- `cargo clippy -p tethys --tests --all-features -- -D warnings` clean
- `cargo fmt --check` clean
- TDD-inversion (cascade FK relaxed) fails the test as predicted
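A small sketch of both outcomes described above: the cascade removing `target`'s attribute rows, and the `ON DELETE NO ACTION` inversion blocking the delete outright. It is written with Python's stdlib `sqlite3` under an assumed, simplified schema. The helper names `count_symbols_by_name` and `count_attrs_for_symbol` are borrowed from the commit message, but their bodies here are hypothetical, as are `FILE_ID`, `make_db`, `index_file`, and `reindex_without_target`.

```python
import sqlite3

FILE_ID = 1  # hypothetical id of the single file being re-indexed


def make_db(on_delete):
    """Build the simplified schema with the given ON DELETE action on the FK."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # required for FK enforcement
    conn.executescript(f"""
        CREATE TABLE symbols (
            id INTEGER PRIMARY KEY,
            file_id INTEGER NOT NULL,
            name TEXT NOT NULL
        );
        CREATE TABLE attributes (
            id INTEGER PRIMARY KEY,
            symbol_id INTEGER NOT NULL
                REFERENCES symbols(id) ON DELETE {on_delete},
            name TEXT NOT NULL
        );
    """)
    return conn


def index_file(conn, fn_names):
    """Per-file delete, then re-insert each function with one attribute."""
    conn.execute("DELETE FROM symbols WHERE file_id = ?", (FILE_ID,))
    for name in fn_names:
        sym_id = conn.execute(
            "INSERT INTO symbols (file_id, name) VALUES (?, ?)", (FILE_ID, name)
        ).lastrowid
        conn.execute(
            "INSERT INTO attributes (symbol_id, name) VALUES (?, 'test')",
            (sym_id,),
        )


def count_symbols_by_name(conn, name):
    return conn.execute(
        "SELECT COUNT(*) FROM symbols WHERE name = ?", (name,)
    ).fetchone()[0]


def count_attrs_for_symbol(conn, name):
    return conn.execute(
        "SELECT COUNT(*) FROM attributes a "
        "JOIN symbols s ON s.id = a.symbol_id WHERE s.name = ?",
        (name,),
    ).fetchone()[0]


def reindex_without_target(on_delete):
    conn = make_db(on_delete)
    index_file(conn, ["target", "keep"])
    keep_attrs_pre = count_attrs_for_symbol(conn, "keep")
    try:
        index_file(conn, ["keep"])  # `target` removed from source
        delete_failed = False
    except sqlite3.IntegrityError:
        # With NO ACTION the delete violates the FK and nothing is removed.
        delete_failed = True
    return {
        "target_sym_post": count_symbols_by_name(conn, "target"),
        "target_attrs_post": count_attrs_for_symbol(conn, "target"),
        "keep_attrs_pre": keep_attrs_pre,
        "keep_attrs_post": count_attrs_for_symbol(conn, "keep"),
        "delete_failed": delete_failed,
    }


print(reindex_without_target("CASCADE"))
print(reindex_without_target("NO ACTION"))
```

Under `CASCADE` the sketch leaves zero `target` rows and an unchanged attribute count for `keep`. Under `NO ACTION` the delete raises and `target` survives, which is the condition the `target_sym_post == 0` assertion is said to trip on.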
…(rivets-wsix slice 3)

Lock in design claim C3: running `Tethys::index()` twice on an unchanged workspace produces identical row counts in `call_edges` and `file_deps`, AND a stable `SUM(file_deps.ref_count)`. Catches regression of the `clear_all_X` discipline established by lcb6 (file_deps) and its sibling call_edges precedent.

The `SUM(file_deps.ref_count)` assertion defeats a specific bug class the row-count check alone misses: if `clear_all_file_deps` were removed, the same dep would be detected on each run, but the `ON CONFLICT DO UPDATE SET ref_count = ref_count + 1` clause would silently increment the aggregate. Row count stays equal; ref_count sum doubles. This is exactly the original lcb6 bug.

TDD-inversion: commenting out `self.db.clear_all_file_deps()` at indexing.rs:139 produces SUM(ref_count) = 1 → 2 across re-index runs. The SUM assertion fires; the row-count assertions stay quiet (correctly: they would not catch this aggregate-growth class without the SUM check).

Per-gate verification:
- `cargo nextest run -p tethys --test reindex_cascade` 3/3 pass
- `cargo nextest run -p tethys` 639/639 pass (final integration check)
- `cargo clippy -p tethys --tests --all-features -- -D warnings` clean
- `cargo fmt --check` clean
- TDD-inversion (clear_all_file_deps disabled) fails the SUM assertion as predicted

This completes the rivets-wsix gilfoyle loop. The audit found zero new bugs (see `.rivets-wsix/what-i-learned.md`); these three regression fences lock in the existing cascade and clear_all safety mechanisms so future schema or indexing changes can't silently regress them.
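The aggregate-growth behaviour is easy to see in isolation. The sketch below uses Python's stdlib `sqlite3` and assumes a hypothetical `file_deps` layout (`from_file_id`, `to_file_id`, `ref_count`). The `DELETE FROM file_deps` stands in for `clear_all_file_deps`, whose real body is not shown in this PR, and the UPSERT mirrors the `ON CONFLICT DO UPDATE SET ref_count = ref_count + 1` clause quoted above. `make_db`, `index`, and `run_twice` are illustrative names.

```python
import sqlite3


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE file_deps (
            from_file_id INTEGER NOT NULL,
            to_file_id INTEGER NOT NULL,
            ref_count INTEGER NOT NULL DEFAULT 1,
            PRIMARY KEY (from_file_id, to_file_id)
        )
    """)
    return conn


def index(conn, deps, clear_first):
    """One indexing run over an unchanged workspace."""
    if clear_first:
        conn.execute("DELETE FROM file_deps")  # stand-in for clear_all_file_deps
    for from_id, to_id in deps:
        conn.execute(
            """
            INSERT INTO file_deps (from_file_id, to_file_id, ref_count)
            VALUES (?, ?, 1)
            ON CONFLICT (from_file_id, to_file_id)
            DO UPDATE SET ref_count = ref_count + 1
            """,
            (from_id, to_id),
        )


def run_twice(clear_first):
    """Return (row count, SUM(ref_count)) after each of two identical runs."""
    conn = make_db()
    deps = [(1, 2)]  # a single dependency, detected identically on every run
    snapshots = []
    for _ in range(2):
        index(conn, deps, clear_first)
        snapshots.append(
            conn.execute(
                "SELECT COUNT(*), SUM(ref_count) FROM file_deps"
            ).fetchone()
        )
    return snapshots


print(run_twice(True))   # [(1, 1), (1, 1)]
print(run_twice(False))  # [(1, 1), (1, 2)]
```

In the second run the row count stays at 1 while the sum moves from 1 to 2, so only a check on the aggregate separates the two cases.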
Code Review: PR #75, cascade-correctness regression fences (rivets-wsix)
Summary: Three integration tests added to
What the PR does well
Adversarial fixture design: each test is specifically shaped to defeat the most likely false-positive:
Content-hash-based re-index triggering: the implementation wisely relies on content hash changes rather than mtime manipulation (the plan.md mentioned
Explicit
Concurrency safety: each test uses its own
Issues to raise
Minor: Missing
The test asserts
    let c_refs = count_lib_refs_by_target_names(&conn, &["c"]);
    assert_eq!(
        c_refs, 1,
        "ref to helper::c() must survive cascade (it's still in source) — got {c_refs}"
    );
This is a minor gap, but since the test's stated purpose is verifying which specific refs survive, completing the pair strengthens the claim.
Worth noting: The test's query joins on
Non-issue, but plan vs. implementation divergence:
Test coverage assessment
The audit trail in
Verdict
Approve with the minor suggestion above. The |
Code Review
This pull request introduces a suite of regression tests and comprehensive documentation following an audit of the database re-indexing logic. The audit verified that the ON DELETE CASCADE schema and clear_all_X patterns effectively prevent the accumulation of stale data during re-indexing. A new integration test file, crates/tethys/tests/reindex_cascade.rs, was implemented to lock in these behaviors for symbols, attributes, call edges, and file dependencies. The PR also includes various diagnostic scripts and design documents detailing the audit's methodology and findings. I have no feedback to provide as there were no review comments.
Round 1 of review feedback. Per the `.rivets-wsix/review-decisions-round-1.md` decision log:
- **Accept (claim 1):** added explicit `c_refs == 1` assertion in `refs_cascade_on_call_removal`. Previously the third surviving ref was implicit (derivable from `refs_post − a_refs − b_refs = 2 − 1 − 0 = 1`). Spelling it out defends against future mutations to `count_lib_refs_by_target_names`'s IN-clause that would silently break that arithmetic. Symmetric with the existing `a_refs`/`b_refs` pattern.
- **Reject (claim 2):** `count_lib_refs_by_target_names` joins on `r.symbol_id` (callee) rather than `r.in_symbol_id` (cascade trigger). The reviewer themselves noted this is a future-maintainer note, not actionable. The indirection is intentional: the regression fence verifies the *effect* of the cascade (refs disappear), not the trigger mechanism. Changing to `r.in_symbol_id` would weaken the fence by testing how the cascade fires instead of whether it produces the correct end state.
- **Reject (claim 3):** plan.md / impl divergence on `filetime` vs. content-hash detection. Per CLAUDE.md "Issue diagnostic directories" convention, plan files are point-in-time, not maintained. `checkpointed-build` explicitly permits implementer deviation when the budget and oracle still hold; this is exactly that case.
Code Review, PR #75: cascade-correctness regression fences (rivets-wsix)
Summary: Three new integration tests locking in SQLite cascade-correctness invariants for the tethys re-index pipeline. Zero production code changes; audit found zero bugs. This is a fence-only PR.
Overall Assessment
This is a well-executed regression-fence PR. The test design is adversarial, the oracles are independent (direct SQL, not tethys API), and the prior review round already addressed the asymmetric
What works well
Adversarial fixture shapes
Independent oracle
SQLite connection management
Guard assertions
Observations (non-blocking)
Slice 1 oracle indirection
Fixture relies on Pass 2 cross-file resolution succeeding
Audit trail quality
The
Minor nit (truly optional)
Verdict
Approve. The three fences cover the three distinct invariants the wsix audit identified. The TDD-inversion verification was done before commit. CI passes. The prior review round's accepted finding (symmetric |
Summary
- Three integration tests in `crates/tethys/tests/reindex_cascade.rs` locking in the cascade-correctness mechanisms tethys relies on for re-index correctness.
- `.rivets-wsix/`) discovered the answer is "none: the schema cascade chain catches it everywhere else."
The audit's key finding
`refs`, `attributes`, and `symbols` all rely on the schema's `ON DELETE CASCADE` chain rooted at `symbols(id)`, triggered by the per-file `DELETE FROM symbols WHERE file_id = ?` in `files.rs::upsert_file`. This is the actual safety mechanism for re-index correctness on those tables, not the `clear_all_X` pattern lcb6 established for `file_deps` and `call_edges`. wsix's mental model ("UPSERT-only ⇒ needs clear_all_X") was a special case, not the general principle.
Full per-table inventory and probe results in `.rivets-wsix/what-i-learned.md`.
The three fences (one slice per design claim)
- (66fde6e) `refs_cascade_on_call_removal`: catches `refs.in_symbol_id ON DELETE CASCADE` silently relaxed. TDD-inversion: commenting out `DELETE FROM symbols WHERE file_id` at files.rs:145 → refs_post = 5 (3 stale + 2 new), test fails on `assert_eq!(refs_post, 2)`.
- (eb1762d) `attributes_cascade_on_symbol_removal`: catches `attributes.symbol_id ON DELETE CASCADE` silently relaxed. TDD-inversion: relaxing the FK to `ON DELETE NO ACTION` in schema.rs:108 → DELETE FROM symbols blocked by FK, test fails on `target_sym_post == 0`.
- (5041b42) `clear_all_tables_stable_under_reindex`: catches `clear_all_file_deps` removed from `index_with_options`. TDD-inversion: commenting out `self.db.clear_all_file_deps()` at indexing.rs:139 → SUM(file_deps.ref_count) 1→2 across re-index, test fails on the SUM assertion (the row-count assertions correctly stay quiet; only SUM catches aggregate growth).
All three TDD-inversions confirmed empirically before commit (per the gilfoyle checkpointed-build skill discipline + the v1.0.3 falsifier-non-vacuity self-review rule from gilfoyle's recent PR #1).
Stress fixtures designed adversarially
- `SUM(ref_count)` stability. Only the SUM check catches the original lcb6 UPSERT-aggregate-growth bug class; the row-count check alone would miss it.
Audit trail
The complete gilfoyle workflow output is committed at `.rivets-wsix/`:
- `related-issues.md`: tracker scan for prior art (5-min cap per prove-it-prototype step 0)
- `probe.sh`, `oracle.sh`, `probe_refs_bug.sh`, `probe_null_in_symbol.sh`: the empirical probes that contradicted my initial mental model
- `what-i-learned.md`: the one-sentence summary + per-table inventory + surprises
- `design.md`: falsifiable design with 3 claims + falsifier-non-vacuity self-review
- `plan.md`: budgeted plan with 3 slices, stress fixtures, and verification gates
Test plan
- `cargo nextest run -p tethys --test reindex_cascade` → 3/3 pass
- `cargo nextest run -p tethys` → 639/639 pass (full integration check)
- `cargo clippy --all-targets --all-features -- -D warnings` → clean
- `cargo fmt --check` → clean
What this PR does NOT do
.rivets-wsix/design.md.