feat(doctor): detect + repair index-integrity corruption in doctor --fix #64
Merged
…-fix`

A file→package refactor (e.g. `m.py` → `m/__init__.py`, same module qualname) leaves a stale file entity and a dangling `contains` edge in the cumulative index. The SEI orphan pass only retires them *after* phase3's parent/contains flush, so `analyze` aborts with `LMWV-INFRA-PARENT-CONTAINS-MISMATCH` and there was no recovery short of nuking the index. (Root-cause writer fix tracked in clarion-abda98c869; this is the recovery surface.)

`loomweave doctor` now runs an `index.integrity` check:
- detects stale file entities (a `core:file:*` whose path is gone from disk) and both directions of the parent/contains invariant (ADR-026 decision 2);
- `--fix` removes each stale file entity and everything anchored to it (`source_file_id`); edges/tags/taint/caches cascade. The delete runs under `defer_foreign_keys = ON` and nulls the four NO-ACTION FK columns into `entities(id)` that do not cascade (`entities.parent_id`/`.source_file_id`, `edges.source_file_id`, `entity_unresolved_call_sites.source_file_id`) so a surviving row is never left dangling at commit;
- residual violations not attributable to a stale file are reported with an `analyze --no-incremental` rebuild recommendation.

New `loomweave-storage::integrity` module (`check_integrity` / `repair_integrity`) with 4 integration tests, incl. a regression for the cross-file `edges.source_file_id` dangling case. Wired into doctor's text + JSON paths, gated on a healthy/migrated DB.

Verified on the live elspeth index: detected 9 stale file entities + 2 parent/contains mismatches; `--fix` removed them (169 entities total); `analyze` then completed (44076 entities, 147044 edges) where it previously aborted.

Relates to clarion-abda98c869; implements clarion-ae7e48003a

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
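The detection half of the `index.integrity` check described in the commit message can be sketched roughly as follows. This is an illustration only, written in Python against an in-memory SQLite database rather than the Rust `loomweave-storage::integrity` module. The table layout (`entities.path`, `edges.kind`/`from_id`/`to_id`), the `py:module:m` id, and the function names are all assumptions made for the example; the PR does not show the real schema.

```python
import os
import sqlite3
import tempfile

# Hypothetical, simplified schema; the real loomweave tables are not shown in this PR.
SCHEMA = """
CREATE TABLE entities (
    id        TEXT PRIMARY KEY,
    path      TEXT,                          -- assumed: set for file entities only
    parent_id TEXT REFERENCES entities(id)
);
CREATE TABLE edges (
    kind    TEXT NOT NULL,
    from_id TEXT NOT NULL REFERENCES entities(id),
    to_id   TEXT NOT NULL REFERENCES entities(id)
);
"""


def stale_file_entities(conn, root):
    """Return ids of core:file:* entities whose path no longer exists under root."""
    rows = conn.execute(
        "SELECT id, path FROM entities WHERE id LIKE 'core:file:%' ORDER BY id"
    ).fetchall()
    return [eid for eid, path in rows if not os.path.isfile(os.path.join(root, path))]


def parent_contains_mismatches(conn):
    """Check both directions of the parent/contains invariant (read-only)."""
    # Direction 1: an entity names a parent, but no matching contains edge exists.
    parent_without_edge = conn.execute(
        """SELECT e.id FROM entities e
           WHERE e.parent_id IS NOT NULL
             AND NOT EXISTS (SELECT 1 FROM edges c
                             WHERE c.kind = 'contains'
                               AND c.from_id = e.parent_id AND c.to_id = e.id)
           ORDER BY e.id"""
    ).fetchall()
    # Direction 2: a contains edge exists, but the child's parent_id disagrees.
    # `IS NOT` is SQLite's NULL-safe inequality.
    edge_without_parent = conn.execute(
        """SELECT c.from_id, c.to_id FROM edges c
           JOIN entities e ON e.id = c.to_id
           WHERE c.kind = 'contains' AND e.parent_id IS NOT c.from_id
           ORDER BY c.from_id, c.to_id"""
    ).fetchall()
    return [row[0] for row in parent_without_edge], edge_without_parent


def demo():
    """Model the m.py -> m/__init__.py refactor: only the package exists on disk."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "m"))
        open(os.path.join(root, "m", "__init__.py"), "w").close()
        conn = sqlite3.connect(":memory:")
        conn.executescript(SCHEMA)
        conn.executemany("INSERT INTO entities VALUES (?, ?, ?)", [
            ("core:file:m.py", "m.py", None),                    # stale leftover
            ("core:file:m/__init__.py", "m/__init__.py", None),  # live file
            ("py:module:m", None, "core:file:m/__init__.py"),    # same qualname
        ])
        conn.executemany("INSERT INTO edges VALUES (?, ?, ?)", [
            ("contains", "core:file:m.py", "py:module:m"),           # dangling
            ("contains", "core:file:m/__init__.py", "py:module:m"),  # current
        ])
        result = stale_file_entities(conn, root), parent_contains_mismatches(conn)
        conn.close()
        return result
```

Calling `demo()` on this toy index flags `core:file:m.py` as stale and reports the leftover `contains` edge from it as a mismatch, while the edge from the live package file passes both directions of the check.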
What
`loomweave doctor` gains an `index.integrity` check, and `doctor --fix` repairs the corruption that aborts `analyze` at phase3 with `LMWV-INFRA-PARENT-CONTAINS-MISMATCH`. This is the recovery surface for clarion-abda98c869.

A file→package refactor (`m.py` → `m/__init__.py`, same module qualname) leaves a stale file entity + a dangling `contains` edge in the cumulative index. The SEI orphan pass only retires them after phase3's parent/contains flush, so the run aborts and there was no recovery short of nuking the index.

Detection (read-only)
- Stale file entities: a `core:file:*` whose source path is gone from disk.
- Both directions of the parent/contains invariant (ADR-026 decision 2).

Repair (`--fix`)
Removes each stale file entity + everything anchored to it (`source_file_id`); edges/tags/taint/caches cascade. Deletes under `defer_foreign_keys = ON` and nulls the four NO-ACTION FK columns into `entities(id)` that do not cascade (`entities.parent_id`, `entities.source_file_id`, `edges.source_file_id`, `entity_unresolved_call_sites.source_file_id`) so no surviving row dangles at commit. Residual corruption is reported with an `analyze --no-incremental` rebuild recommendation.

Verified on the live elspeth index
- Detected 9 stale file entities + 2 parent/contains mismatches.
- `--fix`: removed 9 stale file entities (169 entities total); index is now consistent.
- `analyze` then completed (44076 entities, 147044 edges) where it previously aborted at phase3.

The first naive cut hit `FOREIGN KEY constraint failed` at commit (81 dangling `edges.source_file_id` between surviving endpoints); this was fixed by the null-out pass and pinned by a regression test.

Implementation
`loomweave-storage::integrity` (`check_integrity` / `repair_integrity`) + 4 integration tests (detect, repair+restore, dangling-edge regression, healthy no-op).

Tests
Full floor green: 1854 nextest (+4), clippy `-D warnings`, doc, fmt; Python 215.

Relates to clarion-abda98c869; implements clarion-ae7e48003a
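To make the null-out pass concrete, here is a minimal Python sketch of the repair transaction. It is not the Rust implementation from this PR. The three-column `entities` and `edges` tables, the entity ids, and the `repair`/`build` helpers are assumptions for illustration, and the fourth column named above (`entity_unresolved_call_sites.source_file_id`) is left out to keep the example short. What it relies on is standard SQLite behaviour: `PRAGMA defer_foreign_keys = ON` postpones foreign-key checks to `COMMIT`, and `ON DELETE CASCADE` actions still fire during the delete.

```python
import sqlite3

# Hypothetical, simplified schema. Columns without an ON DELETE clause use
# SQLite's default NO ACTION, so they do not cascade when the parent row goes.
SCHEMA = """
CREATE TABLE entities (
    id             TEXT PRIMARY KEY,
    parent_id      TEXT REFERENCES entities(id),
    source_file_id TEXT REFERENCES entities(id)
);
CREATE TABLE edges (
    from_id        TEXT NOT NULL REFERENCES entities(id) ON DELETE CASCADE,
    to_id          TEXT NOT NULL REFERENCES entities(id) ON DELETE CASCADE,
    source_file_id TEXT REFERENCES entities(id)
);
"""

# The NO-ACTION columns that must be nulled before the delete (subset of four).
NO_ACTION_COLUMNS = [
    ("entities", "parent_id"),
    ("entities", "source_file_id"),
    ("edges", "source_file_id"),
]


def repair(conn, stale_file_id, null_out=True):
    """Delete a stale file entity and everything anchored to it in one transaction."""
    conn.execute("BEGIN")
    try:
        # Defer FK checks to COMMIT; SQLite resets this when the transaction ends.
        conn.execute("PRAGMA defer_foreign_keys = ON")
        doomed = [row[0] for row in conn.execute(
            "SELECT id FROM entities WHERE id = ? OR source_file_id = ?",
            (stale_file_id, stale_file_id))]
        marks = ",".join("?" * len(doomed))
        if null_out:
            # Clear references from surviving rows so nothing dangles at commit.
            for table, column in NO_ACTION_COLUMNS:
                conn.execute(
                    f"UPDATE {table} SET {column} = NULL WHERE {column} IN ({marks})",
                    doomed)
        conn.execute(f"DELETE FROM entities WHERE id IN ({marks})", doomed)
        conn.execute("COMMIT")  # deferred FK violations surface here
    except sqlite3.Error:
        conn.execute("ROLLBACK")
        raise
    return len(doomed)


def build():
    """Toy index: one stale file plus a cross-file edge between survivors."""
    conn = sqlite3.connect(":memory:", isolation_level=None)  # explicit BEGIN/COMMIT
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO entities VALUES (?, ?, ?)", [
        ("core:file:m.py", None, None),                            # stale file
        ("core:file:a.py", None, None),
        ("py:module:m.old", "core:file:m.py", "core:file:m.py"),   # anchored to it
        ("py:function:a.f", "core:file:a.py", "core:file:a.py"),
        ("py:function:a.g", "core:file:a.py", "core:file:a.py"),
    ])
    conn.executemany("INSERT INTO edges VALUES (?, ?, ?)", [
        ("core:file:m.py", "py:module:m.old", "core:file:m.py"),    # cascades away
        ("py:function:a.f", "py:function:a.g", "core:file:m.py"),   # survivor edge
    ])
    return conn
```

With `null_out=False` the commit in this sketch raises `sqlite3.IntegrityError` (`FOREIGN KEY constraint failed`), because the edge between the two surviving functions still points at the deleted file through `edges.source_file_id`. With the default `null_out=True` the same edge is kept with a NULL `source_file_id` and the commit succeeds.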
🤖 Generated with Claude Code