Skip to content

did2: rename depends_on(k).value → document_id end-to-end#136

Merged
stevevanhooser merged 2 commits into
V2from
claude/did-schema-52-rename-depends-on-document-id
May 21, 2026
Merged

did2: rename depends_on(k).value → document_id end-to-end#136
stevevanhooser merged 2 commits into
V2from
claude/did-schema-52-rename-depends-on-document-id

Conversation

@stevevanhooser
Copy link
Copy Markdown
Contributor

Companion to waltham-data-science/did-schema#52 (merged). V_delta canonical depends_on entry shape is now {name, document_id}. This PR carries the rename through every did2 surface: universal-rename pass, on-disk SQL sidecar, query/validator/cache code, and tests.

Strategy

Writers always produce document_id. Readers tolerate document_id / value / id as synonyms so a body at any stage of the migration pipeline (raw v1, earlier V_delta draft, current V_delta) still has its references found. Precedence when more than one is populated: document_id > value > id.

What changed

Source

  • universalRenames.renameDependsOnEntries — rewritten to accept all three input shapes and emit document_id. version always dropped (V_delta has no per-document version branches).
  • +did2/+database/sqlitedb.m:
    • createSchema: depends_on table declares document_id column (was value); index renamed.
    • INSERT INTO depends_on writes the new column name.
    • dependsOnEntries sidecar-row builder reads synonyms.
    • assertSchema gained migrateDependsOnValueToDocumentId — one-shot introspect-and-ALTER on existing databases that still carry the old value column. pragma_table_info('depends_on') checks the actual column name; if value is present and document_id is not, runs DROP INDEX … RENAME COLUMN … CREATE INDEX inside a BEGIN/COMMIT. Idempotent on already-migrated DBs (early-returns when document_id is already present).
  • +did2/+database/compileQuery.mcompileDependsOn emits d.document_id = ? predicates. Top-of-file SQL signature comment updated.
  • +did2/query.m — in-memory opDependsOn reads synonyms.
  • +did2/+validate/references.m — orphan-report field edge_valueedge_document_id; iterator reads synonyms; doc updated.
  • +did2/+schema/cache.m::buildBlankDocument — blank doc's empty depends_on carries the new field name.

Tests

testConvertV1ToV2, testFromV1Database, testMigrators, testQuery, testReaders, testSchemaCache, testSqliteDb, testValidateReferences — depends_on construction + assertions updated. Test names like testUniversalRenamesPromotesDependsOnIdToDocumentId reflect the new canonical name.

Existing on-disk databases

First open after this PR lands invokes ALTER TABLE depends_on RENAME COLUMN value TO document_id + index swap. SQLite ≥ 3.25 (which mksqlite ships with) supports RENAME COLUMN natively. No JSON re-write needed for the documents.body blobs since did2.convert.v1_to_v2 already normalizes those on read.

Dependencies

Test plan

  • Migrator-level + query/validator/sqlitedb unit tests run in CI.
  • Corpus tests (PRED, 20211116, B, Dab, JH, Soph) exercise the full universalRenames → migrators → validator pipeline with the new key.
  • The DB-migration code path runs on every assertSchema call from existing test fixtures; the early-return on already-migrated DBs makes the test-suite re-runnable.

Unblocks

VH-Lab/NDI-matlab#801 — once this lands, the NDI-matlab compat-layer audit can resume with the right canonical name.


Generated by Claude Code

claude added 2 commits May 21, 2026 13:14
Closes #N/A (companion to Waltham-Data-Science/DID-schema#52, which
made the V_delta-side rename and merged to did-schema main).

V_delta's canonical depends_on entry shape is now
{name, document_id}. did-schema#52 settled the rename rationale
(`value` was generic but uninformative; every entry's value is
always a did_uid referring to another document). This PR carries
the rename through every did2 surface: the universal-rename pass,
the on-disk SQL sidecar table, the query/validator/cache code,
and the test corpora.

Compat strategy: each on-the-wire reader/parser tolerates the
earlier draft key `value` and the raw v1 key `id` as synonyms for
`document_id`, so a body at any stage of the migration pipeline
still finds its references. The writer always produces
`document_id`.

Changes:

- src/did/+did2/+convert/universalRenames.m::renameDependsOnEntries
  rewritten to accept document_id / value / id on input (precedence
  document_id > value > id), produces document_id, drops legacy
  keys.

- src/did/+did2/+database/sqlitedb.m:
  * createSchema: depends_on table now declares `document_id`
    column (was `value`) and the index is renamed accordingly.
  * INSERT INTO depends_on writes the `document_id` column.
  * dependsOnEntries iterator (the sidecar-row builder used at
    write time) reads document_id / value / id synonyms.
  * assertSchema gained migrateDependsOnValueToDocumentId: one-shot
    introspect-and-ALTER on existing databases that still carry
    the old `value` column. Wrapped in BEGIN/COMMIT so a torn
    migration rolls back; idempotent on already-migrated DBs
    (early-returns if document_id is already present).

- src/did/+did2/+database/compileQuery.m: compileDependsOn now
  emits `d.document_id = ?` predicates against the renamed
  sidecar column. Top-of-file SQL signature comment updated.

- src/did/+did2/query.m: opDependsOn (in-memory evaluator) reads
  document_id / value / id synonyms.

- src/did/+did2/+validate/references.m: documentation, orphan
  report field `edge_value` -> `edge_document_id`, dependsOnEntries
  iterator reads synonyms.

- src/did/+did2/+schema/cache.m::buildBlankDocument: the empty
  depends_on struct array now carries field `document_id`.

- Tests: testConvertV1ToV2 (universal-rename behaviour),
  testFromV1Database, testMigrators (calcCommon depends_on drop),
  testQuery (depends_on op), testReaders, testSchemaCache (blank
  doc), testSqliteDb (search by depends_on),
  testValidateReferences (orphan field + setDependsOn helper)
  all updated. Names like
  testUniversalRenamesPromotesDependsOnIdToDocumentId reflect the
  new canonical name.

Existing sqlite databases on disk: on first open after this PR
lands, the depends_on table's value column is renamed in place
via ALTER TABLE RENAME COLUMN (SQLite >= 3.25, satisfied by
mksqlite's bundled SQLite). The index is dropped + recreated so
queries keep using it. No JSON re-write needed since the doc
bodies in the documents.body column are kept in sync by
did2.convert.v1_to_v2 on read normalization.
Two follow-ups to the depends_on -> document_id rename:

1. testCompileQuery: testDependsOnUsesSidecar and testDependsOnWildcard
   still asserted the old `d.value = ?` SQL substring. Updated both
   to expect `d.document_id = ?` (matching the rewritten
   compileDependsOn output).

2. sqlitedb.migrateDependsOnValueToDocumentId: the migration's column
   introspection used `pragma_table_info('depends_on')`, which the
   codebase already documents as unreliable on the CI mksqlite +
   sqlite combo (see currentQueryableColumns() comment block on
   line ~348). Symptoms in CI: testSqliteDb/testReopenExistingDatabase,
   testRebuildPreservesDataOnSchemaMismatch, and
   testSidecarReconcileRepopulates all errored, plus three
   testFromV1Database round-trip tests that go through assertSchema
   on existing databases.

   Replaced with the zero-row SELECT probe pattern
   currentQueryableColumns uses (new helper:
   dependsOnHasColumn(name)). Same migration logic, reliable probe.
@stevevanhooser stevevanhooser merged commit cec070a into V2 May 21, 2026
3 checks passed
@stevevanhooser stevevanhooser deleted the claude/did-schema-52-rename-depends-on-document-id branch May 21, 2026 14:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants