Add SQLite + JSON1 backend for V_gamma document storage #122

Merged: stevevanhooser merged 4 commits into V2 from claude/did-matlab-v2-step3-JxBfW on May 12, 2026

@stevevanhooser
Contributor

Summary

Implements step 3 of the v2 architecture plan: a SQLite-backed storage engine with JSON1 query compilation. This adds the first concrete database backend for V_gamma documents, complementing the in-memory reference evaluator with a persistent, queryable store.

Key Changes

  • did2.database.sqlitedb — New storage backend class that:

    • Opens/creates SQLite databases via mksqlite with a v2-specific schema
    • Manages three core tables: documents (full JSON body + metadata), superclasses (class hierarchy for isa queries), and depends_on (dependency tracking)
    • Provides CRUD operations: add(), remove(), get(), has(), count(), allIds()
    • Implements search(query) and searchIds(query) with SQL pre-filtering + in-memory post-filtering
    • Includes schema validation on open to reject non-v2 databases
    • Supports Validate=false for bulk loads (unsafe_insert escape hatch)
  • did2.database.compileQuery — SQL compiler that translates did2.query objects to SQLite WHERE clauses:

    • Scalar operators (exact_string, contains_string, numeric comparisons) → json_extract() expressions
    • Array iteration ([*] paths) → EXISTS subqueries with nested json_each() joins
    • isa operator → indexed lookup on superclasses sidecar table
    • depends_on operator → indexed lookup on depends_on sidecar table with wildcard support
    • hasfield → json_type() checks (treats JSON null as present)
    • hasmember and array-of-structures predicates → json_each() expansion
    • Negation handling with NULL-safe guards for missing paths
    • Permissive pre-filters (1=1) for operators SQLite cannot express natively (regexp, multi-element exact_number)
  • Test coverage:

    • testCompileQuery.m — 20+ unit tests validating SQL output without requiring mksqlite
    • testSqliteDb.m — 25+ integration tests covering round-trip persistence, schema validation, CRUD, and query semantics across scalar/array/sidecar operators
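
The path-handling rules listed under compileQuery can be illustrated outside MATLAB. The Python snippet below is a minimal sketch under stated assumptions, not the code in compileQuery.m: the helper names (`compile_exact_string`, `_json_path`, `search_ids`), the `body` column name, and the `e0`, `e1` alias scheme are all hypothetical. It emits a json_extract() comparison for a plain dot path and an EXISTS subquery over chained json_each() calls for a path containing [*]. It assumes every array element along the path is a JSON object and omits error handling for other shapes.

```python
import json
import sqlite3


def _json_path(segment):
    """Turn a dot-path fragment such as 'a.b' or '.c' into a JSON1 path."""
    name = segment.strip(".")
    return "$." + name if name else "$"


def compile_exact_string(path, value, column="body"):
    """Compile an exact-string test on `path` into (where_sql, params).

    Hypothetical helper for illustration; `column` holds the JSON text.
    """
    segments = path.split("[*]")
    if len(segments) == 1:
        # Plain dot path: a single json_extract() comparison.
        return f"json_extract({column}, ?) = ?", [_json_path(path), value]

    sources, params, parent = [], [], column
    for i, seg in enumerate(segments[:-1]):
        # One json_each() per [*]; each one iterates inside the element
        # produced by the previous one.
        sources.append(f"json_each({parent}, ?) AS e{i}")
        params.append(_json_path(seg))
        parent = f"e{i}.value"

    leaf = segments[-1].strip(".")
    if leaf:
        test = f"json_extract({parent}, ?) = ?"
        params += ["$." + leaf, value]
    else:
        # Path ends in [*]: compare the array element itself.
        test = f"{parent} = ?"
        params.append(value)
    return f"EXISTS (SELECT 1 FROM {', '.join(sources)} WHERE {test})", params


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (id TEXT PRIMARY KEY, body TEXT)")
docs = {
    "d1": {"base": {"name": "alpha"},
           "runs": [{"tags": [{"label": "x"}, {"label": "y"}]}]},
    "d2": {"base": {"name": "beta"},
           "runs": [{"tags": [{"label": "z"}]}]},
}
conn.executemany(
    "INSERT INTO documents VALUES (?, ?)",
    [(doc_id, json.dumps(body)) for doc_id, body in docs.items()],
)


def search_ids(path, value):
    """Run the compiled WHERE clause and return matching ids in order."""
    where, params = compile_exact_string(path, value)
    rows = conn.execute(
        f"SELECT id FROM documents WHERE {where} ORDER BY id", params
    )
    return [row[0] for row in rows]


print(search_ids("base.name", "beta"))
print(search_ids("runs[*].tags[*].label", "y"))
```

Binding both the JSON paths and the comparison value as parameters keeps user-supplied strings out of the SQL text; whether the real compiler binds paths or inlines them is not stated in this PR.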

Implementation Details

  • The compiler is conservative by design: it emits over-approximations (e.g., 1=1 for regexp) and relies on the in-memory evaluator (did2.query.matches) as a correctness backstop, so the SQL layer only has to act as a pre-filter that never drops a true match.
  • Foreign-key cascades automatically clean up sidecar rows when documents are deleted.
  • The schema includes a meta table for version tracking; the constructor validates that opened files are v2 databases.
  • Path handling supports dot-notation with [*] array-iteration segments; nested arrays compose as cross-products of json_each joins.
  • All tests skip gracefully if mksqlite is not available.
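
The pre-filter plus backstop design described in the first point can be sketched as follows. This is an illustrative Python stand-in, not the MATLAB implementation: `compile_prefilter`, `matches`, and `search` are hypothetical names, `matches` only plays the role that did2.query.matches has in the PR, and the two-column `documents` table is a simplification of the real schema.

```python
import json
import re
import sqlite3


def compile_prefilter(field, op, value):
    """Hypothetical compiler: real SQL when expressible, '1=1' otherwise."""
    if op == "exact_string":
        return "json_extract(body, ?) = ?", ["$." + field, value]
    # No SQL form is emitted for regexp, so the pre-filter lets every row
    # through and leaves the decision to the in-memory evaluator.
    return "1=1", []


def matches(doc, field, op, value):
    """In-memory reference evaluator used as the correctness backstop."""
    actual = doc
    for key in field.split("."):
        if not isinstance(actual, dict) or key not in actual:
            return False
        actual = actual[key]
    if not isinstance(actual, str):
        return False
    if op == "exact_string":
        return actual == value
    if op == "regexp":
        return re.search(value, actual) is not None
    raise ValueError(f"unsupported operator: {op}")


def search(conn, field, op, value):
    """SQL narrows the candidates; the evaluator makes the final call."""
    where, params = compile_prefilter(field, op, value)
    rows = conn.execute(
        f"SELECT id, body FROM documents WHERE {where} ORDER BY id", params
    )
    return [doc_id for doc_id, body in rows
            if matches(json.loads(body), field, op, value)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (id TEXT PRIMARY KEY, body TEXT)")
conn.executemany("INSERT INTO documents VALUES (?, ?)", [
    ("d1", json.dumps({"base": {"name": "alpha"}})),
    ("d2", json.dumps({"base": {"name": "beta"}})),
    ("d3", json.dumps({"base": {"name": "alpine"}})),
])

print(search(conn, "base.name", "regexp", "^al"))
print(search(conn, "base.name", "exact_string", "beta"))
```

Because every candidate row is re-checked in memory, a weak pre-filter costs only performance, while an overly strict one would silently lose results. That asymmetry is why an over-approximation such as 1=1 is a safe choice.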

https://claude.ai/code/session_01JE5mYZsiyVBJYw55836w8h

Implements step 3 of docs/v2/PLAN.md §9 — the JSON1-fallback SQLite
backend that lets +did2 round-trip V_gamma documents end-to-end.

- src/did/+did2/+database/sqlitedb.m: V_gamma SQLite backend.
  Creates the documents / superclasses / depends_on tables from
  PLAN.md §3.1 (plus a meta key/value table that the constructor
  uses to reject non-v2 files). Foreign-key cascades clear the
  sidecar rows on remove. add/get/remove/has/count/allIds and
  search/searchIds; Validate=false skips schema validation for
  bulk loads.
- src/did/+did2/+database/compileQuery.m: did2.query -> SQL WHERE
  clause + bound parameters, JSON1 fallback only. json_extract
  for scalar leaves, EXISTS over chained json_each for [*] paths,
  the superclasses / depends_on sidecar tables for isa /
  depends_on, NULL-guarded negation so missing paths flip to true
  under ~. regexp and multi-element exact_number emit a permissive
  1=1; sqlitedb.search runs the in-memory evaluator over the SQL
  result set as a correctness backstop.
- tests/+did2/+unittest/testCompileQuery.m: string-based unit
  tests over the compiler output (no mksqlite required).
- tests/+did2/+unittest/testSqliteDb.m: integration tests that
  round-trip documents through a real SQLite file. Filters
  itself out via assumeFail when mksqlite is not on the path.
- Updates Contents.m and the progress log in docs/v2/PLAN.md.
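
The NULL-guarded negation mentioned for compileQuery.m addresses a three-valued-logic pitfall: json_extract() returns SQL NULL for a missing path, and NOT applied to a NULL comparison is still NULL, so the row is dropped. The Python sketch below shows the effect with the stdlib sqlite3 module. The guard expression is one plausible form and an assumption; the exact SQL emitted by compileQuery is not shown in this PR, and the `ids` helper is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (id TEXT PRIMARY KEY, body TEXT)")
conn.executemany("INSERT INTO documents VALUES (?, ?)", [
    ("d1", '{"x": "a"}'),
    ("d2", '{"x": "b"}'),
    ("d3", '{"y": "a"}'),  # path $.x is missing in this document
])


def ids(where, params):
    """Return the ids selected by a WHERE clause, in id order."""
    rows = conn.execute(
        f"SELECT id FROM documents WHERE {where} ORDER BY id", params
    )
    return [row[0] for row in rows]


# Naive negation: for d3 the comparison is NULL, NOT NULL is NULL,
# and WHERE treats NULL as false, so d3 is lost.
naive = ids("NOT (json_extract(body, ?) = ?)", ["$.x", "a"])

# Guarded negation: a missing path counts as "does not equal 'a'".
guarded = ids(
    "(json_extract(body, ?) IS NULL OR NOT (json_extract(body, ?) = ?))",
    ["$.x", "$.x", "a"],
)

print(naive)    # ['d2']
print(guarded)  # ['d2', 'd3']
```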

Lets compileQuery(did2.query('x','regexp','y')) return '1=1' verbatim
(rather than '(1=1)'), which is what testCompileQuery's regexp test
asserts. Multi-element conjunctions still get the '(a) AND (b)' wrap.
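
The wrapping rule from this commit can be expressed as a small joiner. The Python function below is a hypothetical sketch of that behavior, not a port of the MATLAB code; in particular, returning '1=1' for an empty clause list is an assumption made so that the result is always valid SQL.

```python
def join_and(clauses):
    """Join WHERE fragments, parenthesizing only when there are several."""
    if not clauses:
        return "1=1"  # assumption: an empty conjunction matches everything
    if len(clauses) == 1:
        return clauses[0]  # a single clause is returned verbatim
    return " AND ".join(f"({clause})" for clause in clauses)


print(join_and(["1=1"]))     # 1=1
print(join_and(["a", "b"]))  # (a) AND (b)
```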
@stevevanhooser merged commit f648419 into V2 on May 12, 2026
3 checks passed