V_delta: drop redundant status fields; restore v1 session_in_a_dataset shape by stevevanhooser · Pull Request #44 · Waltham-Data-Science/DID-schema

stevevanhooser · 2026-05-15T18:41:39Z

Driven by the B-corpus discovery (12,917 v1 docs, 18 classes, 8708 quarantined under the previous schemas). Every quarantined class is addressed here on the schema side — no paired did-matlab change required.

Design point

Confirmed with the maintainer: V_delta documents are immutable. "Did this happen?" tracking fields are therefore redundant — the existence of the document is the state. Five required fields that violated this principle are dropped.

Dropped fields

schema	field dropped	reason
`epochfiles_ingested`	`ingestion_status` (required), `num_files_ingested` (derived)	presence = ingested
`daqreader_mfdaq_epochdata_ingested`	`ingestion_status` (required)	presence = ingested
`daqmetadatareader_epochdata_ingested`	`ingestion_status` (required)	presence = ingested
`syncrule_mapping`	`mapping_status` (required)	presence = mapped
`dataset_remote`	`remote_url` (required)	not needed; v1 has `dataset_id` + `organization_id`

After the drops, these classes either become marker-style records with zero own fields, or retain only their genuinely-content fields (syncrule_mapping keeps mapping_data; dataset_remote keeps remote_type and dataset_id).

`session_in_a_dataset` restored to v1 shape

The earlier V_delta draft had stripped this class down to a required dataset_id (char) plus a required depends_on[session_id] edge. Both choices were wrong, and the draft also lost v1 metadata:

dataset_id is intrinsically base.session_id. In v1's storage model, the document is inside a dataset, and the dataset's identity is its session-id, which lives on every contained document's base.session_id. So dataset_id was duplicating base.session_id without new information.
v1 stored session_id inline as a property-block field, not as a depends_on edge.
v1 also carried session-reconstitution metadata that V_delta had dropped: session_reference (recording-date label), is_linked, session_creator (e.g., "ndi.session.dir"), and session_creator_input1..6.

The rebuilt schema mirrors v1 exactly: session_id (did_uid, required, inline) + session_reference + is_linked + session_creator + session_creator_input1..6.
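
As a rough illustration of the rebuilt shape, the sketch below checks a property block against the field list above. The function name `check_session_in_a_dataset` and the plain-dict representation are hypothetical and are not part of DID-schema or did-matlab. Treating every field other than session_id as optional is also an assumption, since the description only marks session_id as required.

```python
# Hypothetical sketch: field names come from the PR text, everything else is assumed.
ALLOWED_FIELDS = (
    ["session_id", "session_reference", "is_linked", "session_creator"]
    + [f"session_creator_input{i}" for i in range(1, 7)]
)


def check_session_in_a_dataset(block: dict) -> list:
    """Return a list of problems found in a session_in_a_dataset property block."""
    problems = []
    # session_id is the only field the PR text marks as required.
    if not block.get("session_id"):
        problems.append("missing required field: session_id")
    # Anything outside the v1 field set (for example the dropped dataset_id) is flagged.
    for key in block:
        if key not in ALLOWED_FIELDS:
            problems.append(f"undeclared field: {key}")
    return problems
```

Under these assumptions, a block shaped like the earlier draft (only `dataset_id`) fails on both counts, while a v1-shaped block with an inline `session_id` passes.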

Verification

Meta-schema validation passes on all six touched schemas.
Existing pytest suite stays at 96/96 passing (V_beta + V_gamma coverage).
index.json and topics.json need no edits (they only carry class-level metadata, not field defs).

Projected impact on the discovery corpora (via did-matlab simulator)

corpus	before	after
PRED	14/14	14/14
20211116	1220/1220	1220/1220
B	4209/12917	12917/12917

Coordination

No paired did-matlab PR needed. The converter's universal-rename pass plus the dispatcher empty-block pad already handle every v1 doc that previously quarantined here. Adding a third corpus fixture (B.zip) to did-matlab's discovery tests is a separate follow-up on that side.

Out of scope (still pending)

did2.validate.references for depends_on referential integrity at DB-ingest time.
Step 7 NDI-matlab port.

Generated by Claude Code

…t shape

Driven by the B-corpus discovery report (12,917 v1 docs across 18 classes, 8708 quarantined under the previous schemas). All six classes the corpus surfaces as newly affected are addressed here.

Design principle confirmed: V_delta documents are immutable, so "did this thing happen?" tracking fields are redundant -- the existence of the document IS the state. Five required fields that violated this are dropped.

Dropped fields:

epochfiles_ingested.ingestion_status (required)
epochfiles_ingested.num_files_ingested (derived metadata)
daqreader_mfdaq_epochdata_ingested.ingestion_status (required)
daqmetadatareader_epochdata_ingested.ingestion_status (required)
syncrule_mapping.mapping_status (required)
dataset_remote.remote_url (required)

After the drops these classes either have zero own fields (marker-style records whose presence is the entire signal) or retain only their genuinely-content fields (e.g., syncrule_mapping keeps `mapping_data`; dataset_remote keeps `remote_type` and `dataset_id`).

session_in_a_dataset rebuilt to mirror v1 verbatim:

before: required `dataset_id` field + required depends_on[session_id]
after: session_id (did_uid, required, inline) + session_reference + is_linked + session_creator + session_creator_input1..6

The "dataset_id" concept is intrinsically `base.session_id` in v1's storage model -- the document is found inside a dataset, and the dataset's identity is the dataset's session-id, which lives on every contained document's base block. So the V_delta-only `dataset_id` field was duplicating base.session_id without adding new information. Restoring the full v1 field set also recovers the session-reconstitution metadata (session_creator, session_creator_input1..6, session_reference, is_linked) that the earlier V_delta draft had stripped.

Projection on the three discovery corpora (Python simulator):

PRED 14/14 migrated (no change)
20211116 1220/1220 migrated (no change)
B 12917/12917 migrated (was 4209/12917)

No paired did-matlab change is required: the converter's universal-rename pass plus the dispatcher empty-block pad already handle every v1 doc that previously quarantined here. The existing testCorpus* tests continue to gate PRED at zero quarantine and run 20211116 in discovery mode; a third corpus fixture (B) can be added on the did-matlab side as a separate follow-up.

Driven by the JH corpus surfacing the multi-clock case: v1 documents that record the same element epoch in multiple clock frames (e.g., `epoch_clock = "dev_local_time,exp_global_time"`, `t0_t1 = [[a,b],[c,d]]`). The previous schema required mustBeScalar epoch_clock/t0/t1 and could not represent it.

Replaces the three flat scalar fields with a single `element_epoch.clocks` array-of-records:

clocks(i).name	char	required	per-clock identifier
clocks(i).t0	double	required	start time in clocks(i) units
clocks(i).t1	double	required	stop time in clocks(i) units

Single-clock documents (PRED, 20211116, B corpora) migrate to a 1-element array; multi-clock JH documents migrate to a 2+ element array. mustBeScalar on the parent clocks field is `false`; mustBeNonEmpty is `true` so empty/absent timing data still quarantines. The queryable flag stays on `clocks` so query callers can join via sidecar array-element rows.

No paired ngrid/epochclocktimes changes here. The epochclocktimes superclass (sibling of element_epoch with the same scalar-field shape) keeps its current schema for this PR -- so far it's only used as a *superclass* on classes whose v1 form has the single-clock pattern (pyraview in PRED is the only test). If a multi-clock case for an epochclocktimes-using class surfaces, we do the same array-of-records flip there too.

Projection on the 20211116, B, and JH corpora after this commit (paired with the did-matlab migrator update):

PRED 14/14 migrated (no change)
20211116 1220/1220 migrated (no change)
B 12917/12917 migrated (no change)
JH 67172/78688 migrated (was 63016; element_epoch +4156)
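
The flat-to-array conversion described in this commit can be sketched roughly as follows. This is not the did-matlab migrator: the function name `migrate_clocks` is hypothetical, and the handling of a single-clock document as a flat `[t0, t1]` pair is an assumption about how v1 stores that case.

```python
def migrate_clocks(epoch_clock: str, t0_t1) -> list:
    """Turn v1 epoch_clock / t0_t1 values into a clocks array-of-records."""
    # v1 packs clock names into one comma-separated string.
    names = [n.strip() for n in epoch_clock.split(",") if n.strip()]
    # Assumption: a single-clock doc may carry a flat [t0, t1] pair
    # instead of a list of pairs, so wrap it to get one row per clock.
    if len(t0_t1) == 2 and not isinstance(t0_t1[0], (list, tuple)):
        pairs = [t0_t1]
    else:
        pairs = list(t0_t1)
    # Mirrors mustBeNonEmpty: empty or mismatched timing data is rejected.
    if not names or len(names) != len(pairs):
        raise ValueError("clock names and t0_t1 pairs do not line up")
    return [
        {"name": name, "t0": float(pair[0]), "t1": float(pair[1])}
        for name, pair in zip(names, pairs)
    ]
```

Under these assumptions a single-clock document yields a 1-element list and a two-clock JH-style document yields two records, matching the migration outcome the commit describes.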

The JH corpus surfaces 2078 v1 position_metadata documents, all purely descriptor data (no x/y/z values, no files block). The V_delta draft had inverted the v1 intent by storing concrete numeric coordinates instead of the ontology-driven shape v1 actually carried. Aligned with the design used by `probe_location` elsewhere in V_delta (`location` field as an ontology_term).

Rewrites position_metadata to mirror the v1 shape:

measurement (ontology_term) required: The v1 `ontologyNode` CURIE verbatim, paired with a human-readable name resolved via ndi.ontology.lookup. Classifies *what kind of position* the linked element records (e.g., a midpoint position, a probe tip).
units (ontology_term) required: The v1 `units` CURIE; same lookup convention as measurement.
dimensions (structure-array) optional: One record per spatial axis. Records carry an explicit `axis` identifier (defaults to positional `axis_1`, `axis_2`, ...) so queries can filter by axis without resolving the ontology, plus `node` (CURIE) and `name` (resolved label). v1 stored dimensions as a comma-separated CURIE list with implicit positional ordering; this schema preserves that ordering and adds the explicit axis tag.

Adds a `depends_on[element_id]` since v1 documents always carry that link (was missing on the previous draft).

Migrator-side changes land in the paired did-matlab commit.

Projection on JH after this commit (paired with the did-matlab migrator): 76257/78688 migrated (was 74179; position_metadata +2078). Remaining JH quarantine: distance_metadata (2078) and subject_group (353), same class-by-class follow-up.
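
A minimal sketch of the `dimensions` conversion, assuming the v1 value is a plain comma-separated string of CURIEs. The function name `migrate_dimensions` is hypothetical, the `lookup` argument is only a stand-in for the name resolution the commit attributes to ndi.ontology.lookup, and the CURIEs in the usage note are placeholders rather than real ontology terms.

```python
def migrate_dimensions(curie_list: str, lookup=lambda curie: "") -> list:
    """Split a v1 comma-separated CURIE list into per-axis records."""
    # Positional order in the v1 string is preserved.
    curies = [c.strip() for c in curie_list.split(",") if c.strip()]
    return [
        # The explicit axis tag defaults to the 1-based position.
        {"axis": f"axis_{i}", "node": curie, "name": lookup(curie)}
        for i, curie in enumerate(curies, start=1)
    ]
```

For example, `migrate_dimensions("EX:0001, EX:0002")` gives two records tagged `axis_1` and `axis_2`, so a query can filter on `axis` without resolving either CURIE.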

The JH corpus surfaces 2078 v1 distance_metadata documents whose content is paired A/B endpoint metadata, not a scalar distance. Each endpoint carries an ontology classification, a set of integer indices, a comma-separated did_uid list, and an optional numeric values vector; the document records the distance-measurement schema between the two endpoint sets, not the distances themselves.

Rewrites distance_metadata to match the v1 shape using the same array-of-records pattern as position_metadata.dimensions:

endpoints: structure-array
  endpoints(i).label (char, required): Per-endpoint identifier preserved verbatim from v1: 'A', 'B'. Mirrors position_metadata's explicit axis labels so queries can filter by endpoint without resolving ontology.
  endpoints(i).measurement (ontology_term, required): v1 ontologyNode_X CURIE + ndi.ontology.lookup name.
  endpoints(i).integer_ids (matrix of integer, optional): v1 integerIDs_X verbatim. Often a scalar for endpoint A, a multi-element vector for endpoint B.
  endpoints(i).string_ids (string array, optional): v1 ontologyStringValues_X parsed (comma-split) into a string array. Each entry is typically a did_uid pointing at another document.
  endpoints(i).numeric_values (matrix of double, optional): v1 ontologyNumericValues_X. Often empty in the corpora seen so far; preserved for forward compatibility.
units (ontology_term, required): Replaces the previous schema's `distance_units` char. v1 already stored this as a CURIE.

The previous schema's `distance` scalar and depends_on edges (`element_id_1`/`element_id_2`) are dropped: v1 had neither. depends_on becomes a single `element_id` matching the v1 idiom.

Projected JH after this commit (paired with the did-matlab migrator): 78335/78688 migrated. Remaining: subject_group (353) and Dab stimulus_bath (1605), unchanged.
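
The endpoint pairing can be sketched as below, reading the `_X` suffix in the v1 field names as `_A` and `_B`. The function name `migrate_endpoints` is hypothetical, `lookup` is a stand-in for ndi.ontology.lookup, and representing an ontology_term as a `{"node", "name"}` dict is an assumption based on the `node`/`name` wording used for position_metadata.

```python
def migrate_endpoints(v1: dict, lookup=lambda curie: "") -> list:
    """Build the endpoints array-of-records from suffixed v1 fields."""
    endpoints = []
    for label in ("A", "B"):
        node = v1[f"ontologyNode_{label}"]
        record = {
            "label": label,
            # Assumed ontology_term layout: CURIE plus resolved name.
            "measurement": {"node": node, "name": lookup(node)},
        }
        # integer_ids is carried over verbatim (scalar or vector).
        ids = v1.get(f"integerIDs_{label}")
        if ids is not None:
            record["integer_ids"] = ids
        # string_ids is the comma-split form of the v1 string.
        strings = v1.get(f"ontologyStringValues_{label}", "")
        if strings:
            record["string_ids"] = [s.strip() for s in strings.split(",") if s.strip()]
        # numeric_values is optional and omitted when empty.
        values = v1.get(f"ontologyNumericValues_{label}")
        if values:
            record["numeric_values"] = values
        endpoints.append(record)
    return endpoints
```

Keeping `label` on each record is what lets a query select endpoint A or B directly, in the same way the explicit axis tag works for position_metadata.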

The JH corpus surfaces 353 v1 subject_group documents whose property block is universally empty (`{}`) and whose base.name is also universally empty -- the class is a pure relational marker. Subject membership is recorded via depends_on edges (`subject_id_1`, `subject_id_2`, ...). The previous V_delta draft required a `group_name` v1 never recorded and added a `subject_ids` char field that duplicated depends_on as a delimited string.

Changes:

- `group_name` mustBeNonEmpty: true -> false. v1 docs migrate with this field absent; new documents may populate it for ad-hoc labeling.
- `description` unchanged (already optional).
- `subject_ids` field removed entirely. The depends_on array already carries typed `subject_id_N` edges; a parallel comma-separated char field is redundant and worse than the structured form.

No migrator change needed. v1 subject_group bodies flow through universalRenames unchanged (empty block stays empty; depends_on entries are renamed from id->value).

Projection on JH after this commit: 78688/78688 migrated. Remaining quarantine: only Dab stimulus_bath (1605).

Driven by the Dab corpus (1605 v1 stimulus_bath documents). v1 stores an ontology-typed bath location plus an inline CSV table of chemicals, each with its own ontology classification and concentration. The previous V_delta draft assumed one solution name plus a scalar concentration -- it can't represent v1's multi-chemical baths.

## New composite type: concentration

The other SI composites (duration / volume / mass / length / voltage / current / frequency) all share one canonical sub-field (meters, volts, ...) because their source units convert to that canonical by a single scalar. Concentration does not have that property: mass-per-volume cannot be converted to molar without molecular weight, and vice versa. Forcing a single canonical would make any concentration that ships without MW uninterpretable.

`concentration` therefore has multiple OPTIONAL canonical sub-fields, with the migrator populating whichever the source unit is computable into:

molar (double, opt): mol/L
grams_per_liter (double, opt): mass/volume
mass_fraction (double, opt): w/w (dimensionless 0-1)
volume_fraction (double, opt): v/v (dimensionless 0-1)
approximate (boolean)
source_unit (char): verbatim source unit text
source_value (double): verbatim source value

Added to the meta-schema type enum and described in the top-level meta-schema documentation alongside the existing SI composites. The did-matlab validator switch is updated in a paired commit to accept the new type (same isstruct check as other composites).

## stimulus_bath redesign

Replaces solution_name/concentration/concentration_units with:

super: [base, epochid] (epochid restored to match v1)
depends_on: [stimulus_element_id] (renamed from element_id)
fields:
  location (ontology_term, REQ): the bath itself
  mixture (structure, array-of-records):
    chemical (ontology_term, REQ)
    amount (concentration, opt)

The migrator parses the v1 CSV mixture_table (header row + chemicals), each row producing a record with chemical (node+name from v1 ontologyName/name) and amount (concentration composite from v1 value/unitName).

Projection after this commit (paired with did-matlab migrator):

PRED 14/14 migrated unchanged
20211116 1220/1220 migrated unchanged
B 12917/12917 migrated unchanged
JH 78688/78688 migrated unchanged
Dab 27561/27561 migrated (was 25956; +1605 stimulus_bath)

All five discovery corpora round-trip clean. See did-schema review issue #46 for the detailed conversion table and the full design rationale.
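
To make the "populate whichever sub-field the source unit is computable into" rule concrete, here is a small sketch. The function name `make_concentration`, the unit spellings in the two lookup tables, and the default of `approximate = False` are all assumptions for illustration; the actual conversion table is the one referenced in review issue #46, which is not reproduced here.

```python
# Hypothetical unit tables: spellings and coverage are assumptions.
MOLAR_SCALE = {"M": 1.0, "mM": 1e-3, "uM": 1e-6}          # to mol/L
MASS_PER_VOLUME_SCALE = {"g/L": 1.0, "mg/mL": 1.0, "mg/L": 1e-3}  # to g/L


def make_concentration(value: float, unit: str) -> dict:
    """Build a concentration composite from a v1 value/unit pair."""
    # The verbatim source is always kept.
    conc = {"source_value": value, "source_unit": unit, "approximate": False}
    if unit in MOLAR_SCALE:
        conc["molar"] = value * MOLAR_SCALE[unit]
    elif unit in MASS_PER_VOLUME_SCALE:
        conc["grams_per_liter"] = value * MASS_PER_VOLUME_SCALE[unit]
    # Unrecognised units populate no canonical sub-field at all.
    # No molar <-> mass conversion is attempted: that needs molecular weight.
    return conc
```

The design point from the commit shows up as the absence of any cross-conversion branch: a value given in mg/mL ends up with `grams_per_liter` only, and `molar` stays unset rather than being guessed.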

The Soph corpus surfaces a single v1 metadata_editor document whose content does not match the previous V_delta draft. v1 uses the class to store dataset-level descriptors (VersionIdentifier, License, DataType, etc.); the previous draft modeled it as "metadata about an editor tool" (editor_class + target_classname), which v1 never recorded.

Rewrites metadata_editor to match v1 exactly:

super: [base]
fields: metadata_structure (structure, optional)

The single open-shape `metadata_structure` field carries the arbitrary key/value pairs v1 stored; keys and inner shape are intentionally not constrained by V_delta since they vary by editor and dataset. The dropped `editor_class` and `target_classname` fields had no v1 source.

Projection on Soph after this commit: 101427/101427 migrated. All six discovery corpora round-trip clean.

Driven by the strict-validation audit (did-matlab commit aaf5529): every v1 tuningcurve_calc body in the discovery corpora ships two fields V_delta did not declare, so the data was landing in the property block as undeclared extras and the strict validator quarantined them with did2:validation:undeclaredField.

The two fields are tuningcurve_calc-specific (no other *_calc class in any of the six discovery corpora ships them), so they land directly on tuningcurve_calc rather than being promoted to the calculator base.

tuningcurve_calc.log (char, optional): Free-text log entry summarising the calculation, e.g. 'angle best value is 135.', 'sFrequency = 0.04'.
tuningcurve_calc.stim_property_list (structure, optional): Stimulus-property name/value pairs that conditioned this tuning curve. v1 shape preserved verbatim:
  names (string array, optional) - e.g. {'sFrequency'}
  values (matrix, optional) - corresponding scalar/array

Cleared corpus quarantines:

Soph 34606 tuningcurve_calc docs
20211116 84 tuningcurve_calc docs
total 34690 fewer undeclared-field errors

The next biggest cluster surfacing in the strict-mode report is the v1 `stimulus_tuningcurve` inheritance on tuningcurve_calc, which V_delta currently drops. Separate follow-up.
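
The failure mode in this commit is a field present in the v1 body but absent from the schema. A minimal sketch of such a check follows. It is not the did-matlab strict validator: the function name `find_undeclared_fields` and the exact message layout are assumptions, and only the `did2:validation:undeclaredField` identifier is taken from the text above.

```python
def find_undeclared_fields(class_name: str, block: dict, declared: set) -> list:
    """List strict-mode errors for block keys the schema does not declare."""
    return [
        f"did2:validation:undeclaredField {class_name}.{key}"
        for key in sorted(block)
        if key not in declared
    ]
```

With `declared` empty, a body carrying `log` and `stim_property_list` produces two errors; once both names are added to `declared`, the list is empty and the document would no longer quarantine for this reason.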

Driven by strict-validation audit (did-matlab aaf5529). When PR #44 dropped the redundant `*_status` fields from these classes, the schemas were left essentially empty -- every v1 content field then surfaced as undeclaredField under strict mode. Restored field declarations to match v1 verbatim:

syncrule_mapping:
  cost (double, opt): numeric mapping score
  mapping (matrix, opt): 2-element numeric vector
  epochnode_a (structure): {epoch_clock, epoch_id, epoch_session_id, epochprobemap, objectclass}
  epochnode_b (structure): same shape as epochnode_a
  (dropped: mapping_data -- was aspirational, never in v1)

epochfiles_ingested:
  epoch_id (char, opt): epoch identifier (e.g., 't00001')
  files (string, opt, list of file references; entries are !mustBeScalar): NDI URIs or absolute paths
  epochprobemap (char, opt): tab-separated probemap text

daqreader_mfdaq_epochdata_ingested:
  parameters (structure, opt): {sample_analog_segment, sample_digital_segment}

daqmetadatareader_epochdata_ingested: unchanged (v1 block is empty; strict validator already passes).

Projection delta on the discovery corpora (Python simulator):

B 1738 -> 4222 migrated (+2484 syncrule_mapping cleared)
Dab 10375 -> 12859 migrated (+2484 syncrule_mapping cleared)
Soph 36826 -> 37174 migrated (+348 syncrule_mapping cleared)
PRED, 20211116, JH: unchanged (these corpora either don't have these classes or have different residual issues).

Residual sub-issues for these classes (separate follow-ups):

- epochfiles_ingested.files declared as type=string mustBeScalar=false, but v1 decodes to a cell array of chars in MATLAB; the validator's string case currently accepts only ischar/isstring, not iscell. Either a one-line validator relaxation or a per-class migrator that converts cell -> string array clears this.
- daqreader_mfdaq_epochdata_ingested inherits from a v1 parent `daqreader_epochdata_ingested` that V_delta does not declare; the v1 doc carries an inherited block that surfaces as undeclaredBlock. Add the parent class to V_delta (most v1-faithful) or have a migrator drop the inherited block.

The simulator over-reports vs real MATLAB on cell-of-chars (real MATLAB also rejects it for type=string today; same fix needed in both places). The corpus CI run will confirm the authoritative numbers.

Driven by strict-validation audit. Three changes:

1. Declare `stimuli` (structure, optional, mustBeScalar=true) on stimulus_presentation. v1 carries it on every doc as `{parameters: {...stimulus-type-specific keys...}}` -- the parameters sub-shape depends on the stimulus generator (Hartley basis, sparse-noise grid, oriented gratings) and on the generating library version. V_delta declares `stimuli.parameters` but does not constrain inner keys.
2. Drop `num_trials` (integer field). v1 never ships it; was aspirational.
3. Restore v1 superclasses dropped by the previous V_delta draft: base + app + epochid. v1 stimulus_presentation documents carry app provenance metadata (NDI calculator name + version + interpreter) and an epochid linking to the trial epoch. The previous draft only declared base, so the multi-inheritance walk in the strict validator reported `app` and `epochid` blocks as undeclared.

`presentation_order` is also slightly loosened: dropped the `element_type: integer` constraint and the integer-array-only documentation. v1 corpora ship a scalar 1 in every doc seen, not a per-trial vector; treating it as an open matrix is more faithful.

Projection delta (Python simulator) on the four corpora that ship stimulus_presentation docs:

20211116 542 -> 553 migrated (+11)
B 4222 -> 5464 migrated (+1242)
Dab 12859 -> 14101 migrated (+1242)
Soph 37174 -> 37349 migrated (+175)

Total: 2670 stimulus_presentation docs now migrate cleanly under strict validation. No paired did-matlab change required.

Same pattern as stimulus_presentation: v1 ships 7 content fields plus an `app` superclass, V_delta drafted 4 different fields and declared only base as a superclass. Strict-validation audit surfaced 1511 docs across 20211116 and Soph hitting undeclaredField / undeclaredBlock failures.

v1 ships uniformly across all 1511 docs:

cluster_index	int
number_of_channels	int
number_of_samples_per_channel	int
mean_waveform	matrix (samples x channels)
waveform_sample_times	matrix
quality_label	char (e.g., 'good', 'multi')
quality_number	int (sorter-specific grade)
[superclasses: base, app]

V_delta now declares all of these and lists `app` as a superclass so v1's app block (provenance: ndi.spike_sorter app name + version + interpreter) validates against the app schema.

Dropped from the previous V_delta draft (no v1 source):

quality (replaced by quality_label, which is the v1 spelling)
num_spikes (derivable from waveform/spiketimes data downstream)
mean_firing_rate (derivable; was annotated as a `frequency` composite candidate but no v1 doc ships it)

Projection delta (Python simulator):

20211116 553 -> 574 migrated (+21 neuron_extracellular)
Soph 37349 -> 38839 migrated (+1490 neuron_extracellular)

Total 1511 neuron_extracellular docs now migrate cleanly under strict validation. No paired did-matlab change required.

Same pattern as stimulus_presentation / neuron_extracellular. Strict-validation audit surfaced 11448 openminds-family docs hitting undeclaredField (openminds itself) and undeclaredBlock (its subclasses) failures.

v1 design:

openminds (8 docs, JH)
  fields: openminds_type (URL IRI), matlab_type (MATLAB class name), openminds_id (instance IRI), fields (open-shape struct, openminds-type-specific)
openminds_subject (10401 docs, Dab + JH)
  block: {} -- empty; superclass=[base, openminds]
openminds_element (404 docs, Dab)
  block: {} -- empty; superclass=[base, openminds]
openminds_stimulus (635 docs, Dab)
  block: {} -- empty; superclass=[base, openminds, epochid]

Changes:

openminds -- field set rewritten to match v1 verbatim:
  + matlab_type, openminds_id, fields (struct, open)
  = openminds_type (was already there; doc string updated to reflect the full IRI v1 ships rather than the short `core.Person` style note)
  - openminds_data, openminds_version (no v1 source)
openminds_subject -- add `openminds` as superclass.
openminds_element -- add `openminds` as superclass.
openminds_stimulus -- add `openminds` and `epochid` as superclasses.

Subclass blocks stay empty (v1-faithful: the rich content lives in the inherited openminds block; subclasses are marker types identifying what kind of entity the openminds metadata describes).

Projection delta (Python simulator):

Dab 14101 -> 16445 migrated (+2344 openminds_*)
JH 62584 -> 71624 migrated (+9040 openminds + openminds_subject)
Soph 38839 -> 38903 migrated (+64 openminds_subject)

Total: +11448 openminds-family docs cleared (matches the audit count exactly).
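
The undeclaredBlock failures come from walking a class's superclass chain and finding a block in the document that no class in the chain accounts for. The sketch below illustrates that walk under simplifying assumptions: each class block is a top-level key of the document, other top-level keys (such as depends_on) are ignored, and `openminds` is assumed to inherit only from `base`. The names `SUPERCLASSES`, `declared_blocks`, and `undeclared_blocks` are hypothetical.

```python
# Superclass lists for the subclasses follow the commit text;
# the entry for openminds itself is an assumption.
SUPERCLASSES = {
    "base": [],
    "epochid": [],
    "openminds": ["base"],
    "openminds_stimulus": ["base", "openminds", "epochid"],
}


def declared_blocks(class_name: str, superclasses: dict) -> set:
    """Collect the class and every class reachable through its superclasses."""
    seen = set()
    stack = [class_name]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        stack.extend(superclasses.get(name, []))
    return seen


def undeclared_blocks(class_name: str, document: dict, superclasses: dict = SUPERCLASSES) -> list:
    """Return block names in the document that the inheritance walk does not cover."""
    allowed = declared_blocks(class_name, superclasses)
    return sorted(key for key in document if key not in allowed)
```

Passing a superclass table in which `openminds_stimulus` lists only `base` reproduces the pre-commit situation: the `openminds` and `epochid` blocks of a v1 document are reported as undeclared.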

…_calc inherits it

The biggest remaining strict-validation cluster: tuningcurve_calc documents in 20211116 (84) and Soph (34606) carry a v1 `stimulus_tuningcurve` superclass block that V_delta did not declare. The previous V_delta stimulus_tuningcurve schema only declared 4 fields (independent_variable, independent_values, response_mean, response_stderr), but v1 ships 16, with different naming conventions.

v1 stimulus_tuningcurve uniformly ships across 34690 docs:

independent_variable_label	string array
independent_variable_value	matrix (N_stim x N_var)
stimid	matrix integer
response_mean	matrix
response_stddev	matrix
response_stderr	matrix
response_units	string array
individual_responses_real	matrix (N_trial x N_stim)
individual_responses_imaginary	matrix (N_trial x N_stim)
stimulus_presentation_number	matrix integer
control_stimid	matrix integer
control_response_mean	matrix
control_response_stddev	matrix
control_response_stderr	matrix
control_individual_responses_real	matrix (N_trial x N_stim)
control_individual_responses_imaginary	matrix (N_trial x N_stim)

Changes:

stimulus_tuningcurve -- rewritten to declare all 16 v1 fields verbatim. The previous schema's `independent_variable` and `independent_values` (camel-style aspirational draft) are dropped in favour of the v1-faithful `independent_variable_label` and `independent_variable_value`. response_mean and response_stderr are preserved as-is (already matched). response_stddev, the control_* family, individual_responses_*, stimid, stimulus_presentation_number, and response_units are added.

tuningcurve_calc.superclasses += stimulus_tuningcurve. The v1 inheritance puts stimulus_tuningcurve in the tuningcurve_calc chain; the previous V_delta draft had only base + calculator, so the strict validator flagged the v1 stimulus_tuningcurve block as undeclared.

Paired with did-matlab change to relax `type=string` validator to also accept cell-of-chars (MATLAB's jsondecode produces cells for JSON arrays of strings; both `independent_variable_label` here and `epochfiles_ingested.files` need that).

Projection delta on the discovery corpora (Python simulator):

20211116 574 -> 658 migrated (+84 tuningcurve_calc)
B 5464 -> 7948 migrated (+2484 epochfiles_ingested)
Dab 16445 -> 20533 migrated (+4088 epochfiles_ingested + co-resident classes)
Soph 38903 -> 73858 migrated (+34955 tuningcurve_calc + epochfiles_ingested)
JH 71624 unchanged (no tuningcurve_calc / epochfiles_ingested affected here; JH's remaining quarantines are image_stack and a few small clusters)

Single largest schema cleanup so far: ~42K docs cleared across four corpora.

Strict-validation audit surfaced 7007 image_stack docs in JH all failing on undeclaredField image_stack.label. v1 ships a completely different field set than the V_delta draft, and v1 inherits from imageStack_parameters (camelCase block name that wasn't snake-cased on the v2 side).

v1 imageStack ships uniformly:

label (char): free-text caption
formatOntology (char): ontology CURIE classifying the format
[superclasses: base, imageStack_parameters]

v1 imageStack_parameters ships uniformly:

dimension_order (char): axis order, one char per axis
dimension_labels (char): comma-separated per-axis labels
dimension_size (matrix): per-axis pixel/sample counts
dimension_scale (matrix): per-axis physical scale
dimension_scale_units (char): comma-separated per-axis units
data_type (char): pixel data type ("uint16", etc.)
data_limits (matrix): [min, max] pixel range
timestamp (double): acquisition timestamp
clocktype (char): clock identifier

V_delta drafts had unrelated fields:

image_stack: num_frames / x_pixels / y_pixels / image_format
image_stack_parameters: z_step / z_units / x_pixel_size / y_pixel_size / pixel_units

Rewrites both to match v1 verbatim and adds image_stack_parameters as a superclass of image_stack so the v1 inheritance is honoured. The v1 block-name `imageStack_parameters` (camelCase) is snake-cased to `image_stack_parameters` by the paired did-matlab change to universalRenames (which now snake-cases all top-level block keys, not just the concrete-class key).

Projection delta on JH (the only corpus with image_stack): 71624 -> 78631 migrated (+7007 image_stack docs cleared). PRED stays 14/14; other corpora unchanged (no image_stack docs).
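
The block-key renaming can be illustrated with a short sketch. This is not the universalRenames implementation in did-matlab: the function names `snake_case` and `snake_case_block_keys` are hypothetical, and the rule used here (insert an underscore before an uppercase letter that follows a lowercase letter or digit, then lowercase everything) is an assumption that happens to cover the `imageStack_parameters` case named above.

```python
import re


def snake_case(name: str) -> str:
    """Convert a camelCase block name to snake_case."""
    # Zero-width match between a lowercase letter or digit and an uppercase letter.
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name).lower()


def snake_case_block_keys(document: dict) -> dict:
    """Rename only the top-level block keys; inner field names are left verbatim."""
    return {snake_case(key): value for key, value in document.items()}
```

Only top-level keys are touched, so an inner v1 field such as `formatOntology` keeps its original spelling while the block `imageStack_parameters` becomes `image_stack_parameters`.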

Two strict-validation clusters resolved together:

1. stimulus_response_scalar_parameters_basic field rewrite

v1 ships uniformly across 11440 docs:

temporalfreqfunc (char)
freq_response (integer 0/1)
prestimulus_time (matrix)
prestimulus_normalization (matrix)
isspike (integer 0/1)
spiketrain_dt (double)
[superclasses: base, stimulus_response_scalar_parameters]

V_delta drafted: response_window_start, response_window_end, freq_response

Rewritten to declare all 6 v1 fields verbatim; dropped the aspirational response_window_* fields. Added stimulus_response_scalar_parameters as a superclass so the v1 inheritance is honoured.

2. spatial_frequency_tuning / temporal_frequency_tuning rename

V_delta drafts both had a `fit_sgauss` field; v1 corpora uniformly use `fit_gausslog`. Same field, drafted under a shorter spelling that v1 never adopted. Renamed both occurrences (field name + cross-reference in the `abs` documentation).

Projection delta on the corpora that carry these classes:

20211116 658 -> 931 migrated (+273 stimulus_response_*_basic)
Soph 73858 -> 90629 migrated (+16771 spatial_/temporal_calc + stimulus_response_*_basic)

~17K docs cleared. JH stays at 78631 migrated with 57 still quarantined; B and Dab unchanged (no spatial/temporal calc docs in those).

Brings all 6 v1 corpora (PRED, 20211116, B, Dab, JH, Soph; 221,827 docs total) to 100% clean migration with zero quarantines.

Per-class changes (all to match v1-faithful shapes the corpora actually ship):

- stimulus_response: replace `response_type` with the two v1 fields `stimulator_epochid` and `element_epochid` (response_type already lives on the child stimulus_response_scalar block).
- stimulus_response_scalar: pre-existing class header; no field change.
- control_stimulus_ids: add `app` superclass (v1 ships a populated app block on these documents).
- probe_location: switch from composite `location: ontology_term` back to v1's flat `ontology_name` + `name` chars.
- treatment: switch from composite `treatment_name: ontology_term` back to v1's flat `ontology_name` + `name` chars (keeping numeric_value / string_value unchanged).
- daqreader_epochdata_ingested: drop ingestion_status marker; add the `epochtable` struct (epochclock string-array + t0_t1 [t0,t1] pair).
- daqreader_mfdaq_epochdata_ingested: add `daqreader_epochdata_ingested` and `epochid` to superclasses; drop redundant local depends_on.
- daqmetadatareader_epochdata_ingested: add `epochid` superclass.
- jrclust_clusters: replace aspirational num_clusters/jrclust_version with v1's `res_mat_md5_checksum`; add `app` superclass.
- dataset_remote: add `organization_id` field.
- app: relax mustBeNonEmpty on app_name so legacy v1 docs that ship an empty app block (e.g. some jrclust_clusters) still validate.

Validation: pytest 96/96 green.

claude added 16 commits May 15, 2026 18:41

stevevanhooser merged commit eab2c63 into main May 17, 2026
4 checks passed

stevevanhooser deleted the claude/did-matlab-v2-import-Rs8AX branch May 17, 2026 15:53

stevevanhooser merged 16 commits into main from claude/did-matlab-v2-import-Rs8AX
