V_delta: relax mustBeScalar/mustNotHaveNaN flags to match v1 corpus shapes by stevevanhooser · Pull Request #47 · Waltham-Data-Science/DID-schema

stevevanhooser · 2026-05-17T17:16:46Z

Two flags loosened to match what v1 corpora actually ship. Surfaced by closing a divergence between the Python conversion simulator (previously skipped mustBeScalar / mustNotHaveNaN entirely) and did-matlab's V_delta validator (which enforces both).

Changes

| schema | field | flag | from | to |
|---|---|---|---|---|
| `stimulus_presentation` | `stimuli` | `mustBeScalar` | true | false |
| `stimulus_tuningcurve` | `stimulus_presentation_number` | `mustNotHaveNaN` | true | false |

Why

stimulus_presentation.stimuli — v1 ships this as either a scalar struct (one stimulus configuration) or a struct array (one entry per presented stimulus, e.g. a 225-element array for a Hartley basis). Across the 6 v1 corpora the array form appears in 2,462 documents, distributed:

| corpus | array-form stimuli docs |
|---|---|
| 20211116 | 1 |
| B | 1,113 |
| Dab | 1,113 |
| Soph | 175 |

The previous mustBeScalar: true was an aspirational tightening that no real v1 corpus respects. Documentation updated to call out the array form explicitly.
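To make the two accepted shapes concrete, here is a minimal Python sketch of how a mustBeScalar check could treat `stimuli`. It assumes a scalar struct decodes from JSON as a dict and a struct array as a list of dicts. `check_stimuli` and its messages are hypothetical names for illustration, not the simulator's or did-matlab's actual code.

```python
from typing import Any, List


def check_stimuli(value: Any, must_be_scalar: bool) -> List[str]:
    """Return violation messages for a decoded `stimuli` value.

    Assumption: a scalar struct is a dict, a struct array is a list of dicts.
    """
    if isinstance(value, dict):
        # Scalar struct: valid whether or not mustBeScalar is set.
        return []
    if isinstance(value, list) and all(isinstance(v, dict) for v in value):
        # Struct array: only a violation when mustBeScalar is true.
        if must_be_scalar:
            return [f"mustBeScalar: got struct array of length {len(value)}"]
        return []
    return ["stimuli must be a struct or a struct array"]


# Hypothetical 225-element struct array, like the Hartley-basis case above.
hartley = [{"index": i} for i in range(225)]

strict = check_stimuli(hartley, must_be_scalar=True)
relaxed = check_stimuli(hartley, must_be_scalar=False)
```

With `must_be_scalar=True` the array form is reported as a violation; with the relaxed flag both shapes pass.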

stimulus_tuningcurve.stimulus_presentation_number — v1 uses NaN as the missing-trial sentinel in this (N_trials x N_stimuli) index matrix. Concrete example from Soph (4126933d3e1418d3_40b1714033d71086.json):

[..., [149, NaN, 148, NaN, NaN, 159, NaN]]

60 documents in Soph carry NaN sentinels. Documentation updated to state explicitly that NaN marks the absent-presentation slot.
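The effect of the flag on a matrix like the one above can be sketched in Python as follows. This is an illustration under assumptions, not the simulator's code: `iter_numbers`, `has_nan`, and `check_presentation_number` are hypothetical helpers, and the matrix is assumed to decode as nested lists. Python's `json` module accepts the bare `NaN` literal by default, which is what lets the row from the Soph example parse.

```python
import json
import math
from typing import Any, Iterator, List


def iter_numbers(value: Any) -> Iterator[float]:
    """Yield every numeric leaf of an arbitrarily nested list."""
    if isinstance(value, list):
        for item in value:
            yield from iter_numbers(item)
    elif isinstance(value, (int, float)) and not isinstance(value, bool):
        yield float(value)


def has_nan(value: Any) -> bool:
    """True if any numeric leaf is NaN."""
    return any(math.isnan(x) for x in iter_numbers(value))


def check_presentation_number(value: Any, must_not_have_nan: bool) -> List[str]:
    """Return violation messages for a decoded index matrix."""
    if must_not_have_nan and has_nan(value):
        return ["mustNotHaveNaN: matrix contains NaN"]
    return []


# The row quoted above; the leading rows are elided in the source.
row = json.loads("[149, NaN, 148, NaN, NaN, 159, NaN]")
nan_count = sum(math.isnan(x) for x in iter_numbers(row))

strict = check_presentation_number([row], must_not_have_nan=True)
relaxed = check_presentation_number([row], must_not_have_nan=False)
```

Under the old flag value this row is a violation; under the relaxed value the NaN sentinels pass through unchanged.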

Verification

Python conversion simulator was tightened in lockstep to enforce mustBeScalar, mustNotHaveNaN, and maxLength constraints (previously it only checked mustBeNonEmpty + enum). Under the strict simulator, all 6 v1 corpora migrate cleanly:

| corpus | total | migrated |
|---|---|---|
| PRED | 14 | 14 |
| 20211116 | 1,220 | 1,220 |
| B | 12,917 | 12,917 |
| Dab | 27,561 | 27,561 |
| JH | 78,688 | 78,688 |
| Soph | 101,427 | 101,427 |
| total | 221,827 | 221,827 |
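A per-corpus tally of this kind can be sketched in a few lines of Python. The `tally` helper and its input shape (one `(corpus, violations)` pair per document) are assumptions for illustration and do not describe the simulator's real interface. The second half only re-adds the per-corpus counts reported above to confirm the total.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def tally(results: Iterable[Tuple[str, List[str]]]) -> Dict[str, Tuple[int, int]]:
    """Map corpus name to (total docs, docs with no violations)."""
    total: Counter = Counter()
    migrated: Counter = Counter()
    for corpus, violations in results:
        total[corpus] += 1
        if not violations:
            # A document migrates cleanly only when it has no violations.
            migrated[corpus] += 1
    return {corpus: (total[corpus], migrated[corpus]) for corpus in total}


# Hypothetical mini-run: one document in PRED fails, everything else passes.
demo = tally([("PRED", []), ("PRED", ["mustBeScalar"]), ("B", [])])

# Per-corpus document counts as reported in the table above.
reported = {
    "PRED": 14,
    "20211116": 1220,
    "B": 12917,
    "Dab": 27561,
    "JH": 78688,
    "Soph": 101427,
}
reported_total = sum(reported.values())
```

A clean migration corresponds to every corpus having equal `total` and `migrated` counts.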

pytest tests/ 96/96 passing.

Pairs with

did-matlab V2 already has the matching changes. 2e5684c (string-type accepts empty array sentinel) addresses the related `response_units: []` reports that came from the same testCorpus20211116 discovery run, and PR #130 is already merged.

Out of scope

The simulator is a /tmp developer helper, not part of either repo's tree. Codifying it as a shared reference implementation between did-matlab and did-schema is a separate (worthwhile) follow-up.

https://claude.ai/code/session_011wtV7T1TKrxbGBeW71ebQn

Generated by Claude Code

…hapes

Two flags surfaced by closing the divergence between the Python simulator (which previously skipped mustBeScalar / mustNotHaveNaN) and did-matlab's V_delta validator (which enforces both).

- stimulus_presentation.stimuli: mustBeScalar true -> false. v1 ships this as either a scalar struct (one stimulus configuration) or a struct array (e.g. a 225-element array for a Hartley basis). Across the 6 v1 corpora, 2462 docs ship the array form. The scalar-only declaration was incorrect.
- stimulus_tuningcurve.stimulus_presentation_number: mustNotHaveNaN true -> false. v1 uses NaN as the missing-trial sentinel in this (N_trials x N_stimuli) index matrix. Soph has 60 such docs. Documentation now states that explicitly.

Verification: Python simulator (now also enforcing mustBeScalar + mustNotHaveNaN, mirroring cache.m) reports 0 quarantines across all 6 v1 corpora (PRED 14, 20211116 1220, B 12917, Dab 27561, JH 78688, Soph 101427 = 221827 docs total). pytest 96/96.

stevevanhooser merged commit c6ae52f into main May 17, 2026
4 checks passed

stevevanhooser deleted the claude/did-matlab-v2-import-Rs8AX branch May 17, 2026 18:07
