V_delta: relax mustBeScalar/mustNotHaveNaN flags to match v1 corpus shapes #47
Merged
…hapes

Two flags surfaced by closing the divergence between the Python simulator (which previously skipped mustBeScalar / mustNotHaveNaN) and did-matlab's V_delta validator (which enforces both).

- stimulus_presentation.stimuli: mustBeScalar true -> false. v1 ships this as either a scalar struct (one stimulus configuration) or a struct array (e.g. a 225-element array for a Hartley basis). Across the 6 v1 corpora, 2462 docs ship the array form. The scalar-only declaration was incorrect.
- stimulus_tuningcurve.stimulus_presentation_number: mustNotHaveNaN true -> false. v1 uses NaN as the missing-trial sentinel in this (N_trials x N_stimuli) index matrix. Soph has 60 such docs. Documentation now states that explicitly.

Verification: Python simulator (now also enforcing mustBeScalar + mustNotHaveNaN, mirroring cache.m) reports 0 quarantines across all 6 v1 corpora (PRED 14, 20211116 1220, B 12917, Dab 27561, JH 78688, Soph 101427 = 221827 docs total). pytest 96/96.
Two flags loosened to match what v1 corpora actually ship. Surfaced by closing a divergence between the Python conversion simulator (which previously skipped mustBeScalar/mustNotHaveNaN entirely) and did-matlab's V_delta validator (which enforces both).

Changes

| Class | Field | Flag | Change |
| --- | --- | --- | --- |
| stimulus_presentation | stimuli | mustBeScalar | true -> false |
| stimulus_tuningcurve | stimulus_presentation_number | mustNotHaveNaN | true -> false |

Why
stimulus_presentation.stimuli: v1 ships this as either a scalar struct (one stimulus configuration) or a struct array (one entry per presented stimulus, e.g. a 225-element array for a Hartley basis). Across the 6 v1 corpora the array form appears in 2,462 documents. The previous mustBeScalar: true was an aspirational tightening that no real v1 corpus respects. Documentation updated to call out the array form explicitly.

stimulus_tuningcurve.stimulus_presentation_number: v1 uses NaN as the missing-trial sentinel in this (N_trials x N_stimuli) index matrix. A concrete example is the Soph document 4126933d3e1418d3_40b1714033d71086.json. 60 documents in Soph carry NaN sentinels. Documentation updated to state explicitly that NaN marks the absent-presentation slot.
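
To illustrate the effect of the two flags, here is a minimal sketch of a per-field check. It is not the simulator or the did-matlab validator: the function `check_field`, its keyword arguments, the decoded shapes (struct array as a list of dicts, index matrix as nested lists of floats), and the sample values are all assumptions made for illustration.

```python
import math


def contains_nan(value):
    """Recursively look for a float NaN inside nested lists."""
    if isinstance(value, list):
        return any(contains_nan(item) for item in value)
    return isinstance(value, float) and math.isnan(value)


def check_field(value, *, must_be_scalar, must_not_have_nan):
    """Return the names of the flags that `value` violates (hypothetical helper)."""
    violations = []
    # Assumption: a non-scalar value is decoded as a list with more than one element.
    if must_be_scalar and isinstance(value, list) and len(value) != 1:
        violations.append("mustBeScalar")
    if must_not_have_nan and contains_nan(value):
        violations.append("mustNotHaveNaN")
    return violations


# Hypothetical stand-ins for the two v1 shapes described above.
stimuli_array = [{"parameters": {"index": i}} for i in range(225)]  # struct array
presentation_number = [[1.0, 2.0], [3.0, float("nan")]]  # NaN = missing trial

# Old flag values: both shapes are flagged.
strict_stimuli = check_field(
    stimuli_array, must_be_scalar=True, must_not_have_nan=False
)
strict_number = check_field(
    presentation_number, must_be_scalar=False, must_not_have_nan=True
)

# Relaxed flag values: both shapes pass.
relaxed_stimuli = check_field(
    stimuli_array, must_be_scalar=False, must_not_have_nan=False
)
relaxed_number = check_field(
    presentation_number, must_be_scalar=False, must_not_have_nan=False
)
```

Under these assumptions the strict settings flag the 225-element array and the NaN-bearing matrix, while the relaxed settings accept both. A scalar struct (a single dict) passes either way.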
Verification
The Python conversion simulator was tightened in lockstep to enforce mustBeScalar, mustNotHaveNaN, and maxLength constraints (previously it only checked mustBeNonEmpty + enum). Under the strict simulator, all 6 v1 corpora migrate cleanly. pytest tests/: 96/96 passing.

Pairs with
did-matlab V2 already has the matching changes: 2e5684c (string-type accepts empty array sentinel) addresses the related response_units: [] reports that came from the same testCorpus20211116 discovery run; PR #130 already merged.

Out of scope
The simulator is a /tmp developer helper, not part of either repo's tree. Codifying it as a shared reference implementation between did-matlab and did-schema is a separate (worthwhile) follow-up.

https://claude.ai/code/session_011wtV7T1TKrxbGBeW71ebQn
Generated by Claude Code