Add datasetSummary and compareDatasetSummary utilities #44

Merged: stevevanhooser merged 3 commits into main from claude/add-dataset-summary-utils-OHtcg on Mar 21, 2026
@stevevanhooser
Contributor

Summary

Add two new utility functions for dataset-level symmetry testing: datasetSummary creates a summary structure of an NDI Dataset object, and compareDatasetSummary compares two dataset summaries and returns a report of differences.

Key Changes

  • New module dataset_summary.py: Implements datasetSummary() function that creates a summary dict containing numSessions, references, sessionIds, and sessionSummaries for an ndi.dataset.Dataset object
  • New module compare_dataset_summary.py: Implements compareDatasetSummary() function that compares two dataset summaries and returns a list of human-readable difference strings, with support for excluding specific files and fields
  • Updated ndi/util/__init__.py: Exports the new datasetSummary and compareDatasetSummary functions
  • Updated MATLAB-Python bridge YAML: Added entries for both new functions documenting their MATLAB equivalents and signatures
  • Comprehensive test suite: Added test_dataset_summary.py with 13 test cases covering empty datasets, single/multiple sessions, field validation, and comparison logic including session ID matching
  • Refactored symmetry tests: Simplified dataset-level tests in test_build_dataset.py and test_download_ingested.py to use the new dataset-level utilities instead of manually iterating through sessions
  • Removed helper function: Deleted _dataset_summary() helper from test_build_dataset.py in favor of the public utility
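
To make the shape of the summary concrete, here is a minimal sketch of a dict with the four keys listed above. It is not the code from dataset_summary.py: the function name `dataset_summary_sketch` and the input format (a list of records with `id`, `reference`, and `summary` keys) are assumptions made for illustration, since the real function takes an ndi.dataset.Dataset object.

```python
from typing import Any, Dict, List


def dataset_summary_sketch(sessions: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Build a summary dict with the four keys named in the PR description.

    `sessions` is a hypothetical stand-in for the sessions of a dataset;
    each record is assumed to carry 'id', 'reference', and 'summary'.
    """
    return {
        "numSessions": len(sessions),
        "references": [s["reference"] for s in sessions],
        "sessionIds": [s["id"] for s in sessions],
        # Assumed to be a list parallel to sessionIds.
        "sessionSummaries": [s["summary"] for s in sessions],
    }


example = dataset_summary_sketch(
    [
        {"id": "s1", "reference": "exp_a", "summary": {"numDocs": 3}},
        {"id": "s2", "reference": "exp_b", "summary": {"numDocs": 5}},
    ]
)
empty = dataset_summary_sketch([])
```

An empty dataset yields `numSessions` of 0 and three empty lists under this sketch, which corresponds to the empty-dataset case the test suite is described as covering.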

Implementation Details

  • compareDatasetSummary delegates session-level comparisons to the existing compareSessionSummary function
  • Session summaries are matched by sessionId rather than by index, so the comparison does not depend on session order
  • Both functions support excludeFiles and excludeFields parameters, so selected files and fields can be left out of a comparison
  • The implementation is intended to mirror the MATLAB equivalents (ndi.util.datasetSummary and ndi.util.compareDatasetSummary), as documented in the bridge YAML
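
The matching-by-sessionId behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not the code from compare_dataset_summary.py: `compare_session_sketch` is a hypothetical stand-in for compareSessionSummary, the `exclude_fields` argument only approximates excludeFields, and `sessionIds` and `sessionSummaries` are assumed to be parallel lists.

```python
from typing import Any, Dict, Iterable, List

Summary = Dict[str, Any]


def compare_session_sketch(
    a: Summary, b: Summary, exclude_fields: Iterable[str] = ()
) -> List[str]:
    """Hypothetical stand-in for compareSessionSummary: field-by-field diff."""
    skip = set(exclude_fields)
    diffs = []
    for key in sorted((set(a) | set(b)) - skip):
        if a.get(key) != b.get(key):
            diffs.append(f"{key}: {a.get(key)!r} != {b.get(key)!r}")
    return diffs


def compare_dataset_sketch(
    a: Summary, b: Summary, exclude_fields: Iterable[str] = ()
) -> List[str]:
    """Return human-readable differences between two dataset summaries."""
    diffs = []
    if a["numSessions"] != b["numSessions"]:
        diffs.append(f"numSessions: {a['numSessions']} != {b['numSessions']}")
    # Index session summaries by ID so that list order does not matter.
    by_id_a = dict(zip(a["sessionIds"], a["sessionSummaries"]))
    by_id_b = dict(zip(b["sessionIds"], b["sessionSummaries"]))
    for sid in sorted(set(by_id_a) | set(by_id_b)):
        if sid not in by_id_b:
            diffs.append(f"session {sid}: missing from second summary")
        elif sid not in by_id_a:
            diffs.append(f"session {sid}: missing from first summary")
        else:
            # Delegate the per-session comparison, prefixing each difference.
            for d in compare_session_sketch(by_id_a[sid], by_id_b[sid], exclude_fields):
                diffs.append(f"session {sid}: {d}")
    return diffs


first = {"numSessions": 2, "sessionIds": ["s1", "s2"],
         "sessionSummaries": [{"numDocs": 3}, {"numDocs": 5}]}
reordered = {"numSessions": 2, "sessionIds": ["s2", "s1"],
             "sessionSummaries": [{"numDocs": 5}, {"numDocs": 3}]}
changed = {"numSessions": 2, "sessionIds": ["s1", "s2"],
           "sessionSummaries": [{"numDocs": 4}, {"numDocs": 5}]}
```

In this sketch, comparing `first` with `reordered` reports no differences because sessions are paired by ID, while comparing `first` with `changed` reports a single difference for session `s1` unless `numDocs` is passed in `exclude_fields`.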

https://claude.ai/code/session_01EctVW1VcbY2LzAdfZrGBB5

claude added 3 commits March 21, 2026 16:54
Extract dataset summary logic from symmetry tests into public
ndi.util functions (mirroring MATLAB's ndi.util.datasetSummary and
ndi.util.compareDatasetSummary). Simplify symmetry tests to use
the new utilities instead of inline summary building and comparison.
Add 14 unit tests covering both functions.

https://claude.ai/code/session_01EctVW1VcbY2LzAdfZrGBB5
@stevevanhooser stevevanhooser merged commit 4229edd into main Mar 21, 2026
5 checks passed
@stevevanhooser stevevanhooser deleted the claude/add-dataset-summary-utils-OHtcg branch March 21, 2026 17:44
2 participants