Skip to content

Fix: Prevent crash on cross-sectional data in get_correlation_matrix and update np.nan#391

Merged
contsili merged 2 commits intoamarquand:devfrom
divye-joshi:fix-compute-correlation-matrix
Mar 10, 2026
Merged

Fix: Prevent crash on cross-sectional data in get_correlation_matrix and update np.nan#391
contsili merged 2 commits intoamarquand:devfrom
divye-joshi:fix-compute-correlation-matrix

Conversation

@divye-joshi
Copy link

@divye-joshi divye-joshi commented Mar 10, 2026

Description

This PR fixes two crashes that occur when a user attempts to compute a correlation matrix or plot thrivelines using cross-sectional data (such as the fcon1000 dataset used in the tutorials).

Motivation and Context

Currently, if a user runs model.compute_correlation_matrix(train) on purely cross-sectional data, the algorithm attempts to merge subjects across different ages. Since there are no overlapping subject_ids, it creates an empty array. This eventually gets passed to scikit-learn's LinearRegression.fit(), which triggers a cryptic crash: ValueError: Found array with 0 sample(s).

Additionally, the code used np.NaN, which was permanently removed in NumPy 2.0, causing an immediate AttributeError on newer Python environments.

Changes Made

  • Added a boundary check: Inserted a check at the beginning of get_correlation_matrix in pcntoolkit/math_functions/thrive.py. If the data lacks longitudinal measurements (no duplicated subject_ids), it now raises a clear, descriptive ValueError explaining that longitudinal data is required.
  • NumPy 2.0 Compatibility: Replaced the outdated np.NaN with np.nan in thrive.py.
image

Fix _get_correlation_matrix and add a check to it
Copy link
Collaborator

@contsili contsili left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @divye-joshi !

@likeajumprope is also working on a new updated version of how thrivelines are computed. So I would suggest to not spent more time investigating the current implementation as it has some bugs/wrong implementations

@contsili contsili merged commit f715ad6 into amarquand:dev Mar 10, 2026
2 checks passed
@divye-joshi
Copy link
Author

@contsili No problem! Thanks for letting me know about this.

I am trying to understand normative velocity modeling, so I experimented with longitudinal data and encountered some bugs in thrivelines computation and plotting too. It's reassuring that it's been worked on. Do let me know if i can be of any help.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants