Fix: Prevent crash on cross-sectional data in get_correlation_matrix and update np.nan #391
Merged
contsili merged 2 commits into amarquand:dev on Mar 10, 2026
Fix _get_correlation_matrix and add a check to it
contsili (Collaborator) approved these changes on Mar 10, 2026 and left a comment:
Thanks, @divye-joshi!
@likeajumprope is also working on an updated version of how thrivelines are computed, so I would suggest not spending more time investigating the current implementation, as it has some bugs and incorrect implementations.
Author
@contsili No problem! Thanks for letting me know about this. I am trying to understand normative velocity modeling, so I experimented with longitudinal data and encountered some bugs in thrivelines computation and plotting too. It's reassuring that it's being worked on. Do let me know if I can be of any help.
Description
This PR fixes two crashes that occur when a user attempts to compute a correlation matrix or plot thrivelines using cross-sectional data (such as the fcon1000 dataset used in the tutorials).
Motivation and Context
Currently, if a user runs model.compute_correlation_matrix(train) on purely cross-sectional data, the algorithm attempts to merge subjects across different ages. Since there are no overlapping subject_ids, it creates an empty array. This eventually gets passed to scikit-learn's LinearRegression.fit(), which triggers a cryptic crash: ValueError: Found array with 0 sample(s).
Additionally, the code used np.NaN, which was permanently removed in NumPy 2.0, causing an immediate AttributeError in environments with NumPy 2.0 or later.
Changes Made
- Added a check to get_correlation_matrix in pcntoolkit/math_functions/thrive.py. If the data lacks longitudinal measurements (no duplicated subject_ids), it now raises a clear, descriptive ValueError explaining that longitudinal data is required.
- Replaced np.NaN with np.nan in thrive.py.
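
For readers who want to see the shape of such a guard, here is a minimal sketch of the kind of check described above. It is not the code from this PR: the helper name require_longitudinal, the DataFrame layout, and the column name subject_ids are assumptions made for illustration only, and the actual check in thrive.py may be structured differently.

```python
import numpy as np
import pandas as pd


def require_longitudinal(df: pd.DataFrame, id_col: str = "subject_ids") -> None:
    """Hypothetical helper: raise if no subject has more than one row.

    The column name is an assumption; adapt it to the real data layout.
    """
    # duplicated() marks every repeat of an ID after its first occurrence,
    # so any() is True only when at least one subject was measured twice.
    if not df[id_col].duplicated().any():
        raise ValueError(
            "Longitudinal data is required: no subject in "
            f"'{id_col}' has more than one measurement."
        )


# Cross-sectional: each subject appears once, so the check raises.
cross_sectional = pd.DataFrame(
    {"subject_ids": ["s1", "s2", "s3"], "age": [20.0, 31.0, 45.0]}
)

# Longitudinal: s1 is measured twice, so the check passes silently.
longitudinal = pd.DataFrame(
    {"subject_ids": ["s1", "s1", "s2"], "age": [20.0, 22.0, 31.0]}
)

require_longitudinal(longitudinal)

try:
    require_longitudinal(cross_sectional)
    message = ""
except ValueError as err:
    message = str(err)

# np.nan is the lowercase spelling available in both NumPy 1.x and 2.x.
missing = np.nan
```

Failing early like this surfaces the real cause (no repeated subjects) instead of letting an empty array reach LinearRegression.fit() and fail with the less informative "Found array with 0 sample(s)" error.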