Pfm Update with human subjects#392
Conversation
|
@arkky68 there are some comments here. |
There was a problem hiding this comment.
It is great that you've pulled this out of icaDIm and made it its own module. Can you put it in global/matlab/icaDim instead and then edit icaDim.m to call it. That way, if we ever want to tweak the Wishart filtering in the future, we won't need to make the edits in two places. The PFM scripts will still have access to it there.
There was a problem hiding this comment.
I'm not sure I like this. We now have two code paths that do the same thing and would have to be maintained in parallel. I would prefer we didn't duplicate code like that. Why can't we use the existing icaDim for this?
There was a problem hiding this comment.
If WishartFilter.m is put in global/matlab/icaDim/ and icaDim.m is edited to call it instead of an internal subfunction, then there will still only be one copy of the code and one wouldn't have to redo the dimensionality estimation if you had already assessed the dimensionality earlier. An alternative to functionalizing the Wishart filtering would be to add a dimOveride argument to icaDim.m, but I think separating the dimensionally estimation from the Wishart filtering (while maintaining a single copy) is cleaner.
There was a problem hiding this comment.
I would restructure so there's one shared WishartFilter.m in global/matlab/icaDim/ with the core filtering math. icaDim.m will call it instead of having the math inline. The PFM wrapper will be renamed to ApplyWishartFilterProfumo.m and also call WishartFilter.m for the usage. Then extract FitWishart, Smooth, and lpdist from the bottom of icaDim.m into standalone .m files in the same directory, so there's a single copy of everything shared by both code paths.
There was a problem hiding this comment.
I'm a bit confused here. We always want to estimate the Dim for Wishart Filtering. That is a per dataset value that is not predictable in advance. There are ways to short circuit icaDim (that are used for legacy data where it doesn't work as well), but I don't think we want to do that here.
There was a problem hiding this comment.
Ah, the calcDim from ICAFIX is for whole concat dataset, whereas we need the Wishart filter each run separately for PFMs, so the previously calculated Dim isn't usable here. That makes sense. You're right just need a wrapper around icaDim, not a separate Wishart filtering function.
There was a problem hiding this comment.
Should we be Wishart filtering the per-subject concat timeseries and then splitting back into runs?
There was a problem hiding this comment.
Actually I think we should match how sICA+FIX was run. That will produce the "stickiest" dimensionality estimation.
|
I am wondering if we should separate PROFUMO postprocessing into its own pipeline step rather than keeping it inside RunPROFUMO. I've run into situations where postprocessing needs to be rerun separately. |
(cherry picked from commit af161ed)
(cherry picked from commit f4f0e7b)
Co-authored-by: Burke Rosen <burke.r@wustl.edu> (cherry picked from commit 16152f6)
Co-authored-by: Burke Rosen <burke.r@wustl.edu> (cherry picked from commit b1b7120)
(cherry picked from commit 06673e3)
(cherry picked from commit 28d1c02)
(cherry picked from commit 7842a4e)
(cherry picked from commit 82cf392)
(cherry picked from commit ba11f0c)
Removed LowDims variable from RSN regression settings. (cherry picked from commit caeb87c)
(cherry picked from commit 7754262)
Removed redundant cleanup of temporary Wishart filtered files from the PROFUMO step. (cherry picked from commit fc2629d)
Assuming it execute quickly, I think it makes sense to move it from the RunPROFUMO to postPROFUMO which is very short. In the situations where you needed to run postprocessing separately, you also needed to rerun the postPROFUMO step, right? |
| log_Err_Abort "Reference image must be specified with --ref-image" | ||
| fi | ||
|
|
||
| ProfumoConfigToUse="${ProfumoConfig}" |
There was a problem hiding this comment.
Is it going to be necessary to create an entire copy of the fMRI timeseries on the file system to use this functionality?
There was a problem hiding this comment.
I don't think there is a way to get around it, unless you count a RAMdisk. The real solution is to implement Wishart filtering In PROFUMO itself. Since PROFUMO already uses python for a function called estimate_smoothness.py, I would predict that it would be reasonably straightforward for Rezvan to incorporate it if we translate WishartFilter.m into python.
There was a problem hiding this comment.
I guess the question is if PROFUMO reads the data multiple times or not. If it doesn't, perhaps the Wishart Filtering could be "just in time" as it is for other code.
There was a problem hiding this comment.
I think saving an at least temporary copy to disk is always going to be needed if the Wishart filtering is done in matlab and the subsequent step like melodic or profumo is not.
… use RegString, fix size(runTCS,1))
| for i = 1:length(inList) | ||
| cii = ciftiopen(strtrim(inList{i}), 'wb_command'); | ||
| tpCounts(i) = size(cii.cdata, 2); | ||
| allData = [allData cii.cdata]; |
There was a problem hiding this comment.
Concatenating fMRI data is typically more complicated than this, including requiring demeaning, potentially detrending (if not already done) and variance normalizing. Would it be better to use the already concatenated data or stick with however sICA+FIX was run (use concatenated data if MR+FIX and not concatenated data otherwise)?
There was a problem hiding this comment.
when a concat file exists (via --concat-name), we now use that directly instead of concatenating in MATLAB. When no concat file is available, each run is demeaned before concatenation and the mean is added back after filtering.
There was a problem hiding this comment.
Actually, there should be concatenated tICA cleaned fMRI data that we can use even for single run sICA+FIX runs. I wouldn't support unconcatenated data unless you are going to do it exactly how it is done in sICA+FIX.
| ProfumoConfigToUse="${ProfumoConfig}" | ||
| if [[ "$NumWishart" -gt 0 ]] | ||
| then | ||
| WFDir="${PFMFolder}/WishartFilter_WF${NumWishart}" |
There was a problem hiding this comment.
Interested in @coalsont's thoughts on this. We could make one copy of the WF data, but creating and deleting it multiple times might be even worse that making one copy and keeping it. If we make one copy, we probably want to put it in the HCP Pipelines main folder structure for each individual.
There was a problem hiding this comment.
I'm not sure I have the context for this. If files are generated, used, and immediately deleted once per filename, that is the kindest to snapshots (assuming you don't otherwise want the files, and assuming they have to be written to a snapshotted filesystem). I'm not sure in what context (other than testing) you'd repeatedly run the PFM pipeline with the same wishart filtering settings (there is an option that can be specified to keep the wishart filtered file in that case), which is the only way you'd get the same folder name here (and therefore same absolute filename).
If the subject group is large, since none of them are deleted until after everything is done (probably unavoidable), the length of runtime does increase the chance that they will be "caught" by a daily (or longer) snapshot. But, as long as we don't want to keep them for other purposes, and don't regenerate and redelete them again, that is still preferable to letting them sit there through multiple long-term snapshots.
If we ever wanted to keep these subject-specific files long-term (not for testing purposes), we would normally put them somewhere in the subjects' normal folder.
There was a problem hiding this comment.
Group PFMs probably requires all dtseries on disk while it is running over several days.
There was a problem hiding this comment.
The "--keep-wishart-files" argument is already intended for any cases where specifically PFM may be run again with the same wishart filtering, though it does require the user to notice it and realize its relevance to their plans and whether their storage has automatic snapshots (but it isn't clear to me whether other users will typically run PFM more than once).
If the context is "users may want wishart filtered data for other purposes (and possibly have already generated it)", that is at a broader scope than the PFM pipeline itself, and may suggest a global script and naming convention for just computing WF results, which would then be used by the PFM pipeline, instead of having its own internal WF-computing code and conventions for temporary files.
| # Reference image for PROFUMO | ||
| RefImage="${StudyFolder}/$GroupAverageName/MNINonLinear/Results/Mac25Rhesus_v5_BOLD_REST_CONCAT_MIGP/Mac25Rhesus_v5_BOLD_REST_CONCAT_MIGP_Atlas_hppd2_clean_meanvn.dscalar.nii" | ||
| # set the PFM dimensionality | ||
| PFMdim="99" |
There was a problem hiding this comment.
Why did you choose this number?
There was a problem hiding this comment.
This is the dimensionality I used for HCP-YA, so i directly used it in the example file, or do you have a suggestion on this part, or just leave it blank?
There was a problem hiding this comment.
For which type of data? We were using d=76 for 3T resting state.
…gString, remove duplicates
1 update the example script ( with human subject )
2 update with $ConcatName != "" situation
3 add fmri-resolution parameter