Skip to content

Pfm Update with human subjects#392

Open
arkky68 wants to merge 16 commits into
Washington-University:PFMsfrom
arkky68:pfm-clean
Open

Pfm Update with human subjects#392
arkky68 wants to merge 16 commits into
Washington-University:PFMsfrom
arkky68:pfm-clean

Conversation

@arkky68

@arkky68 arkky68 commented Apr 10, 2026

Copy link
Copy Markdown

1 update the example script ( with human subject )
2 update with $ConcatName != "" situation
3 add fmri-resolution parameter

Comment thread Examples/Scripts/RunPFMpipelineFullRun.sh Outdated
Comment thread Examples/Scripts/RunPFMpipelineFullRun.sh Outdated
Comment thread PFM/scripts/PostPROFUMO.m Outdated
Comment thread PFM/PFMPipeline.sh Outdated
Comment thread PFM/PFMPipeline.sh Outdated
Comment thread PFM/PFMPipeline.sh Outdated
Comment thread Examples/Scripts/RunPFMpipelineFullRun.sh Outdated
@bqrosen bqrosen mentioned this pull request Apr 22, 2026
@glasserm

Copy link
Copy Markdown
Contributor

@arkky68 there are some comments here.

Comment thread PFM/scripts/WishartFilter.m Outdated

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New Wisharfilter.m creates

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is great that you've pulled this out of icaDIm and made it its own module. Can you put it in global/matlab/icaDim instead and then edit icaDim.m to call it. That way, if we ever want to tweak the Wishart filtering in the future, we won't need to make the edits in two places. The PFM scripts will still have access to it there.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I like this. We now have two code paths that do the same thing and would have to be maintained in parallel. I would prefer we didn't duplicate code like that. Why can't we use the existing icaDim for this?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If WishartFilter.m is put in global/matlab/icaDim/ and icaDim.m is edited to call it instead of an internal subfunction, then there will still only be one copy of the code and one wouldn't have to redo the dimensionality estimation if you had already assessed the dimensionality earlier. An alternative to functionalizing the Wishart filtering would be to add a dimOveride argument to icaDim.m, but I think separating the dimensionally estimation from the Wishart filtering (while maintaining a single copy) is cleaner.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would restructure so there's one shared WishartFilter.m in global/matlab/icaDim/ with the core filtering math. icaDim.m will call it instead of having the math inline. The PFM wrapper will be renamed to ApplyWishartFilterProfumo.m and also call WishartFilter.m for the usage. Then extract FitWishart, Smooth, and lpdist from the bottom of icaDim.m into standalone .m files in the same directory, so there's a single copy of everything shared by both code paths.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit confused here. We always want to estimate the Dim for Wishart Filtering. That is a per dataset value that is not predictable in advance. There are ways to short circuit icaDim (that are used for legacy data where it doesn't work as well), but I don't think we want to do that here.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, the calcDim from ICAFIX is for whole concat dataset, whereas we need the Wishart filter each run separately for PFMs, so the previously calculated Dim isn't usable here. That makes sense. You're right just need a wrapper around icaDim, not a separate Wishart filtering function.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we be Wishart filtering the per-subject concat timeseries and then splitting back into runs?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I think we should match how sICA+FIX was run. That will produce the "stickiest" dimensionality estimation.

@arkky68

arkky68 commented May 8, 2026

Copy link
Copy Markdown
Author

I am wondering if we should separate PROFUMO postprocessing into its own pipeline step rather than keeping it inside RunPROFUMO. I've run into situations where postprocessing needs to be rerun separately.

arkky68 and others added 12 commits May 8, 2026 10:23
(cherry picked from commit af161ed)
Co-authored-by: Burke Rosen <burke.r@wustl.edu>
(cherry picked from commit 16152f6)
Co-authored-by: Burke Rosen <burke.r@wustl.edu>
(cherry picked from commit b1b7120)
(cherry picked from commit 06673e3)
Removed LowDims variable from RSN regression settings.

(cherry picked from commit caeb87c)
(cherry picked from commit 7754262)
Removed redundant cleanup of temporary Wishart filtered files from the PROFUMO step.

(cherry picked from commit fc2629d)
Comment thread Examples/Scripts/RunPFMpipelineFullRun.sh Outdated
Comment thread PFM/scripts/WishartFilter.sh Outdated
Comment thread PFM/scripts/WishartFilter.sh Outdated
Comment thread Examples/Scripts/RunPFMpipelineFullRun.sh
@bqrosen

bqrosen commented May 8, 2026

Copy link
Copy Markdown
Contributor

I am wondering if we should separate PROFUMO postprocessing into its own pipeline step rather than keeping it inside RunPROFUMO. I've run into situations where postprocessing needs to be rerun separately.

Assuming it execute quickly, I think it makes sense to move it from the RunPROFUMO to postPROFUMO which is very short. In the situations where you needed to run postprocessing separately, you also needed to rerun the postPROFUMO step, right?

Comment thread PFM/scripts/PostPROFUMO.m Outdated
Comment thread Examples/Scripts/RunPFMpipelineFullRun.sh Outdated
Comment thread PFM/scripts/WishartFilter.sh
Comment thread PFM/PFMPipeline.sh
log_Err_Abort "Reference image must be specified with --ref-image"
fi

ProfumoConfigToUse="${ProfumoConfig}"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it going to be necessary to create an entire copy of the fMRI timeseries on the file system to use this functionality?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there is a way to get around it, unless you count a RAMdisk. The real solution is to implement Wishart filtering In PROFUMO itself. Since PROFUMO already uses python for a function called estimate_smoothness.py, I would predict that it would be reasonably straightforward for Rezvan to incorporate it if we translate WishartFilter.m into python.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the question is if PROFUMO reads the data multiple times or not. If it doesn't, perhaps the Wishart Filtering could be "just in time" as it is for other code.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think saving an at least temporary copy to disk is always going to be needed if the Wishart filtering is done in matlab and the subsequent step like melodic or profumo is not.

Comment thread PFM/scripts/ApplyWFProfumo.m Outdated
for i = 1:length(inList)
cii = ciftiopen(strtrim(inList{i}), 'wb_command');
tpCounts(i) = size(cii.cdata, 2);
allData = [allData cii.cdata];

@glasserm glasserm May 26, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concatenating fMRI data is typically more complicated than this, including requiring demeaning, potentially detrending (if not already done) and variance normalizing. Would it be better to use the already concatenated data or stick with however sICA+FIX was run (use concatenated data if MR+FIX and not concatenated data otherwise)?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when a concat file exists (via --concat-name), we now use that directly instead of concatenating in MATLAB. When no concat file is available, each run is demeaned before concatenation and the mean is added back after filtering.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, there should be concatenated tICA cleaned fMRI data that we can use even for single run sICA+FIX runs. I wouldn't support unconcatenated data unless you are going to do it exactly how it is done in sICA+FIX.

Comment thread PFM/PFMPipeline.sh
ProfumoConfigToUse="${ProfumoConfig}"
if [[ "$NumWishart" -gt 0 ]]
then
WFDir="${PFMFolder}/WishartFilter_WF${NumWishart}"

@glasserm glasserm May 26, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interested in @coalsont's thoughts on this. We could make one copy of the WF data, but creating and deleting it multiple times might be even worse that making one copy and keeping it. If we make one copy, we probably want to put it in the HCP Pipelines main folder structure for each individual.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I have the context for this. If files are generated, used, and immediately deleted once per filename, that is the kindest to snapshots (assuming you don't otherwise want the files, and assuming they have to be written to a snapshotted filesystem). I'm not sure in what context (other than testing) you'd repeatedly run the PFM pipeline with the same wishart filtering settings (there is an option that can be specified to keep the wishart filtered file in that case), which is the only way you'd get the same folder name here (and therefore same absolute filename).

If the subject group is large, since none of them are deleted until after everything is done (probably unavoidable), the length of runtime does increase the chance that they will be "caught" by a daily (or longer) snapshot. But, as long as we don't want to keep them for other purposes, and don't regenerate and redelete them again, that is still preferable to letting them sit there through multiple long-term snapshots.

If we ever wanted to keep these subject-specific files long-term (not for testing purposes), we would normally put them somewhere in the subjects' normal folder.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Group PFMs probably requires all dtseries on disk while it is running over several days.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "--keep-wishart-files" argument is already intended for any cases where specifically PFM may be run again with the same wishart filtering, though it does require the user to notice it and realize its relevance to their plans and whether their storage has automatic snapshots (but it isn't clear to me whether other users will typically run PFM more than once).

If the context is "users may want wishart filtered data for other purposes (and possibly have already generated it)", that is at a broader scope than the PFM pipeline itself, and may suggest a global script and naming convention for just computing WF results, which would then be used by the PFM pipeline, instead of having its own internal WF-computing code and conventions for temporary files.

# Reference image for PROFUMO
RefImage="${StudyFolder}/$GroupAverageName/MNINonLinear/Results/Mac25Rhesus_v5_BOLD_REST_CONCAT_MIGP/Mac25Rhesus_v5_BOLD_REST_CONCAT_MIGP_Atlas_hppd2_clean_meanvn.dscalar.nii"
# set the PFM dimensionality
PFMdim="99"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you choose this number?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the dimensionality I used for HCP-YA, so i directly used it in the example file, or do you have a suggestion on this part, or just leave it blank?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For which type of data? We were using d=76 for 3T resting state.

Comment thread PFM/PFMPipeline.sh
Comment thread Examples/Scripts/RunPFMpipelineFullRun.sh Outdated
Comment thread PFM/PFMPipeline.sh Outdated
Comment thread Examples/Scripts/RunPFMpipelineFullRun.sh Outdated
@arkky68 arkky68 requested a review from bqrosen June 8, 2026 12:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants