MRI-Lab-Graz/ModelArray

🧠 ModelArray Processing Pipeline

Automated voxelwise statistical analysis using ModelArray, optimized for neuroimaging studies. Runs inside a Singularity container.


🚀 Quick Start

# Run a complete modality analysis (e.g., all NODDI or MAPMRI scalars)
./run_analysis.sh configs/pipeline_noddi_gam_2mm.json

# Run in background with nohup
./run_analysis.sh --nohup configs/pipeline_mapmri_wholebrain_2group_gam_2mm.json

# Dry-run to preview commands without executing
./run_analysis.sh --dry-run configs/pipeline_noddi_gam_2mm.json

# Run only one scalar from a modality config
./run_analysis.sh --scalar icvf_dwimap configs/pipeline_noddi_gam_2mm.json

# Skip the ML step (statistics only)
./run_analysis.sh --skip-ml configs/pipeline_noddi_gam_2mm.json

# Skip registration (data already in MNI space)
./run_analysis.sh --skip-reg configs/pipeline_noddi_gam_2mm.json

# Force regeneration of all intermediate files
./run_analysis.sh --force configs/pipeline_noddi_gam_2mm.json

# Combine options
./run_analysis.sh --nohup --skip-ml --scalar od_dwimap configs/pipeline_noddi_gam_2mm.json

Real-World Example Output

$ ./run_analysis.sh --dry-run configs/pipeline_noddi_gam_2mm.json

INFO: Detected modality-level config: noddi (3 scalars)
[15:00:00] ──────────────────────────────────────────────────────────────────────────
[15:00:00] ModelArray Analysis Pipeline
[15:00:00] Config: pipeline_noddi_gam_2mm
[15:00:00] Type: modality
[15:00:00] ──────────────────────────────────────────────────────────────────────────
[15:00:02] Using scalars: icvf_dwimap od_dwimap isovf_dwimap
[15:00:02] Scalar 1/3: icvf_dwimap
[15:00:02] Generated config: .../configs/generated/icvf_dwimap.json
[15:00:02] bash .../ModelArray/_run_scalar_pipeline.sh --dry-run ...
[15:00:02] Scalar 1/3 complete: icvf_dwimap in 00:00:00 | overall 33% in 00:00:00
...
[15:00:03] All scalar pipelines completed.

📚 Table of Contents

  1. Quick Start
  2. Pipeline Overview
  3. Config Types
  4. run_analysis.sh Options
  5. Individual Scripts
  6. Requirements

🔄 Pipeline Overview

run_analysis.sh (unified entry point)
         │
         ├── Modality config? ──► _run_modality_batch.sh (loops scalars)
         │                                │
         │                                ▼ (per scalar)
         └── Scalar config? ────► _run_scalar_pipeline.sh
                                          │
         ┌────────────────────────────────┘
         │
         ▼
    A. 0_register_acpc_to_mni.sh  (optional)
         │
         ▼
    B. Group mask preparation
         │
         ▼
    C. 1_generate_cohort*.sh
         │
         ▼
    D. 2_run_convoxel.sh  →  cohort_*.h5
         │
         ▼
    E. 3_run_model.sh  →  NIfTI stats + CSV summaries
         │
         ▼
    F. 4_run_ml.py  →  ML metrics + predictions (optional)
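run_analysis.sh first decides which branch to take based on the config type. A minimal Python sketch of that dispatch logic (illustrative only: the actual script uses jq on the host, and the exact key it inspects may differ; here we assume a top-level "modality" key marks a modality-level config, matching the two example layouts below):

```python
import json

def detect_config_type(path: str) -> str:
    """Classify a pipeline config as 'modality' (batch over scalars)
    or 'scalar' (single run through the full pipeline).

    Hypothetical heuristic: a top-level "modality" key marks a
    modality-level config.
    """
    with open(path) as f:
        cfg = json.load(f)
    return "modality" if "modality" in cfg else "scalar"
```

A modality config would then be routed to _run_modality_batch.sh, anything else to _run_scalar_pipeline.sh.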

📋 Config Types

Modality Config (batch mode)

Processes multiple scalars for a modality (NODDI, MAPMRI, etc.):

{
  "dataset": { "bids_dir": "...", "output_dir": "..." },
  "modality": { "name": "noddi", "scalars": ["icvf_dwimap", "od_dwimap"] },
  "statistics": { "model_type": "gam", "formula_template": "..." }
}

Scalar Config (single mode)

Processes one scalar through the full pipeline:

{
  "data_dir": "...",
  "csv_file": "cohort_icvf_dwimap.csv",
  "h5_file": "cohort_icvf_dwimap.h5",
  "formula": "icvf_dwimap ~ s(delta_age) + group + sex"
}

βš™οΈ run_analysis.sh Options

Option          Description
--dry-run       Print commands without executing
--nohup         Run detached in background (logs to output_dir or /tmp)
--scalar NAME   Run only this scalar (modality configs only)
--skip-reg      Skip the registration step
--skip-ml       Skip the machine-learning step
--force         Regenerate all intermediate files
-h, --help      Show help message

📜 Individual Scripts

1_generate_cohort.sh / 1_generate_cohort_longitudinal.sh

Creates a cohort CSV from participant list, images, and masks.

./1_generate_cohort_longitudinal.sh -p participants.tsv -d scalar_dir -m mask_dir -o output_dir

2_run_convoxel.sh

Extracts voxel data into HDF5 format.

./2_run_convoxel.sh -c cohort_icvf.csv -g group_mask.nii.gz

3_run_model.sh

Runs GAM/LM model fitting from a JSON config.

./3_run_model.sh config.json

The config maps directly to the underlying ModelArray call. num_subj_lthr_abs and num_subj_lthr_rel correspond to the voxelwise subject-threshold behavior documented in the ModelArray reference. An optional element_subset JSON array is also supported for smoke tests or chunked runs.
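A hypothetical config illustrating these fields (the file paths and formula follow the scalar config example above; the threshold and subset values are placeholders, not recommendations):

```json
{
  "csv_file": "cohort_icvf_dwimap.csv",
  "h5_file": "cohort_icvf_dwimap.h5",
  "formula": "icvf_dwimap ~ s(delta_age) + group + sex",
  "num_subj_lthr_abs": 10,
  "num_subj_lthr_rel": 0.2,
  "element_subset": [0, 1, 2]
}
```

With element_subset present, only the listed voxel indices are fitted, which keeps smoke tests fast.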

4_run_ml.py

Runs an optional pattern-recognition stage directly on the existing ModelArray HDF5 matrix (scalars/<scalar>/values) plus cohort labels from CSV.

Supported tasks:

  • classification
  • regression

Supported models:

  • random_forest
  • svm_rbf
  • elastic_net
  • xgboost (optional dependency)

This stage is automatically invoked by run_analysis.sh when ml.enabled is true (unless --skip-ml is passed).

Example scalar-level JSON block:

"ml": {
   "enabled": true,
   "task": "classification",
   "target_column": "group",
   "group_column": "participant",
   "id_columns": ["participant", "session"],
   "n_splits": 5,
   "random_seed": 42,
   "models": ["random_forest", "svm_rbf", "elastic_net"],
   "output_dir": "results/icvf_gam_2mm/ml"
}

For modality configs, place the same block under statistics.ml.
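The group_column setting keeps all rows sharing a group label (e.g. repeated sessions of one participant) inside a single cross-validation fold, so no participant leaks between train and test sets. A minimal stdlib sketch of that grouping idea (illustrative only; 4_run_ml.py's actual implementation, e.g. via scikit-learn's GroupKFold, may differ):

```python
from collections import defaultdict

def group_folds(groups, n_splits=5):
    """Assign each sample to a fold so that samples sharing a group
    label (e.g. the same participant across sessions) stay together.

    Greedy balancing: largest groups first, always into the currently
    smallest fold.
    """
    by_group = defaultdict(list)
    for idx, label in enumerate(groups):
        by_group[label].append(idx)

    fold_sizes = [0] * n_splits
    fold_of = {}  # sample index -> fold id
    for label, members in sorted(by_group.items(), key=lambda kv: -len(kv[1])):
        target = fold_sizes.index(min(fold_sizes))
        for idx in members:
            fold_of[idx] = target
        fold_sizes[target] += len(members)
    return [fold_of[i] for i in range(len(groups))]
```

For example, group_folds(["sub-01", "sub-01", "sub-02", "sub-02"], n_splits=2) places both sub-01 sessions in one fold and both sub-02 sessions in the other.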


βš™οΈ Requirements

  • Bash
  • jq (for JSON parsing)
  • mrinfo (from MRtrix3)
  • singularity
  • A valid modelarray_confixel_0.1.5.sif container file

πŸ“ Repo Notes

Execution Strategy (Current)

  • Keep core ModelArray steps containerized for reproducibility:
    • 2_run_convoxel.sh
    • 3_run_model.sh
    • volumestats_write export
  • Keep orchestration and lightweight prep on host:
    • JSON orchestration wrappers
    • cohort table generation
    • registration/linking helpers

Pattern-Recognition / ML Plan

  • Use the existing ModelArray HDF5 (.h5) outputs as ML input features.
  • First prototype ML stage locally on host for rapid iteration.
  • If local prototype is stable and useful, build a project-specific container for the ML stage.
  • Goal state: default ML execution in container, with optional local mode for debugging.

Why This Policy

  • Fast iteration during method development.
  • Reproducible production runs once methods are finalized.
  • Clear separation between experimental and production workflows.

📂 Folder Structure

Expected organization of data:

project/
├── participants.tsv
├── group_mask.nii.gz
├── cohort_FA.csv
├── FA.h5
├── voxelwise_FA_stats_summary.csv
├── subject1/
│   ├── subject1_FA.nii.gz
│   └── subject1_mask.nii.gz
└── ...
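The cohort CSV pairs each subject's scalar image with its mask, following the layout above. A hypothetical Python sketch of that pairing step (the real 1_generate_cohort*.sh is a Bash script and its column names may differ):

```python
import csv
from pathlib import Path

def build_cohort_rows(project_dir, scalar="FA"):
    """Collect (subject, image, mask) triples from per-subject folders.

    Assumes the naming convention shown above:
    <subject>/<subject>_<scalar>.nii.gz and <subject>/<subject>_mask.nii.gz.
    """
    rows = []
    for subj_dir in sorted(Path(project_dir).iterdir()):
        if not subj_dir.is_dir():
            continue
        image = subj_dir / f"{subj_dir.name}_{scalar}.nii.gz"
        mask = subj_dir / f"{subj_dir.name}_mask.nii.gz"
        if image.exists() and mask.exists():
            rows.append({"subject": subj_dir.name,
                         "image": str(image),
                         "mask": str(mask)})
    return rows

def write_cohort_csv(rows, out_path):
    """Write the cohort table with hypothetical column names."""
    with open(out_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["subject", "image", "mask"])
        writer.writeheader()
        writer.writerows(rows)
```

Subjects missing either the scalar image or the mask are silently skipped here; the real script may instead warn or abort.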

📄 License

MIT License


👥 Authors

  • Karl Koschutnig, MRI-Lab Graz
  • Contributions welcome!

About

Using R for voxel (and fixel) data
