ImmunoProfileSpatial

This repository accompanies the paper "Graph neural network modeling of spatial tumor-immune interactions identifies prognostic cellular niches in non-small cell lung cancer" (npj Precision Oncology, 2026, https://doi.org/10.1038/s41698-026-01314-3). It provides code for constructing spatial graphs from multiplex immunofluorescence (mIF) data acquired as part of the ImmunoProfile project at DFCI, and for training a graph neural network (GNN) that predicts patient survival from localized tumor–immune interactions. The approach builds on the SPACE-GM framework by Wu et al.

Data link: https://doi.org/10.7303/syn52596661

Setup

The space-gm directory is a submodule that contains the GNN implementation used in our experiments. Clone or install it separately if it is not present.

Create the conda environment listed in immunoprofilespatialenv.yml:

conda env create -f immunoprofilespatialenv.yml

Workflow

Preprocess ROIs: Use scripts/preprocessing.py and scripts/roi_qc.py to convert raw single-cell CSV files into graphs (Voronoi polygons, Delaunay triangulation) and to perform ROI quality control.
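To illustrate the graph construction step, the sketch below derives cell-to-cell edges from a Delaunay triangulation of cell centroids. It is not the code in scripts/preprocessing.py: the function name delaunay_edges is hypothetical, and it assumes NumPy and SciPy are available in the environment.

```python
import numpy as np
from scipy.spatial import Delaunay


def delaunay_edges(coords):
    """Return sorted, unique undirected edges (i, j) with i < j.

    coords: sequence of (x, y) cell centroids for one ROI.
    Hypothetical helper for illustration, not part of this repository.
    """
    tri = Delaunay(np.asarray(coords, dtype=float))
    edges = set()
    # Each simplex is a triangle given as three point indices;
    # its three sides become graph edges between neighboring cells.
    for a, b, c in tri.simplices:
        for i, j in ((a, b), (b, c), (a, c)):
            i, j = int(i), int(j)
            edges.add((min(i, j), max(i, j)))
    return sorted(edges)
```

Each returned pair indexes into the input coordinate list, so node features such as marker positivity can be attached by the same index.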

Create dataset splits: scripts/create_dataset_splits.py generates training/validation/test splits at the case or ROI level. Example splits are provided in data/experiment_split.json.
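A case-level split keeps all ROIs from one patient in the same partition, which avoids leakage between training and evaluation. The sketch below shows one way to build such a split, stratified by a binary survival status. The function names, the 60/15 train/validation fractions, and the output layout are assumptions for illustration and do not reflect scripts/create_dataset_splits.py or the format of data/experiment_split.json.

```python
import random
from collections import defaultdict


def split_cases(case_status, train_frac=0.6, val_frac=0.15, seed=0):
    """Split case IDs into train/val/test, stratified by status.

    case_status: dict mapping case_id -> status label (e.g. 0 or 1).
    The remaining cases after train and val go to test.
    """
    rng = random.Random(seed)
    by_status = defaultdict(list)
    for case_id, status in case_status.items():
        by_status[status].append(case_id)

    splits = {"train": [], "val": [], "test": []}
    for status in sorted(by_status):
        cases = sorted(by_status[status])  # fixed order before shuffling
        rng.shuffle(cases)
        n_train = round(train_frac * len(cases))
        n_val = round(val_frac * len(cases))
        splits["train"] += cases[:n_train]
        splits["val"] += cases[n_train:n_train + n_val]
        splits["test"] += cases[n_train + n_val:]
    return splits


def assign_rois(roi_to_case, splits):
    """Give every ROI the split of the case it belongs to."""
    case_to_split = {c: name for name, cases in splits.items() for c in cases}
    return {roi: case_to_split[case] for roi, case in roi_to_case.items()}
```

Because the shuffle uses a seeded generator and sorted inputs, repeated calls with the same arguments return the same split.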

Generate graph/subgraph datasets: Run scripts/generate_subgraph_dataset.py to build full graphs and spatial neighborhood subgraphs for each split. Paths and dataset parameters are controlled via JSON files in configs/.

Train the model: scripts/train_model.py trains a SPACE‑GM model on the generated subgraphs. Training parameters (batch size, learning rate, etc.) are defined in configs/train_params.json.

Evaluation: Model outputs include per-neighborhood survival predictions that can be aggregated at the patient level. Example evaluation utilities are provided in scripts/utils_spatial.py.
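As a minimal sketch of patient-level aggregation, the function below averages neighborhood-level scores per case. Using the mean is an assumption made here for illustration; the utilities in scripts/utils_spatial.py may aggregate differently, and aggregate_by_patient is a hypothetical name.

```python
from collections import defaultdict
from statistics import mean


def aggregate_by_patient(predictions):
    """Average per-neighborhood scores into one score per case.

    predictions: iterable of (case_id, score) pairs, one per subgraph.
    Returns a dict mapping case_id -> mean score.
    """
    scores = defaultdict(list)
    for case_id, score in predictions:
        scores[case_id].append(score)
    return {case_id: mean(values) for case_id, values in scores.items()}
```

The resulting per-patient scores can then be compared against survival labels, for example with a concordance index.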

Analysis: The subgraph manipulations described in the paper are implemented as transformations in additional_transforms.py. Visualization utilities are in scripts/utils_spatial.py.

Note: parts of the preprocessing pipeline and the training scripts may still be incomplete. We will update the repository as additional code becomes available.

Data

The dataset used in this paper is derived from the publicly available ImmunoProfile project on Synapse (syn52596661). It is not included in this repository, but can be freely downloaded from Synapse after account registration.

Downloading the data

This paper uses the NSCLC subset of the pan-cancer ImmunoProfile dataset. Two files are needed:

1. Metadata file
Available as a Synapse table.
Filter to NSCLC patients: oncotree_metamain == 'Non-Small Cell Lung Cancer'.

2. Single-cell parquet file
Download single_cells.parquet from Synapse.
This file contains ~39 million spatially-resolved cells across all cancer types. Filter to the case_id values from the NSCLC metadata, keeping only region_label == 'InnerTumor' cells.
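The two filtering steps can be sketched with pandas as follows. This assumes the metadata table has been exported to a DataFrame that contains a case_id column alongside oncotree_metamain; the function name filter_nsclc_inner_tumor and the metadata file name in the comments are hypothetical.

```python
import pandas as pd


def filter_nsclc_inner_tumor(metadata, cells):
    """Keep InnerTumor cells that belong to NSCLC cases.

    metadata: DataFrame with 'case_id' and 'oncotree_metamain' columns.
    cells: DataFrame with 'case_id' and 'region_label' columns.
    """
    is_nsclc = metadata["oncotree_metamain"] == "Non-Small Cell Lung Cancer"
    nsclc_cases = metadata.loc[is_nsclc, "case_id"]
    keep = cells["case_id"].isin(nsclc_cases) & (
        cells["region_label"] == "InnerTumor"
    )
    return cells.loc[keep].reset_index(drop=True)


# Example usage (file names are assumptions; reading parquet requires
# a parquet engine such as pyarrow):
# metadata = pd.read_csv("metadata.csv")
# cells = pd.read_parquet("single_cells.parquet")
# nsclc_cells = filter_nsclc_inner_tumor(metadata, cells)
```

With roughly 39 million rows in the full file, loading only the columns needed by the pipeline (the columns argument of pd.read_parquet) can reduce memory use.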

Data format expected by the pipeline

After filtering, the single-cell data should have the following columns, which map directly to the graph construction inputs:

Column	Type	Description
`case_id`	int64	De-identified patient ID
`roi_id`	object	ROI identifier
`region_label`	object	Segmented region identifier
`cell_x`	int64	Cell centroid x
`cell_y`	int64	Cell centroid y
`cd8`	bool	CD8 marker positivity
`pd1`	bool	PD-1 marker positivity
`pdl1`	bool	PD-L1 marker positivity
`foxp3`	bool	FOXP3 marker positivity
`tumor`	bool	Cytokeratin (tumor) marker positivity
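Before running the pipeline, it can help to confirm that a filtered table matches the columns listed above. The check below is a hedged sketch: check_schema and EXPECTED_KINDS are hypothetical names, and the "text" kind accepts both object and string dtypes because newer pandas versions may load text columns as strings rather than object.

```python
import pandas as pd
from pandas.api import types as ptypes

# Column -> broad dtype kind, following the table above.
EXPECTED_KINDS = {
    "case_id": "int", "roi_id": "text", "region_label": "text",
    "cell_x": "int", "cell_y": "int",
    "cd8": "bool", "pd1": "bool", "pdl1": "bool",
    "foxp3": "bool", "tumor": "bool",
}

KIND_CHECKS = {
    "int": ptypes.is_integer_dtype,
    "bool": ptypes.is_bool_dtype,
    "text": lambda s: ptypes.is_object_dtype(s) or ptypes.is_string_dtype(s),
}


def check_schema(cells):
    """Return a list of human-readable problems; empty if the table matches."""
    problems = []
    for column, kind in EXPECTED_KINDS.items():
        if column not in cells.columns:
            problems.append(f"missing column: {column}")
        elif not KIND_CHECKS[kind](cells[column]):
            problems.append(
                f"{column}: expected {kind}, got {cells[column].dtype}"
            )
    return problems
```

An empty list means every expected column is present with a compatible dtype; extra columns are ignored.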

The data/ directory in this repository contains minimal placeholder files illustrating the expected formats:

full_graph_labels.csv — survival labels per ROI
markers.json — cell type and feature definitions
experiment_split.json — the exact train/validation/test split used in the paper (305/72/129 patients, stratified by survival status)
