This repository demonstrates how to add grouped cell type annotations to a Seurat RDS file output by the FUSION label transfer step, as a precursor to spot annotation. This is a demo codebase and not a production plugin.
label_transfer.R --> [THIS STEP] --> spot_annotation
Label transfer predicts fine-grained cell type labels (e.g. subclass.l2: PT-S1, aPT, cycPT) for each spot. This step maps those labels to broader groups that spot annotation will use. main_type is the primary group used for spot annotation, with cell_state, structure, and class available for additional levels of visualization and analysis.
The script reads the integrated RDS produced by label transfer, joins a user-provided mapping CSV to the spot metadata, and saves a new RDS with four additional metadata columns: main_type, cell_state, structure, and class.
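The join itself can be sketched in base R. This is a minimal illustration, not the script's actual code: the `meta` data frame stands in for the spot metadata of the Seurat object (its `@meta.data` slot), and the spot names and toy rows are made up for the example.

```r
# Toy stand-in for the spot metadata (obj@meta.data in a Seurat object).
meta <- data.frame(
  subclass.l2 = c("PT-S1", "aPT", "cycPT", "PT-S1"),
  row.names = c("spot_1", "spot_2", "spot_3", "spot_4"),
  stringsAsFactors = FALSE
)

# Toy mapping table with the same columns as the mapping CSV.
mapping <- data.frame(
  prediction_label = c("PT-S1", "aPT", "cycPT"),
  main_type  = c("PT", "PT", "PT"),
  cell_state = c("reference", "altered", "cycling"),
  structure  = c("tubules", "tubules", "tubules"),
  class      = c("epithelial", "epithelial", "epithelial"),
  stringsAsFactors = FALSE
)

# Look up each spot's label in the mapping. match() returns one index per
# spot, so the spot order and row names of the metadata are preserved.
idx <- match(meta$subclass.l2, mapping$prediction_label)
new_cols <- c("main_type", "cell_state", "structure", "class")
meta[new_cols] <- mapping[idx, new_cols]

print(meta)
```

Using `match()` rather than `merge()` is a deliberate choice in this sketch: `merge()` may reorder rows and drops row names, while spot metadata has to stay aligned with the spots in the object.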
Rscript spot_annotation.R integrated.rds cell_type_mapping.csv
Output is saved as integrated_annotated.rds. Optionally specify a custom output path as a third argument:
Rscript spot_annotation.R integrated.rds cell_type_mapping.csv output.rds
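For reference, handling two required arguments plus an optional third could look like the sketch below. `parse_args` is a hypothetical helper written for this illustration, and the fixed default output name is an assumption taken from the text above; the real script may derive it differently.

```r
# Hypothetical helper: turn the raw argument vector into named paths.
parse_args <- function(args) {
  if (length(args) < 2 || length(args) > 3) {
    stop("usage: Rscript spot_annotation.R <input.rds> <mapping.csv> [output.rds]")
  }
  list(
    input   = args[1],
    mapping = args[2],
    # Assumed default; used only when no third argument is given.
    output  = if (length(args) == 3) args[3] else "integrated_annotated.rds"
  )
}

# In a script, the vector would come from commandArgs(trailingOnly = TRUE).
paths <- parse_args(c("integrated.rds", "cell_type_mapping.csv"))
print(paths$output)
```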
The CSV has a single comment line at the top specifying which metadata column in the Seurat object holds the prediction labels, followed by a standard CSV table:
# metadata_col=subclass.l2
prediction_label,main_type,cell_state,structure,class
PT-S1,PT,reference,tubules,epithelial
PT-S2,PT,reference,tubules,epithelial
aPT,PT,altered,tubules,epithelial
cycPT,PT,cycling,tubules,epithelial
dPT,PT,degenerative,tubules,epithelial
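A file in this format can be read in two passes: one for the comment line and one for the table. The sketch below assumes the comment is always the first line and follows the `# metadata_col=<column>` pattern shown above; it writes a small temporary file so it can be run on its own.

```r
# Write a small mapping file so the sketch is self-contained.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "# metadata_col=subclass.l2",
  "prediction_label,main_type,cell_state,structure,class",
  "PT-S1,PT,reference,tubules,epithelial",
  "aPT,PT,altered,tubules,epithelial"
), csv_path)

# The first line names the metadata column; strip the "# metadata_col=" prefix.
pattern <- "^#[[:space:]]*metadata_col[[:space:]]*="
first_line <- readLines(csv_path, n = 1)
if (!grepl(pattern, first_line)) {
  stop("first line must look like '# metadata_col=<column>'")
}
metadata_col <- trimws(sub(pattern, "", first_line))

# read.csv() does not treat '#' as a comment by default, so skip the
# first line explicitly and read the rest as a plain CSV.
mapping <- read.csv(csv_path, skip = 1, stringsAsFactors = FALSE)

print(metadata_col)
print(mapping)
```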
| Column | Description |
|---|---|
| prediction_label | Must exactly match the unique values of the specified metadata column |
| main_type | Primary cell type group used for spot annotation |
| cell_state | Cell condition: reference, degenerative, cycling, altered, or transitioning |
| structure | Anatomical grouping e.g. tubules, glomeruli, vessels, interstitium |
| class | Broad biological class e.g. epithelial, endothelial, stroma, immune, neural |
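
Because prediction_label must match the metadata values exactly, it can help to compare the two sets before joining. The following is a sketch with toy label vectors; in practice `labels_in_object` would be the chosen metadata column of the Seurat object and `labels_in_csv` the prediction_label column of the mapping.

```r
# Toy values standing in for the object's labels and the CSV's labels.
labels_in_object <- c("PT-S1", "PT-S2", "aPT", "cycPT", "dPT", "PT-S1")
labels_in_csv <- c("PT-S1", "PT-S2", "aPT", "cycPT")

# Labels present in the object but absent from the CSV would end up as NA
# in the new columns after the join.
missing_from_csv <- setdiff(unique(labels_in_object), labels_in_csv)

# Labels in the CSV that never occur in the object often point to a typo
# or a case mismatch.
unused_in_csv <- setdiff(labels_in_csv, unique(labels_in_object))

# A label listed twice in the CSV makes the mapping ambiguous.
duplicated_in_csv <- unique(labels_in_csv[duplicated(labels_in_csv)])

if (length(missing_from_csv) > 0) {
  message("Labels without a mapping: ", paste(missing_from_csv, collapse = ", "))
}
```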
See cell_type_mapping_example.csv for a complete example using the KPMP kidney atlas with subclass.l2 labels.
Change the metadata_col comment in your CSV from subclass.l2 to whichever metadata column your reference uses. To list the metadata columns available in your reference, whose unique values the prediction_label entries must match, use:
library(hdf5r)
f <- H5File$new("your_reference.h5Seurat", mode = "r")
meta <- f[["meta.data"]]
print(names(meta))
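Once the column names are known, the unique values of each column are what the prediction_label entries have to match. The sketch below assumes the reference metadata is already available as an ordinary data frame (for example the `@meta.data` slot of a Seurat object loaded in memory) rather than read directly from the HDF5 file; `ref_meta` and its `sample_id` column are made up for illustration.

```r
# Toy stand-in for the reference metadata data frame.
ref_meta <- data.frame(
  sample_id   = c("s1", "s1", "s2", "s2"),
  subclass.l2 = c("PT-S1", "aPT", "cycPT", "PT-S1"),
  stringsAsFactors = FALSE
)

# Available metadata columns.
print(colnames(ref_meta))

# Unique values per column; as.character() also covers factor columns.
unique_values <- lapply(ref_meta, function(x) unique(as.character(x)))
print(unique_values[["subclass.l2"]])
```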