FUSION 2.0 — Cell-Level Gene Expression Overlay

This repository contains the script that prepares per-cell gene expression data from Xenium Ranger output for visualization as an overlay on co-registered H&E whole slide images (WSI) in the FUSION 2.0 pipeline.

This codebase is unlikely to be used directly in a future plug-in. It is maintained to record which files are required, where they come from, and how they are processed into data structured for visualization.

Overview

The goal is to visualize per-cell gene expression overlaid on the co-registered H&E WSI, providing a tissue-level view of spatial expression patterns for user-selected genes — analogous to the cell-level aggregated expression view in Xenium Explorer.

Input Files

Both files come directly from Xenium Ranger output:

| File | Description |
|---|---|
| `cell_feature_matrix.h5` | Cell × gene expression matrix (raw transcript counts) |
| `cells.parquet` | Cell centroid coordinates in Xenium space (x, y) |

Expression Processing

1. CPK normalization: each cell's raw counts are divided by that cell's total count and scaled to counts per 10,000.
2. log1p transformation: log(1 + x) is applied to the normalized counts.
3. Percentile clipping: each gene is clipped to its own 1st–99th percentile range so that outlier cells do not collapse the color range.
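The three steps above can be sketched as follows. This is an illustration in plain Python on a tiny dense matrix, not the repository's R script. The function names, the linear-interpolation percentile rule, and the example counts are all assumptions.

```python
import math

def cpk_log1p(counts, scale=10_000):
    """Normalize each cell (row) to `scale` total counts, then apply log1p."""
    out = []
    for row in counts:
        total = sum(row)
        # Cells with zero total counts stay all-zero to avoid dividing by zero.
        out.append([math.log1p(c / total * scale) if total else 0.0 for c in row])
    return out

def percentile(values, q):
    """Linear-interpolation percentile for q in 0..100 (an assumed convention)."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100
    lo, hi = math.floor(pos), math.ceil(pos)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

def clip_per_gene(matrix, low=1, high=99):
    """Clip each gene (column) to its own [low, high] percentile range."""
    columns = []
    for col in zip(*matrix):
        lo_v, hi_v = percentile(col, low), percentile(col, high)
        columns.append([min(max(v, lo_v), hi_v) for v in col])
    # Transpose back to cells x genes.
    return [list(row) for row in zip(*columns)]

# Hypothetical raw counts: 3 cells (rows) x 2 genes (columns).
raw = [[1, 3], [0, 10], [5, 5]]
processed = clip_per_gene(cpk_log1p(raw))
```

With only three cells the clipping bounds sit very close to the minimum and maximum. On a real dataset with many thousands of cells, the 1st and 99th percentiles trim the extreme tails of each gene.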

Color Mapping

Normalized, clipped expression values are mapped to the Inferno color scale (black = lowest expression, yellow/white = highest), consistent with the Xenium Explorer aesthetic. The colored cells are rendered as an overlay on the H&E WSI with adjustable opacity.
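As a rough sketch of this mapping, the Python snippet below rescales one gene's values to [0, 1] and interpolates between a few Inferno anchor colors. The anchor RGB values are approximate, the function names are hypothetical, and the default opacity of 0.7 is arbitrary. In practice a plotting library's built-in Inferno colormap would normally be used instead of hand-written anchors.

```python
# Approximate Inferno anchor colors (RGB, 0-255) at positions 0, 0.25, 0.5, 0.75, 1.
INFERNO_ANCHORS = [
    (0, 0, 4),
    (86, 16, 110),
    (187, 55, 84),
    (249, 140, 10),
    (252, 255, 164),
]

def to_unit_range(values):
    """Min-max rescale one gene's clipped values to [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant gene carries no contrast; map every cell to the low end.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def inferno_rgba(t, opacity=0.7):
    """Interpolate linearly between anchors and return (r, g, b, alpha)."""
    t = min(max(t, 0.0), 1.0)
    pos = t * (len(INFERNO_ANCHORS) - 1)
    i = min(int(pos), len(INFERNO_ANCHORS) - 2)
    frac = pos - i
    c0, c1 = INFERNO_ANCHORS[i], INFERNO_ANCHORS[i + 1]
    rgb = tuple(round(a + (b - a) * frac) for a, b in zip(c0, c1))
    return rgb + (opacity,)

# Hypothetical processed values for one gene in three cells.
values = [0.0, 2.1, 4.2]
colors = [inferno_rgba(t) for t in to_unit_range(values)]
```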

Gene Selection

For the initial implementation, users select a single gene at a time for visualization. A planned future feature will support module scores across up to 20 genes, where per-cell expression values are aggregated into a single composite score before color mapping.
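The aggregation method for module scores is not specified here. One simple possibility, shown purely as an assumption, is the per-cell mean of the processed values across the selected genes. The function name and the example values in this Python sketch are hypothetical.

```python
from statistics import fmean

MAX_MODULE_GENES = 20  # upper limit mentioned for the planned feature

def module_scores(rows, genes):
    """Return one composite score per cell: the mean over the chosen genes.

    The mean is an assumed aggregation; the planned feature may use another.
    """
    if not 1 <= len(genes) <= MAX_MODULE_GENES:
        raise ValueError(f"expected 1 to {MAX_MODULE_GENES} genes, got {len(genes)}")
    return [fmean(row[g] for g in genes) for row in rows]

# Hypothetical processed expression values for two cells.
cells = [
    {"UMOD": 2.0, "SLC12A1": 4.0, "AQP1": 0.0},
    {"UMOD": 0.0, "SLC12A1": 1.0, "AQP1": 5.0},
]
scores = module_scores(cells, ["UMOD", "SLC12A1"])
```

The resulting scores could then go through the same color mapping as a single gene.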

Scalability

The test dataset used the 5K Prime Gene Panel + 100. At that panel size a full cell × gene CSV is impractical, because the dense matrix exceeds memory limits even on high-memory compute nodes. Two approaches for limiting the exported genes are supported:

- Curated gene list: a hardcoded set of biologically relevant genes (see Test File).
- Top N genes by coefficient of variation (CV): recommended for automated selection. Genes with a high CV vary the most from cell to cell and so tend to show the most spatially variable expression patterns across the tissue. A reasonable default is the top 100–500 genes by CV, computed after CPK normalization, with a configurable maximum.
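The CV-based selection can be sketched in Python as follows. This is not the repository's R implementation: the function name, the use of the population standard deviation, the rule that skips undetected genes, and the placeholder gene names are all assumptions.

```python
from statistics import fmean, pstdev

def top_genes_by_cv(expr, n):
    """Return the n genes with the highest coefficient of variation.

    expr maps gene name -> list of per-cell normalized values.
    """
    cv = {}
    for gene, values in expr.items():
        mean = fmean(values)
        if mean > 0:  # CV is undefined for genes that are never detected
            cv[gene] = pstdev(values) / mean
    return sorted(cv, key=cv.get, reverse=True)[:n]

# Hypothetical normalized values for four genes across three cells.
expr = {
    "GENE_A": [0.0, 0.0, 9.0],  # concentrated in one cell: high CV
    "GENE_B": [3.0, 3.0, 3.0],  # uniform: CV of 0
    "GENE_C": [1.0, 2.0, 3.0],
    "GENE_D": [0.0, 0.0, 0.0],  # never detected: skipped
}
selected = top_genes_by_cv(expr, n=2)
```

Only the selected gene columns would then need to be densified and written out, which keeps the export small even for a large panel.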

Coordinate Transformation

Cell centroids from `cells.parquet` are in native Xenium space. The existing `tf_mat` transformation matrix from the FUSION 2.0 co-registration pipeline is applied at render time to map cell positions into WSI space. This transformation is handled by the FUSION 2.0 pipeline, not by the scripts in this repository.
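For orientation only, the Python sketch below shows how a centroid could be mapped through an affine matrix. The actual layout and units of `tf_mat` are not described here, so the 3×3 homogeneous form, the function name, and the example matrix are assumptions.

```python
def apply_affine(tf_mat, x, y):
    """Map (x, y) through a 3x3 homogeneous affine matrix given as nested lists."""
    (a, b, tx), (c, d, ty), _ = tf_mat
    return (a * x + b * y + tx, c * x + d * y + ty)

# Hypothetical matrix: scale by 2, then shift by (100, 50).
example_tf = [
    [2.0, 0.0, 100.0],
    [0.0, 2.0, 50.0],
    [0.0, 0.0, 1.0],
]
wsi_xy = apply_affine(example_tf, 10.0, 20.0)
```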

Test File

A test CSV was generated from Xenium Ranger output for sample D450 (pediatric kidney) using a curated set of 13 genes:

Kidney cell type markers: SLC12A1, AQP1, AQP2, PECAM1, UMOD, LRP2, CUBN, VCAM1

Housekeeping genes: HPRT1, SDHA, TBP, YWHAZ, PGK1

The output CSV has one row per cell. Its columns are `cell_id`, `x_centroid`, and `y_centroid`, followed by one column per gene containing normalized, clipped expression values.
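A downstream consumer could read one gene from a file in this layout roughly as follows. The Python snippet parses an inline two-row excerpt whose cell IDs and values are invented for illustration, and `load_gene` is a hypothetical helper.

```python
import csv
import io

# Invented two-cell excerpt in the column layout described above.
sample_csv = """cell_id,x_centroid,y_centroid,UMOD,AQP1
cell_1,101.5,220.0,3.2,0.0
cell_2,150.0,310.25,0.0,4.1
"""

def load_gene(handle, gene):
    """Return (x, y, value) tuples for one gene column of the CSV."""
    reader = csv.DictReader(handle)
    if gene not in (reader.fieldnames or []):
        raise KeyError(f"gene column not found: {gene}")
    return [
        (float(r["x_centroid"]), float(r["y_centroid"]), float(r[gene]))
        for r in reader
    ]

points = load_gene(io.StringIO(sample_csv), "UMOD")
```

For a real file, the `io.StringIO` wrapper would be replaced with an opened file handle.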

Repository Structure

.
├── data/               # Input and output data files (not tracked by git)
└── scripts/
    └── xenium-output-to-gene-expr-csv.R   # Preprocessing script

Usage

Rscript scripts/xenium-output-to-gene-expr-csv.R

Requirements

R 4.x
R packages: hdf5r, arrow, Matrix, dplyr
