computeCommunProb: Matrix::crossprod creates N×N dense intermediate matrix causing OOM on large spatial datasets #4

@gettygugu-dot

In computeCommunProb (modeling.R ~L193-276), each LR pair computes:

dataLR <- Matrix::crossprod(x_, y_)  # N×N dense matrix
P1_Pspatial <- HillFunction(dataLR) * P.spatial

For large spatial datasets (e.g., Visium HD with 174K cells), crossprod produces an N×N matrix (~0.9–8 GB per LR pair depending on gene sparsity). With parallel workers, peak memory easily exceeds 64 GB, causing OOM or segfault.

However, P.spatial is extremely sparse (e.g., 15.6M / 30.5B ≈ 0.05% nonzero). Roughly 99.95% of the crossprod result is therefore multiplied by zero and discarded.

Proposed fix: Pre-extract P.spatial as triplets (i, j, v), then for each LR pair compute L[i] * R[j] only at nonzero spatial positions. Apply Hill function and agonist/antagonist in vectorized form on the triplet values. This reduces per-LR-pair memory from ~0.9–8 GB to ~125 MB.
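
A minimal sketch of the triplet approach on toy data, to make the proposal concrete. It is not the package's code: `hill()`, its parameters `Kh` and `n`, the toy size `N`, and the variable names `trip`, `vals`, and `P1_sparse` are assumptions for illustration, and the agonist/antagonist terms are left out. It assumes that `x_` and `y_` are 1×N expression vectors, so that `crossprod(x_, y_)[i, j]` equals `L[i] * R[j]`.

```r
library(Matrix)

# Hypothetical stand-in for HillFunction; Kh and n are assumed parameters.
hill <- function(x, Kh = 0.5, n = 1) x^n / (Kh^n + x^n)

# Toy data: N cells, ligand/receptor expression, sparse spatial weights.
set.seed(1)
N <- 200
L <- runif(N)
R <- runif(N)
P.spatial <- rsparsematrix(N, N, density = 0.01, rand.x = runif)

# Extract the nonzero entries once, outside the LR-pair loop.
# summary() on a sparse Matrix returns columns i, j, x with 1-based indices.
trip <- summary(P.spatial)

# Per LR pair: evaluate L[i] * R[j] only at the nonzero spatial positions.
vals <- hill(L[trip$i] * R[trip$j]) * trip$x
P1_sparse <- sparseMatrix(i = trip$i, j = trip$j, x = vals, dims = c(N, N))

# Compare with the dense formulation, which is only feasible at this toy size.
dataLR <- crossprod(matrix(L, nrow = 1), matrix(R, nrow = 1))
P1_dense <- hill(as.matrix(dataLR)) * as.matrix(P.spatial)
stopifnot(isTRUE(all.equal(as.matrix(P1_sparse), P1_dense)))
```

Memory per LR pair then scales with the number of nonzero entries in P.spatial rather than with N². The triplet indices are shared across all LR pairs, so only the value vector needs to be recomputed each time.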
