
Is it possible that slingshot sets different starting clusters for different conditions? #46

@LilianGomes

Hi all,
I’m running Slingshot on two experimental conditions and encountering an interesting issue:
depending on the condition, my lineage curves appear to start from different clusters.

Specifically:

Control (ctrl) cells start at cluster “vRG”

Infected cells start at cluster “oRG”

However, when I check the computed lineages, they all seem to start at vRG, regardless of condition.
This raises the question: is this a biological effect, or is the Slingshot algorithm being confused by differences in cell composition between conditions?


$Lineage1
[1] "vRG"           "tRG"           "oRG"           "RG_astrocytic"

$Lineage2
[1] "vRG"          "oRG_dividing" "IPC_IAO"      "vRG_dividing"

$Lineage3
[1] "vRG"              "oRG_dividing"     "IPC_IAO"          "IPC_IAO_dividing"

$Lineage4
[1] "vRG"             "oRG_dividing"    "IPC_IAO"         "IPC_EN_dividing"

$Lineage5
[1] "vRG"          "oRG_dividing" "Astrocytes"  

$Lineage6
[1] "vRG"    "IPC_EN"

$Lineage7
[1] "IPC_EN_late"
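A quick way to double-check the roots is to take the first cluster of each lineage. The sketch below hard-codes the lineages printed above so that it runs on its own; on the real object the list would presumably come from `slingLineages(All_integrated_slingshot)`. The variable names `lineages`, `roots`, `root_counts`, and `oRG_pos` are just for illustration.

```r
# Lineages copied from the output above
lineages <- list(
  Lineage1 = c("vRG", "tRG", "oRG", "RG_astrocytic"),
  Lineage2 = c("vRG", "oRG_dividing", "IPC_IAO", "vRG_dividing"),
  Lineage3 = c("vRG", "oRG_dividing", "IPC_IAO", "IPC_IAO_dividing"),
  Lineage4 = c("vRG", "oRG_dividing", "IPC_IAO", "IPC_EN_dividing"),
  Lineage5 = c("vRG", "oRG_dividing", "Astrocytes"),
  Lineage6 = c("vRG", "IPC_EN"),
  Lineage7 = c("IPC_EN_late")
)

# The first cluster of each lineage is its root
roots <- vapply(lineages, function(x) x[1], character(1))
root_counts <- table(roots)
print(root_counts)

# Position of "oRG" within each lineage (NA if the lineage does not contain it)
oRG_pos <- vapply(lineages, function(x) match("oRG", x), integer(1))
print(oRG_pos)
```

On this output, six lineages are rooted at vRG and Lineage7 is a single-cluster lineage (IPC_EN_late). No lineage is rooted at oRG, which only appears in third position in Lineage1.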

Here is my code:


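# Packages used below (slingshot_conditions() comes from condiments)
library(Seurat)
library(SingleCellExperiment)
library(slingshot)
library(condiments)
library(dplyr)

# Convert Seurat object to SingleCellExperiment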
All_integrated_prog_sce <- as.SingleCellExperiment(All_integrated_prog, assay = "SCT")

# Transfer reductions to SCE
reducedDims(All_integrated_prog_sce) <- list(
  PCA = Embeddings(All_integrated_prog, "pca"),
  UMAP = Embeddings(All_integrated_prog, "umap.rpca")
)

# Run Slingshot with both possible starting clusters
All_integrated_slingshot <- slingshot(
  All_integrated_prog_sce,
  reducedDim = "UMAP",
  clusterLabels = All_integrated_prog_sce$cell_type,
  stretch = 0,
  extend = "n",
  omega = TRUE,
  start.clus = c("vRG", "oRG")
)

# Combine reduced dimensions and metadata
df <- bind_cols(
  reducedDim(All_integrated_slingshot, "UMAP") %>% as.data.frame(),
  colData(All_integrated_slingshot)[, 1:14] %>% as.data.frame()
)

# Run Slingshot by condition
sdss <- slingshot_conditions(
  SlingshotDataSet(All_integrated_slingshot),
  All_integrated_slingshot$exp_group
)

# Extract curve data
curves <- dplyr::bind_rows(
  lapply(sdss, slingCurves, as.df = TRUE),
  .id = "exp_group"
)

# Compute pseudotime and weights
pt <- slingPseudotime(All_integrated_slingshot, na = TRUE) %>%
  as.data.frame() %>%
  rename_with(~ paste0("Lineage", seq_along(.), "_pst"))

w <- slingCurveWeights(All_integrated_slingshot) %>%
  as.data.frame() %>%
  rename_with(~ paste0("Lineage", seq_along(.)))

# Combine everything
df1 <- bind_cols(
  reducedDim(All_integrated_slingshot, "UMAP") %>%
    as.data.frame() %>%
    setNames(c("UMAP_1","UMAP_2")),
  pt, w,
  tibble(
    exp_group = All_integrated_slingshot$exp_group,
    cell_type = All_integrated_slingshot$cell_type
  )
) %>%
  rowwise() %>%
  mutate(
    # Select lineage based on weights only
    max_weight_index = which.max(c_across(matches("^Lineage\\d+$"))),
    # Assign pseudotime from the matching lineage
    pst = c_across(matches("^Lineage\\d+_pst$"))[max_weight_index]
  ) %>%
  ungroup()
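Since part of the question is whether cell composition differs between conditions, a cross-tabulation of cell type by condition is a cheap first check. This is a minimal sketch on a made-up toy table (the values are invented, not my data); on the real data the same two calls could be applied to the `cell_type` and `exp_group` columns of `df1`.

```r
# Toy stand-in for the exp_group / cell_type columns of df1 (invented values)
toy_comp <- data.frame(
  exp_group = c("ctrl", "ctrl", "ctrl", "ctrl",
                "infected", "infected", "infected", "infected"),
  cell_type = c("vRG", "vRG", "vRG", "oRG",
                "vRG", "oRG", "oRG", "oRG")
)

# Cell counts per cluster (rows) and condition (columns)
counts <- table(toy_comp$cell_type, toy_comp$exp_group)

# Column-wise proportions: the share of each cluster within a condition
props <- prop.table(counts, margin = 2)
print(round(props, 2))
```

If one condition has very few cells in a cluster, the curve fitted for that condition has little data to anchor it there, which may be worth ruling out before reading the difference as biology.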

Is the observation that control lineages start at “vRG” while infected ones start at “oRG” biologically plausible? Or could it be a consequence of Slingshot’s behavior when the starting cluster varies across conditions (e.g., due to differences in cluster composition or ordering in exp_group)?

In the plot below, the dashed line is the infected condition.

[Image: lineage curves by condition]
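One way to put a number on "which cluster does each condition start from" is to compare the median pseudotime of each cluster within each condition and pick the earliest one. The sketch below uses an invented toy table with the same column names as `df1` (`exp_group`, `cell_type`, `pst`). Note that `pst` in my code comes from the joint fit, so to compare the per-condition curves the pseudotime would have to be taken from each element of `sdss` instead (I assume `slingPseudotime()` works on those, but have not verified it here).

```r
# Toy table with the same columns as df1 (pseudotime values are invented)
toy_pst <- data.frame(
  exp_group = rep(c("ctrl", "infected"), each = 4),
  cell_type = rep(c("vRG", "vRG", "oRG", "oRG"), times = 2),
  pst = c(0.1, 0.3, 2.0, 2.4,   # ctrl: vRG early, oRG late
          1.5, 1.9, 0.2, 0.4)   # infected: oRG early, vRG late
)

# Median pseudotime per condition and cluster
med <- aggregate(pst ~ exp_group + cell_type, data = toy_pst, FUN = median)

# Within each condition, keep the cluster with the smallest median pseudotime
earliest <- do.call(
  rbind,
  lapply(split(med, med$exp_group), function(d) d[which.min(d$pst), ])
)
print(earliest)
```

In this toy case the earliest cluster is vRG for ctrl and oRG for infected, which is the pattern I think I am seeing in the plot.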
