Hi,
I applied cNMF to analyze my scRNA-seq data, and I focused on two output files: gene_spectra_score.k_3.dt_0_1.txt and gene_spectra_tpm.k_3.dt_0_1.txt. According to the manual, these files should represent essentially the same information but on different scales (Z-score vs. TPM). However, I noticed a significant discrepancy between the top 50 genes from each file, with only 15 genes overlapping between them. I used the R code below to extract the top genes for each GEP. Could there be an issue with my approach?
##########################################
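# Load one of the two spectra files (swap in the commented line for the TPM version)
exprSet = read.table(file = "MC.Merge.gene_spectra_score.k_3.dt_0_1.txt", header = TRUE, sep = "\t")
# exprSet = read.table(file = "MC.Merge.gene_spectra_tpm.k_3.dt_0_1.txt", header = TRUE, sep = "\t")
exprSet = exprSet[,-1]  # drop the first column (the row index)
rownames(exprSet) = paste0("C", 1:3)  # label the three GEPs C1..C3
# Top 50 gene names per GEP; the result has one column per GEP
top_genes = apply(exprSet, 1, function(x){ names(sort(x, decreasing = TRUE))[1:50]})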
###########################################
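The overlap between the two top-gene lists can be counted along these lines. This is a minimal sketch on small toy matrices rather than the real cNMF output; the helper names `top_genes_per_gep` and `overlap_per_gep`, the gene labels, and the values are illustrative assumptions and not part of cNMF. It assumes both tables have one GEP per row and the same gene names as column names.

```r
# Hypothetical helper: names of the n highest-valued genes for each GEP (row).
top_genes_per_gep <- function(spectra, n = 50) {
  m <- as.matrix(spectra)
  n <- min(n, ncol(m))  # never ask for more genes than exist
  tops <- lapply(seq_len(nrow(m)), function(i) {
    names(sort(m[i, ], decreasing = TRUE))[seq_len(n)]
  })
  names(tops) <- rownames(m)
  tops
}

# Hypothetical helper: number of shared top-n genes per GEP between two tables.
overlap_per_gep <- function(a, b, n = 50) {
  top_a <- top_genes_per_gep(a, n)
  top_b <- top_genes_per_gep(b, n)
  vapply(seq_along(top_a), function(i) {
    length(intersect(top_a[[i]], top_b[[i]]))
  }, integer(1))
}

# Toy stand-ins for the score and TPM spectra (2 GEPs x 6 genes).
genes <- paste0("G", 1:6)
score <- matrix(c( 3,  2, 1, 0, -1, -2,
                  -2, -1, 0, 1,  2,  3),
                nrow = 2, byrow = TRUE,
                dimnames = list(c("C1", "C2"), genes))
tpm <- matrix(c(10, 1, 20, 5,  0,  0,
                 0, 0,  5, 1, 30, 40),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("C1", "C2"), genes))

# Shared genes among the top 3 of each GEP.
print(overlap_per_gep(score, tpm, n = 3))
```

With the real files, `score` and `tpm` would be the two tables after dropping the index column, and `n = 50` would reproduce the comparison described above.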
Additionally, I exported the counts file with only HVGs from Seurat before applying cNMF, assuming it would save computation time. Is this approach acceptable?