Hi,
I am trying to replicate the results from the Proseg paper, specifically for the Xenium lung cancer dataset. I have not been able to reproduce the plots in Figure 5, and would like to ask which parameters you used to run Proseg, as well as for the preprocessing and downstream analysis.
I used Proseg v3.1.0 with the following command:
proseg --output-spatialdata output_replication.zarr --output-expected-counts replication_expected_counts.mtx.gz --xenium ../transcripts.parquet
For the analysis, I used the following code to filter out cells with fewer than 10 transcripts and compute the censored log proportions, as described in the paper.
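import numpy as np
import scanpy as sc

# Drop cells with fewer than 10 total transcripts
sc.pp.filter_cells(adata, min_counts=10)
# Per-cell totals as an (n_cells, 1) ndarray; a sparse .sum() returns np.matrix
cell_sums = np.asarray(adata.X.sum(axis=1)).reshape(-1, 1)
# Densify (adata.X is sparse here) and convert counts to per-cell proportions
temp_X = adata.X.toarray().astype(np.float64) / cell_sums
# Censor proportions below 1e-4, then take the natural log
temp_X = np.log(np.clip(temp_X, a_min=1e-4, a_max=None))
adata.X = temp_X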
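To make it easier to check whether my transform matches yours, here is a minimal NumPy-only sketch of what I understand "censored log proportions" to mean. The function name `censored_log_proportions` and the `floor` parameter are my own (hypothetical) names, and the 1e-4 floor is simply the value I used above; I am not certain it is the one used in the paper. It assumes every cell has a non-zero total, which holds after the filtering step.

```python
import numpy as np

def censored_log_proportions(counts, floor=1e-4):
    """Per-cell proportions, clipped from below at `floor`, then natural log.

    `floor` is an assumed value, not one confirmed from the paper.
    Rows are cells, columns are genes; every row must have a non-zero sum.
    """
    counts = np.asarray(counts, dtype=np.float64)
    # keepdims=True gives shape (n_cells, 1) so the division broadcasts per row
    cell_sums = counts.sum(axis=1, keepdims=True)
    proportions = counts / cell_sums
    # Zero (or very small) proportions are censored at `floor` before the log
    return np.log(np.clip(proportions, floor, None))

# Toy matrix: 2 cells x 3 genes
toy = np.array([[8, 2, 0],
                [1, 1, 18]])
result = censored_log_proportions(toy)
```

With this definition, a gene with zero counts in a cell ends up at log(1e-4) rather than minus infinity. If you used a different floor or a pseudocount instead, that alone could change the neighbor graph and the UMAP.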
Then I used the following code to perform dimensionality reduction and clustering.
sc.pp.neighbors(adata, n_neighbors=15, use_rep='X')
sc.tl.umap(adata)
sc.tl.leiden(adata, resolution=1)
sc.pl.umap(adata, color='leiden')
The UMAPs I got did not resemble those in Figure 5A; I have attached the plots here. If you could let me know how you analysed the Xenium lung cancer sample, I would greatly appreciate it. Thank you very much!
