
Kassandra does not find all of its feature genes among the 12-13k genes in my expression matrix. #12

Description

@adesalegn

Hi,
I used the Kassandra model to predict cell types from normalized TPM data, but it is not working as expected. The model cannot find some of the genes it uses as features among the 12-13K genes in my matrix. How many genes must overlap for the Kassandra model to work?

exprDis = pd.read_csv('~/Users/us/data/transcriptomics.csv', index_col=0)
exprDis.index.name = 'Gene'
exprDis
exprDis = renorm_expressions(exprDis, '/Users/us/data/data/genes_in_expression.txt')
exprDis

# The code above did not raise any error; it just displayed the data frame. The call below fails:
preds = model.predict(exprDis) * 100

ValueError                                Traceback (most recent call last)
Cell In[23], line 1
----> 1 preds = model.predict(expr) * 100

File ~/Users/us/Kassandra/core/model.py:154, in DeconvolutionModel.predict(self, expr, use_l2, add_other, other_coeff)
    147 def predict(self, expr, use_l2=False, add_other=True, other_coeff=0.073468):
    148     """
    149     Prediction pipeline for the model.
    150     :param expr: pd df with samples in columns and genes in rows
    151     :param predict_cells: If RNA fractions to be recalculated to cells fractions.
    152     :return: pd df with predictions for cell types in rows and samples in columns.
    153     """
--> 154     self.check_expressions(expr)
    155     expr = self.renormalize_expr(expr)
    156     preds = self.predict_l2(expr)

File ~/Users/us/Kassandra/core/model.py:171, in DeconvolutionModel.check_expressions(self, expr)
    169 diff = set(self.cell_types.genes).difference(set(expr.index))
    170 if diff:
--> 171     raise ValueError("EXPRESSION MATRIX HAS TO CONTAIN AT LEAST ALL THE GENES THAT ARE USED AS A FEATURES")
    172 diff = set(self.cell_types.genes).symmetric_difference(set(expr.index))
    173 if not diff:

ValueError: EXPRESSION MATRIX HAS TO CONTAIN AT LEAST ALL THE GENES THAT ARE USED AS A FEATURES
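
Judging from the traceback, `check_expressions` raises as soon as any gene in `self.cell_types.genes` is absent from `expr.index`, so every feature gene appears to be required. A minimal sketch for listing which ones are absent is below. The helper name `report_missing_genes` and the toy gene lists are hypothetical and only for illustration; they are not part of Kassandra.

```python
import pandas as pd


def report_missing_genes(required_genes, expr):
    """Return the sorted list of required genes that are absent from expr.index.

    required_genes: iterable of gene names the model uses as features.
    expr: DataFrame with genes in rows (index) and samples in columns.
    """
    required = set(required_genes)
    present = set(expr.index)
    missing = sorted(required - present)
    overlap = len(required & present)
    print(f"{overlap} of {len(required)} feature genes found; {len(missing)} missing")
    return missing


# Toy data standing in for the real matrix (hypothetical values).
toy_expr = pd.DataFrame(
    {"sample_1": [1.0, 2.0, 3.0], "sample_2": [4.0, 5.0, 6.0]},
    index=pd.Index(["CD3E", "CD19", "MS4A1"], name="Gene"),
)
toy_required = ["CD3E", "CD19", "CD8A"]  # hypothetical feature list

missing = report_missing_genes(toy_required, toy_expr)
```

With the real objects, the equivalent call would presumably be `report_missing_genes(model.cell_types.genes, exprDis)`, using the attribute that appears in the traceback. If most feature genes are reported missing, it may be worth checking whether the matrix index uses the same gene identifier convention as the model.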
