Add sparsego #439

Open
tereshchuk1 wants to merge 10 commits into daisybio:development from tereshchuk1:add-sparsego

Conversation

@tereshchuk1

Contributor

PR Checklist for all PRs

  • This comment contains a description of changes (with reason)

New features

Added SparseGO model for drug response prediction based

  • drevalpy/models/SparseGO/ — model implementation (sparse VNN + drug ANN)
  • drevalpy/datasets/featurizer/create_sparsego_features.py — featurizer to generate GO ontology input files
  • registered in MODEL_FACTORY, hyperparameters from the original paper, optional dependencies mygene and obonet added

@codecov-commenter

codecov-commenter commented Jun 20, 2026

Codecov Report

❌ Patch coverage is 28.07692% with 374 lines in your changes missing coverage. Please review.
✅ Project coverage is 77.97%. Comparing base (7d24cd6) to head (f02b019).
⚠️ Report is 20 commits behind head on development.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| drevalpy/models/SparseGO/sparsego.py | 13.26% | 268 Missing ⚠️ |
| drevalpy/models/SparseGO/utils.py | 10.57% | 93 Missing ⚠️ |
| drevalpy/models/baselines/multi_view_lightgbm.py | 87.00% | 13 Missing ⚠️ |
Additional details and impacted files
@@               Coverage Diff               @@
##           development     #439      +/-   ##
===============================================
- Coverage        80.34%   77.97%   -2.37%     
===============================================
  Files              101      106       +5     
  Lines             8171     8860     +689     
===============================================
+ Hits              6565     6909     +344     
- Misses            1606     1951     +345     

Collaborator

This should not be in this PR :)

input_type = self.hyperparameters.get("input_type", "expression")
feature_type = "gene_expression" if input_type == "expression" else "mutations"

cell_line_features = load_and_select_gene_features(

Collaborator

With gene_list=None, load_and_select_gene_features returns all genes from gene_expression.csv in the order of the CSV, rather than only the ontology genes in gene2ind.txt order. I think the model is then trained on the wrong inputs. Could you check this?
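
To make the concern concrete, here is a minimal stdlib-only sketch of the alignment that seems to be missing: keep only the ontology genes and reorder them by their gene2ind.txt index before the features reach the network. It does not use the drevalpy API. The helper names `read_gene2ind` and `align_to_ontology` are hypothetical, the `<index><whitespace><gene>` line format for gene2ind.txt is an assumption, and the toy matrix simply assumes one column per gene.

```python
import io


def read_gene2ind(handle):
    """Parse lines assumed to look like '<index>\\t<gene>' into {gene: index}."""
    gene2ind = {}
    for line in handle:
        line = line.strip()
        if not line:
            continue
        index, gene = line.split()
        gene2ind[gene] = int(index)
    return gene2ind


def align_to_ontology(header, rows, gene2ind):
    """Keep only ontology genes and reorder columns to gene2ind index order."""
    col_of = {gene: i for i, gene in enumerate(header)}
    missing = [g for g in gene2ind if g not in col_of]
    if missing:
        # Fail loudly instead of silently training on misaligned inputs.
        raise KeyError(f"genes missing from feature matrix: {missing}")
    ordered_genes = sorted(gene2ind, key=gene2ind.get)
    picks = [col_of[g] for g in ordered_genes]
    return ordered_genes, [[row[i] for i in picks] for row in rows]


# Toy data: the feature matrix has an extra gene (EGFR) and a different order.
gene2ind = read_gene2ind(io.StringIO("0\tTP53\n1\tBRCA1\n"))
header = ["BRCA1", "EGFR", "TP53"]
rows = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]

genes, aligned = align_to_ontology(header, rows, gene2ind)
print(genes)    # ['TP53', 'BRCA1']
print(aligned)  # [[3.0, 1.0], [6.0, 4.0]]
```

If load_and_select_gene_features already accepts a gene list, passing the ontology genes in gene2ind order may be enough; whether it preserves the order of that list would still need checking.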
