Graph Convolutional Matrix Factorization by PascalIversen · Pull Request #442 · daisybio/drevalpy

PascalIversen · 2026-06-23T14:09:08Z

Graph convolutional matrix factorization for drug-response prediction. Predicts the cell-line x drug response matrix as R = U V^T (as in ordinary matrix factorization), but the latent factors are learned end-to-end by graph convolutions over feature-similarity or prior-knowledge graphs. The relational GCMF version can use multiple graphs for drugs and cell lines.
The P-versions also produce aleatoric uncertainty estimates.
The main idea is to combine learning in primal and dual space: a high-dimensional feature space can be transformed into a similarity graph and exploited without blowing up the model complexity. The graph convolution also smooths each embedding with those of its graph neighbors, which I hope prevents overfitting and helps generalization.
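
To make the R = U V^T idea concrete, here is a minimal NumPy sketch of one graph convolution per side followed by the dot-product readout. It is not the code in this PR: the helper names (`normalize_adjacency`, `graph_conv`, `random_similarity`), the symmetric normalization with self-loops, the tanh nonlinearity, the one-hot node features, and the random similarity matrices are all assumptions made for illustration.

```python
import numpy as np


def normalize_adjacency(sim: np.ndarray) -> np.ndarray:
    """Return D^-1/2 (A + I) D^-1/2 for a non-negative similarity matrix A (assumed normalization)."""
    a = sim + np.eye(sim.shape[0])             # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))  # degrees are > 0 thanks to the self-loops
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def graph_conv(a_hat: np.ndarray, x: np.ndarray, w: np.ndarray) -> np.ndarray:
    """One graph-convolution layer: mix node features over neighbors, then project."""
    return np.tanh(a_hat @ x @ w)


def random_similarity(n: int, rng: np.random.Generator) -> np.ndarray:
    """Hypothetical stand-in for a real feature-similarity graph."""
    s = rng.random((n, n))
    s = (s + s.T) / 2.0       # make it symmetric
    np.fill_diagonal(s, 0.0)  # self-loops are added during normalization
    return s


rng = np.random.default_rng(0)
n_cells, n_drugs, latent_dim = 5, 4, 3

cell_a = normalize_adjacency(random_similarity(n_cells, rng))
drug_a = normalize_adjacency(random_similarity(n_drugs, rng))

# Assumption: one-hot node features, so all feature information enters through the graph.
u = graph_conv(cell_a, np.eye(n_cells), rng.normal(size=(n_cells, latent_dim)))
v = graph_conv(drug_a, np.eye(n_drugs), rng.normal(size=(n_drugs, latent_dim)))

r_hat = u @ v.T  # predicted cell-line x drug response matrix
print(r_hat.shape)
```

In this sketch the parameter count depends only on the number of nodes and the latent dimension, not on the dimensionality of the features that produced the similarity graph, which is one way to read the "without blowing up the model complexity" point above.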

The variants are:

GCMF: base single-graph model.
RGCMF: relational/multi-graph variant (currently with multi-omics cell graphs; pathway and bioassay drug relations) fused by a relational graph convolution. Works best on leave-cell-out (LCO) per my experiments.
PGCMF / PRGCMF: probabilistic variants with a heteroscedastic Gaussian-NLL head emitting calibrated per-prediction uncertainty.
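
For readers unfamiliar with the heteroscedastic Gaussian-NLL objective mentioned for the P-variants, the per-sample loss can be sketched as below. This is the textbook form, not necessarily the exact head in this PR; the function name `gaussian_nll` and the choice to predict the log-variance (so the variance stays positive) are assumptions.

```python
import math


def gaussian_nll(y: float, mean: float, log_var: float) -> float:
    """Negative log-likelihood of y under N(mean, exp(log_var)).

    The model would predict both `mean` and `log_var` per (cell line, drug) pair,
    so the variance can differ from one prediction to the next.
    """
    return 0.5 * (log_var + (y - mean) ** 2 / math.exp(log_var) + math.log(2.0 * math.pi))


# A large residual is penalized less when the model also reports a large variance.
confident_but_wrong = gaussian_nll(y=3.0, mean=0.0, log_var=0.0)
uncertain_and_wrong = gaussian_nll(y=3.0, mean=0.0, log_var=math.log(9.0))
print(confident_but_wrong > uncertain_and_wrong)
```

The `log_var` term stops the model from reporting a huge variance everywhere, while the scaled squared error rewards it for raising the variance where its errors are large.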

Per my experiments this is chocolate-worthy, but I will rerun it with the leaderboard scripts.

I will remove the CSVs etc.; I just wanted to back them up here.

Graph-convolutional matrix factorization for drug-response prediction. Predicts the cell-line x drug response matrix as R = U V^T (as in ordinary matrix factorization), but the latent factors are learned end-to-end by graph convolutions over feature-similarity or prior-knowledge graphs (k-NN gene-expression graph for cell lines, Morgan-fingerprint Tanimoto graph for drugs). Each convolution smooths a node's embedding over its graph neighbours; the smoothed factors are read out by the dot product, plus per-cell/per-drug biases and an optional MLP head.

Four selectable models, all registered in MULTI_DRUG_MODEL_FACTORY:

* GCMF - base single-graph model.
* RGCMF - relational/multi-graph variant (multi-omics cell graphs; pathway and bioassay drug relations) fused by a relational graph convolution. Works best on leave-cell-out in our experiments.
* PGCMF / PRGCMF - probabilistic variants with a heteroscedastic Gaussian-NLL head emitting calibrated per-prediction aleatoric uncertainty.

Includes hyperparameters.yaml for all four, bundled gzipped drug-relation resources for RGCMF/PRGCMF, smoke tests (train/predict/save-load round-trip), docs (model page + toctree + usage table), and a check-added-large-files exclude for the bundled resources.

Lint/type clean: flake8 0, mypy 0, black/isort clean, pre-commit green.
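
As an illustration of the drug-side graph described above, the sketch below computes Tanimoto similarity between fingerprints and keeps only each node's top-k neighbours. It is not taken from this PR: fingerprints are represented as plain sets of on-bit indices rather than real Morgan fingerprints, the k-NN sparsification of the drug graph is an assumption, and `tanimoto` and `knn_graph` are hypothetical helper names.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) similarity between two sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def knn_graph(items: list, k: int, sim) -> list:
    """Dense, symmetric adjacency matrix keeping each node's k most similar neighbours."""
    n = len(items)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        scores = sorted(
            ((sim(items[i], items[j]), j) for j in range(n) if j != i),
            reverse=True,
        )
        for s, j in scores[:k]:
            adj[i][j] = s
            adj[j][i] = s  # symmetrize: an edge is kept if either endpoint selects it
    return adj


# Toy fingerprints (sets of on-bits); real ones would come from a cheminformatics toolkit.
fingerprints = [{1, 2, 3}, {2, 3, 4}, {10, 11}, {10, 11, 12}]
adj = knn_graph(fingerprints, k=1, sim=tanimoto)
print(adj[0][1], adj[0][2])
```

The same `knn_graph` helper would apply to cell lines by swapping in a similarity over gene-expression profiles (for example a correlation), which is again an assumption about how such a graph could be built.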

PascalIversen marked this pull request as draft June 23, 2026 14:09

PascalIversen force-pushed the feat/gcmf-geometric-mf branch from 60e0898 to 9ec282e Compare June 23, 2026 14:11

PascalIversen wants to merge 1 commit into development from feat/gcmf-geometric-mf
