
fix(target): remove deprecated Project Score cross-references#138

Merged
d0choa merged 1 commit into main from fix/target-remove-project-score-xrefs on Jun 8, 2026

@d0choa d0choa commented Jun 1, 2026

Contributor

Summary

Project Score has been deprecated upstream. This drops the xref pipeline that emitted {id: SIDG..., source: 'ProjectScore'} entries into the target dbXrefs field.

  • Remove _build_project_scores (and its two project-scores/ parquet inputs, the join into hgnc_ensembl_go, and the xRef column merged into dbXrefs) from src/pts/pyspark/target.py
  • Drop the 3 project-scores pre-processing tasks and the 2 project_scores_* source entries on the target pyspark step in config.yaml

Out of scope (intentionally unaffected):

  • src/pts/pyspark/project_score.py — Project Score v2 CRISPR evidence parser (separate dataset, still maintained)
  • target_gene_essentiality step — backed by DepMap, not Project Score

Follow-up PRs needed:

  • pis — drop the two cog.sanger.ac.uk/cmp/download/ fetches under input/target/project-scores/
  • orchestration — mirror both changes in dags/config/{pis,pts}.yaml and remove the project-scores/ entries from scripts/validate_pts_step.py
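
One way to confirm that each follow-up has landed is to scan a checkout for lingering references. The sketch below is an illustration only, not part of this PR: the function name `find_leftover_references` and the chosen file suffixes are assumptions. The pattern targets the plural `project-scores` / `project_scores` names listed above, so the singular `project_score.py` file name is not matched by it.

```python
import re
from pathlib import Path

# Plural forms only: "project-scores/" directories and "project_scores_*" keys.
# The singular "project_score" (the v2 CRISPR evidence parser) does not match.
LEFTOVER_PATTERN = re.compile(r"project[-_]scores")


def find_leftover_references(root, suffixes=(".py", ".yaml", ".yml")):
    """Return (path, line_number, stripped_line) for each matching line under root.

    Hypothetical helper: only files whose suffix is in `suffixes` are read.
    """
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if LEFTOVER_PATTERN.search(line):
                hits.append((str(path), number, line.strip()))
    return hits
```

Calling `find_leftover_references(".")` from the root of a pis, pts, or orchestration checkout should return an empty list once the cleanup is complete. Any hit still needs a manual look, since a match inside an out-of-scope file is not necessarily a leftover.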

Refs opentargets/issues#4367

Test plan

  • Spot-check the output target parquet to verify dbXrefs no longer contains source: ProjectScore entries
  • Confirm gene_essentiality output is unchanged
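
Both checks can be sketched in plain Python once the rows are available as dictionaries (for example via `Row.asDict(recursive=True)` in PySpark). This is a minimal sketch under stated assumptions, not the project's test code: the function names, the `id` field on target rows, and the sample rows are all hypothetical; only `dbXrefs`, `source`, and `ProjectScore` come from the description above.

```python
import json
from collections import Counter


def xref_source_counts(rows):
    """Count dbXrefs entries per source across all target rows."""
    counts = Counter()
    for row in rows:
        for xref in row.get("dbXrefs") or []:
            counts[xref["source"]] += 1
    return counts


def targets_with_project_score(rows):
    """Return the ids of rows that still carry a ProjectScore xref."""
    return [
        row["id"]  # assumed row identifier field
        for row in rows
        if any(x["source"] == "ProjectScore" for x in (row.get("dbXrefs") or []))
    ]


def same_records(before, after):
    """Order-insensitive comparison of two collections of JSON-serialisable rows."""
    def canonical(rows):
        return sorted(json.dumps(row, sort_keys=True) for row in rows)
    return canonical(before) == canonical(after)


# Hypothetical sample rows standing in for the target output before and after.
before = [
    {"id": "target_a", "dbXrefs": [
        {"id": "SIDG...", "source": "ProjectScore"},
        {"id": "xref_1", "source": "OtherSource"},
    ]},
    {"id": "target_b", "dbXrefs": None},
]
after = [
    {"id": "target_a", "dbXrefs": [{"id": "xref_1", "source": "OtherSource"}]},
    {"id": "target_b", "dbXrefs": None},
]

print(targets_with_project_score(before))
print(targets_with_project_score(after))
print(xref_source_counts(after))
```

For the first check, `targets_with_project_score` should return an empty list on the new target output. For the second, `same_records` applied to the gene_essentiality rows from a run before and after the change should return `True`, regardless of row order.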

@d0choa d0choa marked this pull request as ready for review June 1, 2026 16:34
@d0choa d0choa requested a review from DSuveges June 1, 2026 16:34
DSuveges pushed a commit to opentargets/orchestration that referenced this pull request Jun 5, 2026
Mirror the pis and pts cleanup: drop the Sanger Project Score fetch
tasks from pis.yaml and the matching pre-processing + source entries
from pts.yaml. The deprecated resource (cog.sanger.ac.uk/cmp/download/
gene_identifiers_latest.csv.gz + essentiality_matrices.zip) is no
longer consumed by the pts target step.

The Sanger Cell Model Passport download (model_list_latest.csv.gz,
used by the Project Score v2 CRISPR evidence parser) is untouched.

Companion PRs:
- pts: opentargets/pts#138
- pis: opentargets/pis#201

Refs opentargets/issues#4367

Project Score has been deprecated upstream. Drop the xref pipeline
that emitted {id: SIDG..., source: 'ProjectScore'} entries into the
target dbXrefs field:

- Remove _build_project_scores and its inputs/join from target.py
- Drop the project-scores pre-processing and source entries from
  config.yaml

The Project Score v2 CRISPR evidence parser (project_score.py) and
the DepMap-backed target_gene_essentiality step are unaffected.

Refs opentargets/issues#4367
@DSuveges DSuveges force-pushed the fix/target-remove-project-score-xrefs branch from 46cee68 to 4ec41da Compare June 5, 2026 13:38
@d0choa d0choa merged commit 92dc46a into main Jun 8, 2026
2 checks passed
@d0choa d0choa deleted the fix/target-remove-project-score-xrefs branch June 8, 2026 09:26

2 participants