Post RFC100 cleanup #1334
Addressing Rob's PR feedback from #1318:
e25fd2e to 89c1dcb
| """ | ||
| import_genie_dag.py | ||
| Imports Genie study to MySQL and ClickHouse databases using blue/green deployment strategy. | ||
| import_genie_clickhouse_dag.py |
Think we can just standardize to import_genie? No reference to MySQL or ClickHouse in general.
| data_nodes=("importer_ssh",), | ||
|
|
||
| tasks["data_repos"] >> tasks["verify_management_state"] | ||
|
|
Can we remove all these extra new lines?
| ), | ||
| db_properties_filename="manage_genie_database_update_tools.properties", | ||
| db_properties_filename="manage_genie_clickhouse_database_update_tools.properties", | ||
| # disabled on pipelines5 machine during testing phase |
will remove this comment
| """ | ||
| import_public_dag.py | ||
| Imports to Public cBioPortal MySQL and ClickHouse databases using blue/green deployment strategy. | ||
| import_public_clickhouse_dag.py |
Same comments. A little confused why the script name here differs from the file name.
why was this removed?
I believed this was dead code, can restore
why was this erased?
- Rename airflow-import-sql.sh to airflow-import-direct-to-clickhouse.sh
- Fix docstrings in genie/public DAGs (remove ClickHouse references)
- Collapse extra blank lines in genie DAG
- Remove stale 'disabled on pipelines5' comment
- Restore monitor-stalled-jobs.sh and test_if_impact_has_lost_allele_count.sh
sheridancbio left a comment
Looks good. We are dropping many of the "-clickhouse" strings from names, which were distinguishing the ClickHouse flavor of the functionality (since that is now the only way).
One larger concern:
Have we actually decided to permanently stop importing these (or should we maybe do it now with Airflow)?
- extract projects
- pdx data
- tempo data
- msk-mind
- spectrum
There are commented-out lines in the dmp wrapper script, which should just be deleted now, I guess?
…-pipelines into post-rfc100-cleanup
This can be reviewed properly once the initial rfc100 PR is merged