DeepASM

Last update: May 23, 2024.

Prerequisites

An account on Google Cloud Platform (GCP).
For each whole genome, run the CloudASM pipeline to output the CpG context files, flag CpGs with the nearest SNP when possible, and flag each sequencing read as REF or ALT (parental chromosome) when possible. Store the results in a BigQuery dataset.
Configure the GCP variables in config.yaml accordingly.
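
The variables set in config.yaml are later turned into bash environment variables by main.sh. As a rough illustration of that idea only, the sketch below converts flat `key: value` lines into `export` statements. The key names (`project_id`, `region`) and the helper `yaml_to_exports` are hypothetical, the real config.yaml may be nested or use different names, and main.sh may do this conversion differently. The parser is deliberately minimal: it treats everything after a `#` as a comment, so it would mishandle values that contain `#`.

```python
import shlex


def yaml_to_exports(text):
    """Turn flat `key: value` lines into bash `export` statements.

    Minimal sketch: handles only one-level `key: value` pairs, not full YAML.
    """
    exports = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or ":" not in line:
            continue
        key, value = line.split(":", 1)  # split on the first colon only
        value = value.strip().strip('"').strip("'")
        # shlex.quote makes the value safe to paste into a shell command
        exports.append(f"export {key.strip().upper()}={shlex.quote(value)}")
    return exports


# Hypothetical keys, for illustration only; the real config.yaml may differ.
example = """
project_id: my-gcp-project   # hypothetical
region: us-central1
"""

for statement in yaml_to_exports(example):
    print(statement)
```

The printed lines could be passed to `eval` in bash or written to a file that is then sourced.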

All the steps are in the executable script main.sh. The main steps are the following:

Create a Python-based and bash-based image in GCP's artifact repository using the docker folder in the repository.
Define environment variables in bash using config.yaml.
Prepare the reference genome in genomic regions of 250 nucleotides.
Evaluate ASM using CloudASM in the samples (here, ENCODE).
Machine Learning. See our pre-print for a detailed description of the steps.
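
To make the 250-nucleotide region step concrete, here is a minimal sketch of tiling a chromosome into fixed-size windows. It is not taken from the repository: the function name `tile_chromosome`, the toy chromosome name and length, the 0-based half-open coordinates, and the choice to keep a shorter final window are all assumptions made for illustration.

```python
def tile_chromosome(chrom, length, window=250):
    """Yield (chrom, start, end) windows covering a chromosome.

    Coordinates are 0-based and half-open (an assumption of this sketch).
    The last window is shorter when `length` is not a multiple of `window`.
    """
    for start in range(0, length, window):
        yield chrom, start, min(start + window, length)


# Toy chromosome name and length, for illustration only.
regions = list(tile_chromosome("chr_toy", 600))
for region in regions:
    print(region)
```

In practice the chromosome lengths would come from the reference genome's index rather than being hard-coded.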
