Genetics Ark is a Django-based web interface for hosting apps used by clinical scientists to manage and interpret sequencing data.
- GRCh37/38 reference files for primer designer (human reference genome & SNPs VCF)
- reference files for IGV.js (fasta, fai, cytoband, refseq)
- Docker & Docker Compose
- Primer Designer (deployed on Docker)
Genetics Ark allows primer input submission: <chromosome>:<position> <genome build>
Genetics Ark allows IGV searching for BAM or CNV samples (login required)
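The primer input format above can be illustrated with a small parser. This is a minimal sketch and not the code Primer Designer uses: the function name `parse_primer_input`, the `PrimerTarget` type, the optional `chr` prefix and the restriction to GRCh37/GRCh38 are assumptions made for illustration.

```python
import re
from typing import NamedTuple


class PrimerTarget(NamedTuple):
    """Hypothetical container for one parsed primer request."""
    chromosome: str
    position: int
    build: str


# Assumed input shape: "<chromosome>:<position> <genome build>"
_PATTERN = re.compile(r"^(?:chr)?(\w+):(\d+)\s+grch(37|38)$", re.IGNORECASE)


def parse_primer_input(line):
    """Split one input line into chromosome, position and genome build."""
    match = _PATTERN.match(line.strip())
    if match is None:
        raise ValueError(f"Expected '<chromosome>:<position> <genome build>', got: {line!r}")
    chromosome, position, build_number = match.groups()
    return PrimerTarget(chromosome.upper(), int(position), f"GRCh{build_number}")
```

For example, `parse_primer_input("8:21988118 grch37")` yields chromosome `"8"`, position `21988118` and build `"GRCh37"`, while a line without a genome build raises `ValueError`.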
Genetics Ark requires 2 local files containing environment variables:
- A small .env file, kept in the same directory as your docker-compose.yml file. This only contains paths to mounted volumes, plus the path to the main config.txt file, given by GA_CONFIG_PATH. By adjusting this, the user can change their main config path for the docker-compose.yml without having to edit the docker-compose.yml directly. See the example.env.
- A 'config.txt' file, which contains the majority of the environment variables. See example.config.txt for annotations.
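The relationship between the two files can be sketched in Python. This is an illustration only: it assumes both files use plain `KEY=VALUE` lines with `#` comments, and the helper names `load_env_file` and `load_settings` are hypothetical. In the real deployment, Docker Compose reads these files itself.

```python
from pathlib import Path


def load_env_file(path):
    """Parse KEY=VALUE lines, ignoring blank lines and '#' comments."""
    values = {}
    for raw_line in Path(path).read_text().splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value
        values[key.strip()] = value.strip().strip("\"'")
    return values


def load_settings(env_path=".env"):
    """Read the small .env file, then follow GA_CONFIG_PATH to the main config."""
    env = load_env_file(env_path)
    config = load_env_file(env["GA_CONFIG_PATH"])
    return env, config
```

Changing `GA_CONFIG_PATH` in `.env` is enough to point the second step at a different config file, which mirrors why the path is kept out of docker-compose.yml.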
In addition, you'll need to check that nginx/nginx.conf displays the correct ports for Genetics Ark. In the upstream ga{} section, ensure the port matches the one for genetics-ark-web.
- By default, find_dx_data.py runs every 15 minutes, and checks for new samples in DNAnexus which can be made available to IGV. A script which clears out a temporary directory runs every morning at 2am.
- On successful completion, both of the above cron jobs emit text log files plus Prometheus-formatted metric files. The metrics can be used with Prometheus to send alerts if a job does not run when expected.
- Edit the crontab file to tweak the cron schedule.
# start cron
0 2 * * * rm -rf /home/tmp/* && echo "`date +\%Y\%m\%d-\%H:\%M:\%S` tmp folder cleared" >> /home/log/ga-cron.log 2>&1 && echo "`date +\%Y\%m\%d-\%H:\%M:\%S` sample file updated" >> /home/log/ga-cron.log 2>&1 && /usr/local/bin/python -u /home/emit_prom_metric.py "ga_temp_deleted"
*/15 * * * * /usr/local/bin/python -u /home/find_dx_data.py >> /home/log/ga-cron.log 2>&1 && echo "`date +\%Y\%m\%d-\%H:\%M:\%S` sample file updated" >> /home/log/ga-cron.log 2>&1 && /usr/local/bin/python -u /home/emit_prom_metric.py "ga_cron_completed"
# end cron
All cron run logs are stored in the cron container at /home/log/ga-cron.log, the file the crontab entries above append to.
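The contents of emit_prom_metric.py are not shown here, so the following is only a sketch of how a job could write a Prometheus-formatted metric file after a successful run. It assumes the metric is a gauge holding the Unix time of the last success and that files are written with a `.prom` suffix into a directory that a Prometheus textfile collector reads. The function names and the help text are hypothetical.

```python
import os
import tempfile
import time


def format_metric(name, value, help_text):
    """Render one gauge in the Prometheus text exposition format."""
    return (
        f"# HELP {name} {help_text}\n"
        f"# TYPE {name} gauge\n"
        f"{name} {value}\n"
    )


def write_metric_file(directory, name, help_text="Unix time of last successful run"):
    """Write <name>.prom via a temporary file so readers never see a partial file."""
    body = format_metric(name, time.time(), help_text)
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        handle.write(body)
    final_path = os.path.join(directory, f"{name}.prom")
    os.replace(tmp_path, final_path)  # atomic rename within the same directory
    return final_path
```

With a timestamp gauge like this, an alert rule can fire when the value is older than the expected schedule, for example more than 15 minutes for `ga_cron_completed`.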
Ensure the following environment variables are correct:
- change logging location in ga_core/settings.py
- change database setting to localhost database
- change redis setting to localhost redis
You must also run a server, with python manage.py runserver
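For local development, the database and Redis changes listed above might look like the following fragment of ga_core/settings.py. This is a sketch under assumptions: the environment variable names `DB_USERNAME` and `DB_PASSWORD`, the cluster name, and the default MySQL and Redis ports are illustrative and may differ from the project's actual settings. The database name `genetics` is the one created in the Docker instructions.

```python
import os

# MySQL running on the local machine instead of the Docker "db" container
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "genetics",
        "USER": os.environ.get("DB_USERNAME", ""),      # hypothetical variable name
        "PASSWORD": os.environ.get("DB_PASSWORD", ""),  # hypothetical variable name
        "HOST": "localhost",
        "PORT": "3306",  # MySQL default port, assumed
    }
}

# Django-Q broker pointed at a local Redis instance
Q_CLUSTER = {
    "name": "genetics_ark",  # hypothetical cluster name
    "redis": {"host": "localhost", "port": 6379, "db": 0},
}
```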
To run in production mode, ensure GENETIC_DEBUG is not set in the config file.
docker compose build
docker compose up db -d # start db first and create a database named genetics
docker compose up web -d # start web container and run python manage.py migrate
docker compose up
This will spin up 6 containers: web, cron, nginx, database, redis, djangoq
- web: Main Django web interface
- cron: Cron schedule for updating IGV sample JSONs and removing generated primer design PDFs in /home/tmp
- nginx: Nginx server used to serve Django static files and reference files for igv.js, and to download the zip file generated by Primer Designer
- djangoq: Django-Q queue system for the primer design task
- redis: Redis as queue broker
- database: MySQL database
View the 'example.env' file for descriptions of the required environment variables, which should be stored in a '.env' file locally. .env files must not be version-controlled.
- Primer designer: App for designing new sequencing primers, utilises primer3 for designing primers and returns a .pdf report
- DNAnexus_to_igv: App to link samples stored in the DNAnexus cloud platform with Genetics Ark. On searching for a sample (BAM or CNV), if it is found within a 002 sequencing project within DNAnexus (for BAM) or in PROJECT_CNVS (for CNVs), download URLs are provided for the file and its index file to load within IGV installed on a PC. A link to stream the file directly to IGV.js is also provided. The cron container will periodically run find_dx_data.py to update the .json of samples.
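To illustrate how a file URL and its index URL can be handed to IGV.js, the sketch below builds a BAM track definition as a Python dictionary that could be serialised to JSON for the page. The function name `build_igv_track` and the example URLs are hypothetical, and this is not necessarily how DNAnexus_to_igv constructs its tracks. The keys follow the igv.js track configuration for alignment tracks.

```python
import json


def build_igv_track(sample_name, file_url, index_url):
    """Return an igv.js alignment track definition for one BAM sample."""
    return {
        "name": sample_name,
        "type": "alignment",
        "format": "bam",
        "url": file_url,        # URL of the BAM file
        "indexURL": index_url,  # URL of the matching .bai index
    }


# Placeholder URLs, for illustration only
track = build_igv_track(
    "example_sample",
    "https://example.com/example_sample.bam",
    "https://example.com/example_sample.bam.bai",
)
track_json = json.dumps(track)
```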