
Nirmal2310/NANOTAXI


NANOTAXI

Real-time 16S rRNA gene classification of long-read sequencing data.

Features

  • Faster Results from Real-time Workflow for Multiplexed Data
  • Easy to use GUI for Researchers with minimal coding background
  • Richer Insights from various Downstream analyses and publication-ready plots
  • Offline Analysis using different pipelines available for Nanopore Sequencing

Classification Tools and Database Compatibility in NANOTAXI

Tool Mode GTDB REFSEQ GSR MIMT EMU DB
Kraken2 Both
Minimap2 Both
EMU Both
MMSEQS Both

Legend

  • ✓: Tool supports this database
  • EMU DB: Proprietary database format for EMU.
  • Both: Supported in both real-time mode and offline mode (Post-run).

Installation

To run the app locally, please install R version >= 4.4.2. Also, please ensure that the MinKNOW app version is >= 24.06.10.
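As a rough pre-flight check, the R version requirement can be verified from the shell. This is a minimal sketch: the helper name version_ge is hypothetical, the comparison assumes GNU sort with the -V option, and the MinKNOW version is not checked here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when version $1 >= version $2 (relies on GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

required_r="4.4.2"

if command -v R >/dev/null 2>&1; then
    # The first line of "R --version" looks like "R version 4.4.2 (2024-10-31) ...".
    installed_r="$(R --version | head -n 1 | awk '{print $3}')"
    if version_ge "$installed_r" "$required_r"; then
        echo "R $installed_r is new enough"
    else
        echo "R $installed_r is older than $required_r"
    fi
else
    echo "R was not found on the PATH"
fi
```

If R is missing or too old, the Conda-based installation described next provides a suitable version.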

You can install R and all the required packages using a single script. If Conda is not already present, the script installs Miniconda first, and then creates a new environment named nanotaxi-env.

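if command -v conda >/dev/null 2>&1; then
        # Conda is already on the PATH; nothing to install.
        echo "Conda exists"
else
        # Download Miniconda and install it into ./miniconda without prompts.
        wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh \
        && bash miniconda.sh -b -p miniconda

        rm miniconda.sh

        base_dir="$PWD"

        # Make conda available in the current shell.
        export PATH="$base_dir/miniconda/bin:$PATH"

        # Load conda's shell functions in future login shells.
        echo ". \"$base_dir/miniconda/etc/profile.d/conda.sh\"" >> ~/.profile

        # Register conda in ~/.bashrc, then reload it.
        conda init bash
        source ~/.bashrc
fi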

conda create -n nanotaxi-env --file Installation/nanotaxi-env.txt -y

If you installed R using the commands above, activate the Conda environment before running the app:

conda activate nanotaxi-env

Then launch the app by running the following command in an R session started from the repository directory:

shiny::runApp("main_app.R")

You can also run the app directly from GitHub using the following command:

# Make sure that the shiny package is installed in R.
shiny::runGitHub("NANOTAXI", "Nirmal2310")

The app will download and install the required Conda environments and databases necessary for real-time classification of long reads when it starts up for the first time. So, please ensure that enough free space is available on the system.
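The available disk space can be checked before the first launch. The sketch below is only illustrative: the helper name free_gib, the 50 GiB threshold, and the choice of $HOME as the location to inspect are all assumptions, since the actual space requirement and install location are not stated here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the free space (whole GiB) of the filesystem holding $1.
free_gib() {
    # -P gives one POSIX-format line per filesystem; -k reports 1024-byte blocks.
    df -Pk "$1" | awk 'NR == 2 { printf "%d\n", $4 / 1024 / 1024 }'
}

# Assumed threshold; adjust it to the databases you plan to download.
min_free_gib=50
available="$(free_gib "$HOME")"

if [ "$available" -lt "$min_free_gib" ]; then
    echo "Only ${available} GiB free in $HOME; consider freeing space first"
else
    echo "${available} GiB free in $HOME"
fi
```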

Running on a Remote Server

If you are running the application on a remote server and want to open the interface in your local browser, follow these steps:

  • Assign a port for running a local R Shiny server by using the following command:
echo "options(shiny.port=5316)" >> ~/.Rprofile

This example assigns port 5316 to R Shiny applications. Please ensure that the port is free before assigning it.

  • Connect to the Remote Server in a new window by using the following command:
ssh -L 8080:localhost:5316 username@ip_address

Here, 8080 is the port on your local machine and 5316 is the Shiny port on the server. Change the local port number, the username and the IP address accordingly.

  • Now launch the Shiny App through the server's terminal:
shiny::runApp("main_app.R", launch.browser=FALSE)
  • Finally, open NANOTAXI in your browser of choice by entering the following address:
http://127.0.0.1:8080
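For the first step above, one way to test whether the chosen port is already taken on the server is sketched below. It assumes a bash build with /dev/tcp support and only probes the IPv4 loopback address; the helper name port_in_use is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if something is already listening on 127.0.0.1:$1.
port_in_use() {
    # Opening the pseudo-device attempts a TCP connection; the subshell closes it again on exit.
    (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

port=5316
if port_in_use "$port"; then
    echo "Port $port is already in use; choose another one"
else
    echo "Port $port appears to be free"
fi
```

The same check can be run on the local machine for port 8080 before opening the SSH tunnel.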

While setting up the MinKNOW app for sequencing, please make sure that under the barcoding setting, both Trim barcodes and Barcode both ends are disabled as shown below:

Barcode Setup

This minimises the rate of unclassified reads during a live run. NANOTAXI uses Dorado to trim the barcodes when analysing the demultiplexed barcoded reads, which ensures that synthetic DNA sequences are removed.

Additionally, in the Output settings, please select Number of reads under the Based on option and enter 500 under the Reads threshold option, as the app will process the barcoded data in batches of 500 reads per barcode. The reference screenshot is shown below:

Output Setting
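To illustrate the batch arithmetic, the sketch below counts the reads in FASTQ files and reports how many complete batches of 500 they contain. It is not how NANOTAXI tracks batches internally; the helper name count_reads, the example path, and the read count are hypothetical, and each read is assumed to occupy exactly four lines.

```shell
#!/usr/bin/env bash
# Hypothetical helper: counts reads in FASTQ files (plain or gzipped), assuming 4 lines per read.
count_reads() {
    local total=0 lines f
    for f in "$@"; do
        case "$f" in
            *.gz) lines="$(gzip -dc "$f" | wc -l)" ;;
            *)    lines="$(wc -l < "$f")" ;;
        esac
        total=$(( total + lines / 4 ))
    done
    echo "$total"
}

batch_size=500
# With real data (path is an assumption): reads="$(count_reads /path/to/barcode01/*.fastq.gz)"
reads=1234   # illustrative number, not real data
echo "$reads reads: $(( reads / batch_size )) full batches, $(( reads % batch_size )) reads pending"
```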


Lastly, MinKNOW Core 5.x requires the MinKNOW API to connect over a secure channel. To enable this, run the following commands, which append the required line to ~/.bashrc and reload it:

echo "export MINKNOW_TRUSTED_CA=\"/var/lib/minknow/data/rpc-certs/minknow/ca.crt\"" >> ~/.bashrc
source ~/.bashrc
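Running the echo command more than once adds duplicate lines to ~/.bashrc. A slightly more defensive variant is sketched below: it first checks that the certificate is readable and then appends the export line only if it is not already present. The helper name append_once is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: appends line $1 to file $2 only if that exact line is absent.
append_once() {
    local line="$1" file="$2"
    touch "$file"
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

ca_cert="/var/lib/minknow/data/rpc-certs/minknow/ca.crt"

if [ -r "$ca_cert" ]; then
    append_once "export MINKNOW_TRUSTED_CA=\"$ca_cert\"" "$HOME/.bashrc"
else
    echo "CA certificate not found at $ca_cert; check the MinKNOW installation"
fi
```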

To set up Offline Analysis, tick the checkbox next to the setup option in Offline Analysis under the INPUT tab. The app will first download and install all the required software and databases, and then analyse the data.
Note that all pipelines are currently available in both real-time and offline modes, and all required software and databases are downloaded and installed during the initial launch of the application.

Offline Setup

Usage/Examples

The Test Dataset is taken from the Bioproject ID PRJEB82315.

We used Emu to analyse the test dataset, which comprises 20 samples (one per barcode) classified into three groups based on sample type.

Barcode     Sample        Group
barcode01   ERR13935186   Crop Digesta
barcode02   ERR13935187   Crop Digesta
barcode03   ERR13935188   Crop Digesta
barcode04   ERR13935189   Crop Digesta
barcode05   ERR13935191   Crop Digesta
barcode06   ERR13935192   Zymobiomics
barcode07   ERR13935193   Zymobiomics
barcode08   ERR13935195   Zymobiomics
barcode09   ERR13935223   Zymobiomics
barcode10   ERR13935225   Zymobiomics
barcode11   ERR13935170   Feces
barcode12   ERR13935171   Feces
barcode13   ERR13935172   Feces
barcode14   ERR13935174   Feces
barcode15   ERR13935176   Feces
barcode16   ERR13935177   Feces
barcode17   ERR13935178   Feces
barcode18   ERR13935181   Feces
barcode19   ERR13935182   Feces
barcode20   ERR13935184   Feces

To run the example dataset, select Example Data, enter the control group name in Select Control Group, and click Use Example Data under the INPUT tab.

Example Run

Demo

Please see the demo of Real-time classification by using the following link

Documentation

For detailed information about NANOTAXI, please refer to the Documentation.

Roadmap

  • Add Differential Abundance Analysis.
  • Add support for EMUDB, GSR DB, MIMt DB, REFSEQ and GTDB.
  • Add support for PICRUSt2 for functional inference.
  • Add support for SQK-MAB114.24 (16S + ITS).

Authors

Feedback/Help

If you have any feedback/issues, please report the issue via GitHub.

Acknowledgements

Pipelines/Software Used in the App:

Databases Used in the App:

Python Packages Used in the App:

R Packages Used in the App:

About

An end-to-end GUI for real-time taxon identification from barcoded nanopore amplicon sequencing data. The pipeline is built using the R Shiny library and mainly targets users with minimal coding background. Users can get from sequencing to results in a few clicks.
