This repository lists handy online tutorials for daily bioinformatics work. The content is organized into five major sections: Linux, R, Python, Framework, and Theory.
Bioinformatics Data Skills: A really useful introduction to basic bioinformatics skills for beginners. Its PDF is in the Linux folder.
GNU Parallel: A shell tool for executing jobs in parallel on one or more computers. Its PDF is in the Linux folder.
GNU Screen: a full-screen window manager that multiplexes a physical terminal between several processes, typically interactive shells. A shortcut cheat sheet can be found here.
R for Data Science: If you use R for data analysis, this is the first book you should read. If you prefer a Chinese tutorial, another great book can be found here. Once you feel confident in your R skills, you can try solving the issues listed here.
ggplot2: Elegant Graphics for Data Analysis: An introduction to the grammar of graphics underlying ggplot2.
Basic use of Python: Possibly the best Chinese Python tutorial for beginners.
Machine learning based on sklearn: How to use the Python sklearn package for machine learning. Its PDFs are in the Machine_learning folder.
PyTorch-based deep learning: A step-by-step tutorial for beginners on deep learning with the PyTorch toolbox.
Transformers: A step-by-step tutorial for using transformers package.
Practice AI: Best practices for learning AI from a product manager's perspective.
Set up Claude Code: How to set up Claude Code on Linux.
Set up Claude Code (V2): The same setup with another provider.
WDL: The Workflow Description Language (WDL) is a way to specify data processing workflows with a human-readable and writeable syntax. Its user guide can be found here. WDL relies on an execution engine called Cromwell, whose JAR file can be downloaded here.
Build GPT step by step: A tutorial on how to implement the GPT architecture.
Harvard Stat 115: Gives beginners an overview of bioinformatics and computational biology and explains the classic algorithms well.
Linear algebra: basic linear algebra knowledge required for understanding deep learning models.
Probability: Deep learning is probabilistic at its core, so learn as much probability as you can.
Statistics: An introduction to mathematical statistics.
Great suggestions for data visualization: Tips for avoiding bad choices when visualizing data.
Single cell tutorial: Great single-cell RNA-seq data analysis workshop.
Bulk RNA-seq tutorial: Great bulk RNA-seq data analysis workshop.
Single-cell best practices: A useful, frequently updated book on best practices for single-cell RNA-seq analysis.
Raincloud plot: A hybrid plot that combines a half violin plot, a box plot, and scattered raw data, showing the raw data, its distribution, and key summary statistics at the same time.
Mediation analysis: A tutorial for mediation analysis using R software.
Treemap plot: How to make a treemap plot in Python using plotly.
Dendrogram: How to optimize a basic dendrogram in R.
Gradient color palette: A nice gradient color palette.
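
For readers who want to see what a gradient palette amounts to in practice, here is a minimal sketch in plain Python. It assumes the palette is built by linear interpolation in RGB space between two hex colors, which may differ from how the linked resource builds its palettes. The function names (`hex_to_rgb`, `rgb_to_hex`, `gradient_palette`) are hypothetical helpers written for this illustration and do not come from any package.

```python
def hex_to_rgb(color):
    """Convert a '#rrggbb' string to an (r, g, b) tuple of ints."""
    color = color.lstrip("#")
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))


def rgb_to_hex(rgb):
    """Convert an (r, g, b) tuple of ints to a '#rrggbb' string."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


def gradient_palette(start, end, n):
    """Return n hex colors evenly spaced between start and end.

    Assumption: plain linear interpolation per RGB channel.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    start_rgb = hex_to_rgb(start)
    end_rgb = hex_to_rgb(end)
    palette = []
    for step in range(n):
        t = step / (n - 1)  # 0.0 at the first color, 1.0 at the last
        rgb = tuple(round(a + (b - a) * t) for a, b in zip(start_rgb, end_rgb))
        palette.append(rgb_to_hex(rgb))
    return palette


# Example: five steps from black to white.
palette = gradient_palette("#000000", "#ffffff", 5)
print(palette)
```

The resulting list of hex strings can be passed to most plotting libraries that accept a sequence of colors. Interpolating in RGB is the simplest choice; perceptually uniform color spaces usually give smoother-looking gradients but need extra conversion code.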