Weighted Gene Co-expression Network Analysis (WGCNA)

WGCNA is a systems biology method for describing correlation patterns among genes across multiple samples. Unlike traditional differential expression analysis, which tests each gene in isolation, WGCNA treats the transcriptome as a coordinated network.
How it Works:
- Network Construction: Instead of unweighted "all-or-nothing" connections produced by a hard cutoff, WGCNA raises each gene-to-gene correlation to a "soft-threshold" power, which preserves the continuous nature of the correlations while down-weighting weak ones.
- Module Detection: Genes with similar expression profiles are clustered into modules (often represented by colors). In plant science, these modules frequently represent specific biological pathways or co-regulated gene families.
- Trait Association: We correlate these modules with phenotypic traits (e.g., drought tolerance, yield, metabolite levels) to identify which gene clusters are associated with specific biological outcomes.
- Hub Gene Identification: Within each module, we identify "hub genes", the most highly connected nodes, which are candidate key regulators of the biological response.
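The network construction and hub-ranking ideas above can be sketched in plain Python. This is a minimal illustration, not the pipeline's implementation: it assumes Pearson correlation, an unsigned network, and an illustrative soft-threshold power of 6 (in practice the power is chosen from a scale-free topology fit). The function names and the toy expression values are hypothetical, and the topological overlap and clustering steps that WGCNA uses for module detection are omitted.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def soft_threshold_adjacency(expr, beta=6):
    """Unsigned weighted adjacency: a_ij = |cor(x_i, x_j)| ** beta."""
    genes = list(expr)
    adj = {g: {} for g in genes}
    for i, gi in enumerate(genes):
        for gj in genes[i + 1:]:
            weight = abs(pearson(expr[gi], expr[gj])) ** beta
            adj[gi][gj] = weight  # store both directions so the
            adj[gj][gi] = weight  # adjacency stays symmetric
    return adj

def connectivity(adj):
    """Sum of edge weights per gene; the largest sums mark hub candidates."""
    return {g: sum(neighbors.values()) for g, neighbors in adj.items()}

# Toy data (hypothetical): gene -> expression across five samples
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.9],
    "geneC": [5.0, 4.1, 2.9, 2.2, 1.0],
    "geneD": [3.0, 1.0, 4.0, 1.0, 5.0],
}

adj = soft_threshold_adjacency(expr, beta=6)
k = connectivity(adj)
ranked = sorted(k, key=k.get, reverse=True)  # most connected gene first
print(ranked)
```

Raising the absolute correlation to a power keeps every edge in the network but shrinks weak correlations toward zero, so a gene that is only loosely correlated with the others (geneD in the toy data) ends up with low connectivity rather than being cut off by an arbitrary threshold.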
Required inputs:
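- Gene expression table of raw counts, TPM, or FPKM (CSV; rows are genes, columns are samples, and the first column holds the gene IDs). This can be generated by RNA-seq_to_TPM_Bowtie2 or RNA-seq_to_TPM_STAR in this GitHub repository.
- Traits or experimental design file (CSV; rows are samples, columns are traits). The pipeline can run without it, but it is needed for module-trait association. See the example format CSV files provided in this repository.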
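Before running the pipeline, it can help to confirm that the two input files describe the same samples. The sketch below is one possible check, not part of the pipeline: it assumes the expression CSV has a header row of sample IDs after the gene ID column, and that the first column of the traits CSV holds the sample IDs (an assumption about the traits layout; check the example files for the exact format). The function names, file names, and demo values are hypothetical.

```python
import csv
import os
import tempfile

def read_expression_samples(path):
    """Sample IDs are the header cells after the first (gene ID) column."""
    with open(path, newline="") as handle:
        header = next(csv.reader(handle))
    return header[1:]

def read_trait_samples(path):
    """Sample IDs are the first cell of every data row (assumed layout)."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        next(reader)  # skip the header row of trait names
        return [row[0] for row in reader if row]

def compare_samples(expr_path, traits_path):
    """Return (samples only in expression, samples only in traits)."""
    expr_ids = set(read_expression_samples(expr_path))
    trait_ids = set(read_trait_samples(traits_path))
    return sorted(expr_ids - trait_ids), sorted(trait_ids - expr_ids)

# Demo with two tiny hypothetical files written to a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    expr_path = os.path.join(tmp, "expression.csv")
    traits_path = os.path.join(tmp, "phenotypes.csv")
    with open(expr_path, "w", newline="") as handle:
        csv.writer(handle).writerows([
            ["gene_id", "S1", "S2", "S3"],
            ["geneA", 10, 12, 9],
        ])
    with open(traits_path, "w", newline="") as handle:
        csv.writer(handle).writerows([
            ["sample", "drought_score"],
            ["S1", 0.2],
            ["S2", 0.8],
            ["S4", 0.5],
        ])
    only_expr, only_traits = compare_samples(expr_path, traits_path)

print("In expression only:", only_expr)
print("In traits only:", only_traits)
```

Two empty lists mean every sample appears in both files. Anything else points to a typo or a missing sample that is worth fixing before the trait association step.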
Post analysis:
- Review the results in the HTML report (WGCNA_Report.html).
- Visualize the gene networks in Cytoscape or Gephi using the exported network files.
Installation and usage:
- Install EG_tools (skip this step if it is already installed):
    wget https://github.com/euchrogene/EG_tools/raw/refs/heads/main/EG_tools
    chmod 755 EG_tools
    sudo mv EG_tools /usr/bin
- Install the software:
    sudo EG_tools install -r https://github.com/euchrogene/WGCNA.git -d WGCNA -e WGCNA_v.1.0 -m "Gene network analysis pipeline using WGCNA."
- Display installed software:
    EG_tools
- Download traits format examples:
    wget https://github.com/euchrogene/WGCNA/raw/refs/heads/main/Time_series_format_one_col.csv
    wget https://github.com/euchrogene/WGCNA/raw/refs/heads/main/Time_series_format_separate_col.csv
    wget https://github.com/euchrogene/WGCNA/raw/refs/heads/main/Traits_format.csv
- Show help contents:
    WGCNA_v.1.0
    ============================================================================
      EuchroGene WGCNA Pipeline v.1.0 - Publication-Grade Network Analysis
    ============================================================================

    DESCRIPTION:
      Automated weighted gene co-expression network analysis for systems biology
      research. Generates publication-ready figures, statistical analysis, and
      comprehensive interpretation reports.

    USAGE:
      wgcna_wrapper.py <MODE> [OPTIONS]

    MODES:
      all       Complete automated pipeline (RECOMMENDED)
                Runs: Construction → Hubs → Traits → Export → Reports

                Required:
                  -i <FILE>       Input expression matrix (CSV)
                                  Format: Rows=Genes, Cols=Samples
                                  First column = Gene IDs

                Optional:
                  -o <DIR>        Output directory (default: wgcna_results)
                  -type <TYPE>    Data type: counts|tpm|fpkm (default: counts)
                  -traits <FILE>  Trait/phenotype data (CSV)
                                  Format: Rows=Samples, Cols=Traits
                  -p <N>          CPU threads (default: 30)
                  -b <N>          Block size for large datasets (default: 30000)
                  -n <N|all>      Modules to export (default: 20)
                                  Use 'all' to export every module
                  -zip            Create zip archive of results (default: enabled)

      run       Step 1: Network construction only
      traits    Step 2: Module-trait correlation analysis
      hubs      Step 3: Hub gene identification
      export    Step 4: Export networks for Cytoscape/Gephi

    EXAMPLES:
      # Full automated analysis with trait data
      wgcna_wrapper.py all -i expression.csv -traits phenotypes.csv

      # Quick analysis without traits
      wgcna_wrapper.py all -i expression.csv -o my_results

      # High-memory server optimization
      wgcna_wrapper.py all -i expression.csv -p 64 -b 50000

    OUTPUT FILES:
      Directory structure:
        wgcna_results/
        ├── WGCNA_Report.html              # Main interpretation guide
        ├── Methods_Publication.txt        # Ready-to-use methods section
        ├── wgcna_sft_plot.pdf             # Scale-free topology
        ├── wgcna_dendrogram.pdf           # Gene clustering tree
        ├── wgcna_modules.csv              # Gene-module assignments
        ├── wgcna_trait_heatmap.pdf        # Module-trait correlations
        ├── wgcna_top10_hubs.csv           # Top hub genes per module
        ├── [module]_cytoscape_edges.txt   # Cytoscape import files
        └── [module]_network_plot.pdf      # Network visualizations

      Compressed archive:
        wgcna_results.zip                  # Complete results package

    SUPPORT:
      Bugs/Questions: bioinformatics@euchrogene.com
    ============================================================================
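When the pipeline is launched from a script (for example, once per dataset), the options in the help text can be assembled programmatically. The sketch below is a hypothetical helper, not part of the pipeline. It uses only the flags listed in the help output and assumes that the installed WGCNA_v.1.0 command accepts the same arguments as wgcna_wrapper.py; the function name and default values are illustrative.

```python
import shlex

VALID_TYPES = {"counts", "tpm", "fpkm"}  # values listed for -type in the help text

def build_wgcna_command(expression_csv, traits_csv=None, out_dir=None,
                        data_type=None, threads=None,
                        executable="WGCNA_v.1.0"):
    """Build the argument list for a full ('all' mode) pipeline run."""
    if data_type is not None and data_type not in VALID_TYPES:
        raise ValueError(f"data_type must be one of {sorted(VALID_TYPES)}")
    cmd = [executable, "all", "-i", expression_csv]
    if traits_csv is not None:
        cmd += ["-traits", traits_csv]
    if out_dir is not None:
        cmd += ["-o", out_dir]
    if data_type is not None:
        cmd += ["-type", data_type]
    if threads is not None:
        cmd += ["-p", str(threads)]
    return cmd

cmd = build_wgcna_command(
    "expression.csv",
    traits_csv="phenotypes.csv",
    data_type="tpm",
    threads=8,
)
print(shlex.join(cmd))  # shell-quoted form, handy for logging
# To actually launch the run: subprocess.run(cmd, check=True)
```

Passing the command as a list avoids shell quoting problems with file names that contain spaces. Options left as None are simply omitted, so the pipeline falls back to its documented defaults.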
- Uninstall v.1.0:
    sudo EG_tools uninstall -t WGCNA_v.1.0 -i managene7/wgcna_package:v.2.0