Verifies whether PLINK2 uses dosage information or hard calls when scoring with pgen files.
# 1. Setup environment
./setup_environment.sh
# 2. Activate environment
mamba activate plink2_score_dosage_verification
# 3. Run test
./test_plink2_dosage.sh
# 4. View results
firefox plink2_dosage_test/html_report/unified_report.html

Creates synthetic genetic data with known dosage values, runs PLINK2 scoring, and determines whether PLINK2 uses:
- Dosage information (continuous 0-2 values)
- Hard calls (discrete 0,1,2 genotypes)
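To make the distinction concrete, here is a minimal sketch of how the two interpretations diverge for a single sample. Everything in it is an assumption for illustration: the function name `score_sample`, the three dosage/weight pairs, the use of plain rounding for hard calls, and the score as an unscaled weighted sum. PLINK2's own hard-call conversion is governed by `--hard-call-threshold` and can set calls to missing, which plain rounding does not model, and its score columns may be scaled differently.

```shell
#!/usr/bin/env bash
# Toy comparison of a dosage-based score and a hard-call-based score.
# Assumptions (not taken from the test suite): one sample, three variants,
# hard calls obtained by plain rounding, score = sum(weight * genotype).

# score_sample MODE "dosage:weight" ...   (MODE is "dosage" or "hardcall")
score_sample() {
  local mode="$1"; shift
  printf '%s\n' "$@" | awk -F: -v mode="$mode" '
    {
      g = $1
      if (mode == "hardcall") g = int($1 + 0.5)   # round to 0, 1 or 2
      total += g * $2
    }
    END { printf "%.3f\n", total }'
}

pairs=("0.49:1.0" "1.51:0.5" "2.00:0.2")
echo "dosage score:    $(score_sample dosage "${pairs[@]}")"
echo "hard-call score: $(score_sample hardcall "${pairs[@]}")"
```

With these made-up inputs the two scores differ, which is the kind of gap the test relies on to tell the two behaviors apart.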
Key Discovery: PLINK2 uses dosages when VCF files are imported with the dosage=DS flag.
With proper dosage import (plink2 --vcf file.vcf dosage=DS --make-pgen):
- ✅ Perfect correlation (r = 1.000) with dosage-based expected scores
- ✅ PGEN and BED formats give different results (as expected, because BED stores only hard calls)
- ✅ All test conditions confirm dosage usage
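The correlation check can be sketched roughly as follows. This is not the project's analysis code (that lives in the R scripts); `pearson` is a hypothetical helper, and the score values are invented for illustration. The idea is to correlate observed scores against both a dosage-based and a hard-call-based expectation and see which one matches.

```shell
#!/usr/bin/env bash
# pearson: reads two whitespace-separated numeric columns from stdin and
# prints the Pearson correlation to three decimals ("NA" if undefined).
pearson() {
  awk '
    NF >= 2 { n++; sx += $1; sy += $2; sxx += $1 * $1; syy += $2 * $2; sxy += $1 * $2 }
    END {
      den = (n * sxx - sx * sx) * (n * syy - sy * sy)
      if (n < 2 || den <= 0) { print "NA"; exit }
      printf "%.3f\n", (n * sxy - sx * sy) / sqrt(den)
    }'
}

# Invented toy data: observed score, dosage-based expectation, hard-call expectation.
scores='0.4 0.4 0
1.2 1.2 1
1.6 1.6 2
0.9 0.9 1'

echo "r vs dosage expectation:    $(printf '%s\n' "$scores" | awk '{print $1, $2}' | pearson)"
echo "r vs hard-call expectation: $(printf '%s\n' "$scores" | awk '{print $1, $3}' | pearson)"
```

In this toy data the observed scores equal the dosage expectation, so the first correlation is exactly 1 while the second falls below it.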
- mamba (recommended) or conda
- Internet connection for package installation
./test_plink2_dosage.sh [OPTIONS]
# Common usage
./test_plink2_dosage.sh # Standard test
./test_plink2_dosage.sh --all # All test conditions
./test_plink2_dosage.sh --no-plots # Faster execution
./test_plink2_dosage.sh -s 200 -v 100 --all # Custom size + all tests

- Comprehensive report: plink2_dosage_test/html_report/unified_report.html
- Analysis summary: plink2_dosage_test/comprehensive_analysis.txt
- Individual test reports: plink2_dosage_test/*/html_report/
- Standard: Beta(2,2) distribution dosages
- Edge cases: Dosages near rounding boundaries (0.49, 0.51, 1.49, 1.51)
- Alternative scoring: Different PLINK2 command variations
- Large scale: 5000 samples, 500 variants (optional)
- Distributions: Uniform, bimodal, normal distributions (optional)
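As a rough illustration of the standard condition, the sketch below writes a tiny VCF whose DS values are drawn from Beta(2,2). It is not the project's generator. The function name `make_vcf`, the sample and variant IDs, the scaling of Beta(2,2) draws to the 0-2 range, and the derivation of GT by rounding are all assumptions. The sampler relies on the fact that the median of three independent uniform draws follows a Beta(2,2) distribution.

```shell
#!/usr/bin/env bash
# make_vcf N_SAMPLES N_VARIANTS SEED : prints a minimal VCF with GT:DS fields.
# Assumptions: dosage = 2 * Beta(2,2) draw; GT is the dosage rounded to 0/1/2.
make_vcf() {
  awk -v ns="$1" -v nv="$2" -v seed="$3" '
    function beta22(   a, b, c, t) {   # median of three uniforms ~ Beta(2,2)
      a = rand(); b = rand(); c = rand()
      if (a > b) { t = a; a = b; b = t }
      if (b > c) { t = b; b = c; c = t }
      if (a > b) { t = a; a = b; b = t }
      return b
    }
    BEGIN {
      srand(seed)
      print "##fileformat=VCFv4.2"
      print "##contig=<ID=1>"
      print "##FORMAT=<ID=GT,Number=1,Type=String,Description=\"Genotype\">"
      print "##FORMAT=<ID=DS,Number=1,Type=Float,Description=\"ALT allele dosage\">"
      printf "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT"
      for (s = 1; s <= ns; s++) printf "\tS%d", s
      printf "\n"
      for (v = 1; v <= nv; v++) {
        printf "1\t%d\tsnp%d\tA\tG\t.\tPASS\t.\tGT:DS", v * 100, v
        for (s = 1; s <= ns; s++) {
          d = 2 * beta22()
          hc = int(d + 0.5)
          if (hc == 0) gt = "0/0"; else if (hc == 1) gt = "0/1"; else gt = "1/1"
          printf "\t%s:%.3f", gt, d
        }
        printf "\n"
      }
    }'
}

make_vcf 2 3 42
```

Edge-case conditions could reuse the same layout with fixed dosages such as 0.49 or 1.51 in place of the random draw.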
PLINK2 requires explicit dosage import: Without the dosage=DS flag, PLINK2 defaults to hard call behavior even when dosage information is available in the VCF file.
Correct commands:
- Import: plink2 --vcf file.vcf dosage=DS --make-pgen
- Export: plink2 --pfile data --export vcf vcf-dosage=DS
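Since a missing `dosage=DS` silently falls back to hard calls, it can help to confirm that the VCF actually declares a DS field before importing. The sketch below is one way to do that. `vcf_has_ds` is a hypothetical helper, the `file.vcf`, `data`, and `data_export` names are placeholders, and the `--out` prefixes are an assumption added so that the export step can find the imported files.

```shell
#!/usr/bin/env bash
# vcf_has_ds FILE : succeeds if the (uncompressed) VCF header declares a DS FORMAT field.
vcf_has_ds() {
  grep -q '^##FORMAT=<ID=DS,' "$1"
}

vcf="file.vcf"   # placeholder input path

# Run only when the input exists and plink2 is on PATH, so the sketch is safe to paste.
if [ -f "$vcf" ] && command -v plink2 >/dev/null 2>&1; then
  if vcf_has_ds "$vcf"; then
    # Import with dosages, then export them again for inspection.
    plink2 --vcf "$vcf" dosage=DS --make-pgen --out data
    plink2 --pfile data --export vcf vcf-dosage=DS --out data_export
  else
    echo "No DS field declared in $vcf; dosage=DS would have nothing to import." >&2
  fi
fi
```

Comparing the DS values in the exported VCF against the originals is a quick way to check that dosages survived the round trip.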
- details.md - Detailed methodology, troubleshooting, technical specifications
- Test reports - Generated HTML reports with comprehensive analysis
- Source code - Well-documented R scripts in the bin/ directory