⚠️ IMPORTANT MEDICAL DISCLAIMER: This is a research and educational tool only. NOT for medical decision-making or health diagnosis.
- Consult healthcare professionals for all health decisions
- Biological age calculations are population-based estimates from research studies
- Not validated for individual health assessment or treatment planning
- Results should not influence medical care without professional consultation
A database system for tracking biomarkers and calculating biological age using scientifically validated models. It is built on real NHANES health survey data and implements peer-reviewed algorithms for longevity research.
- Track biomarkers over time - Store and analyze 9 key health biomarkers
- Calculate biological age - Using validated Phenotypic Age and Homeostatic Dysregulation models
- Compare to reference ranges - Clinical and longevity-optimized ranges
- Visualize trends - Interactive web interface for data exploration
- Research-grade data - Built on NHANES (CDC health survey) data
- Python 3.11+
- Docker & Docker Compose
- Git
- MySQL 8.0+
# 1. Clone and navigate
git clone https://github.com/yourusername/longevity-biomarker-tracker.git
cd longevity-biomarker-tracker
# 2. Environment setup
cp .env.example .env
make install # Creates virtual environment + installs dependencies
# 3. Start database
make db # Launches MySQL + Adminer web interface
# 4. Start the application (in separate terminals)
make run # API server → http://localhost:8000
make ui # Web interface → http://localhost:80
# 5. Optional: Load sample data
make etl # Downloads & processes NHANES data (takes ~10 min)
make test # Should pass all tests
Visit:
- Web Interface: http://localhost:80
- API Documentation: http://localhost:8000/docs
- Database Admin: http://localhost:8080 (login: biomarker_user / biomarker_pass)
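Once the API server is running, its endpoints can also be listed programmatically. The sketch below assumes the FastAPI app keeps FastAPI's default schema route (`/openapi.json`); `fetch_spec` and `list_endpoints` are hypothetical helper names and are not part of this project.

```python
import json
from urllib.request import urlopen

API_BASE = "http://localhost:8000"  # API address from the quick-start steps


def fetch_spec(base_url: str = API_BASE) -> dict:
    """Download the OpenAPI schema (assumes the default /openapi.json route)."""
    with urlopen(f"{base_url}/openapi.json", timeout=5) as response:
        return json.load(response)


def list_endpoints(spec: dict) -> list[tuple[str, str]]:
    """Return sorted (HTTP method, path) pairs from an OpenAPI schema dict."""
    http_methods = {"get", "post", "put", "patch", "delete"}
    pairs = []
    for path, operations in spec.get("paths", {}).items():
        for method in operations:
            # Path items may also hold non-method keys such as "parameters".
            if method in http_methods:
                pairs.append((method.upper(), path))
    return sorted(pairs)
```

With `make run` active, `list_endpoints(fetch_spec())` should return the same routes that the interactive documentation page shows.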
longevity-biomarker-tracker/
├── sql/                        # Database schema & seed data
│   ├── schema.sql              # MySQL tables, views, indexes
│   └── 01_seed.sql             # Biomarkers, models, reference ranges
├── src/
│   ├── api/                    # FastAPI REST endpoints
│   │   └── main.py             # User profiles, measurements, biological age
│   ├── analytics/              # Scientific calculations
│   │   └── hd.py               # Homeostatic Dysregulation model
│   └── ui/                     # Web interface
│       ├── index.html          # Interactive query interface
│       └── main.js             # Biomarker visualizations
├── etl/                        # Data pipeline
│   ├── download_nhanes.py      # Fetch CDC health survey data
│   ├── transform.ipynb         # Clean & normalize data
│   └── load.sh                 # Bulk load into database
├── tests/                      # Test suite
│   ├── test_api.py             # API endpoint tests
│   └── test_schema.py          # Database integrity tests
├── docker-compose.yml          # MySQL + Adminer containers
├── Makefile                    # Automation commands
└── requirements.txt            # Python dependencies
1. Phenotypic Age (Levine et al. 2018)
- Purpose: Predicts mortality risk better than chronological age
- Method: Mortality regression model combining 9 biomarkers with chronological age, trained on NHANES data
- Validation: Tested on 42,000+ adults, published in Aging
- Reference: PMID: 29676998
2. Homeostatic Dysregulation (Cohen et al. 2013)
- Purpose: Measures physiological system coordination
- Method: Mahalanobis distance from healthy young adult reference population
- Innovation: Captures multi-system aging patterns
- Implementation Note: We applied a linear transformation to convert HD scores to "biological age years" for comparison with Phenotypic Age - this is our methodological assumption, not from the original literature
- Reference: PMID: 23376244
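The Mahalanobis distance behind the Homeostatic Dysregulation model can be illustrated with a small, dependency-free sketch. This is not the code in `src/analytics/hd.py`, which may differ. The function names are hypothetical, the reference rows stand in for a healthy young adult population, and the conversion from distance to "biological age years" is deliberately left out because its parameters are project-specific.

```python
import math
from statistics import fmean


def reference_stats(rows):
    """Column means and sample covariance matrix of a reference population."""
    n, k = len(rows), len(rows[0])
    means = [fmean(row[j] for row in rows) for j in range(k)]
    cov = [
        [sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
         for j in range(k)]
        for i in range(k)
    ]
    return means, cov


def solve(matrix, rhs):
    """Solve matrix @ y = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    y = [0.0] * n
    for i in reversed(range(n)):
        remainder = m[i][n] - sum(m[i][j] * y[j] for j in range(i + 1, n))
        y[i] = remainder / m[i][i]
    return y


def mahalanobis(x, means, cov):
    """Distance of one biomarker vector from the reference population centre."""
    diff = [xi - mi for xi, mi in zip(x, means)]
    # d^2 = diff^T * cov^-1 * diff, computed without forming the inverse.
    return math.sqrt(sum(d * y for d, y in zip(diff, solve(cov, diff))))
```

A profile identical to the reference means has distance 0, and the distance grows as biomarkers drift away from the reference population, taking the correlations between biomarkers into account.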
- Albumin - Protein synthesis, nutrition
- Alkaline Phosphatase - Liver/bone function
- Creatinine - Kidney function
- Fasting Glucose - Metabolic health
- High-Sensitivity CRP - Inflammation
- White Blood Cell Count - Immune function
- Lymphocyte Percentage - Immune cell distribution
- Mean Corpuscular Volume - Red blood cell size
- Red Cell Distribution Width - Blood cell variation
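As an illustration of how these nine biomarkers feed into Phenotypic Age, here is a minimal sketch of the published formula. The coefficients are the values commonly reproduced from Levine et al. (2018) and should be checked against the paper before any use. The dictionary keys and the function name are hypothetical, and the units in the comments are an assumption about the inputs, so values stored in other units must be converted first.

```python
import math

# Coefficients as commonly reproduced from Levine et al. (2018); verify before use.
# Assumed units: albumin g/L, creatinine umol/L, glucose mmol/L, CRP mg/dL,
# lymphocytes %, MCV fL, RDW %, alkaline phosphatase U/L, WBC 1000 cells/uL.
COEFFICIENTS = {
    "albumin": -0.0336,
    "creatinine": 0.0095,
    "glucose": 0.1953,
    "crp": 0.0954,  # applied to the natural log of CRP
    "lymphocyte_pct": -0.0120,
    "mcv": 0.0268,
    "rdw": 0.3306,
    "alkaline_phosphatase": 0.00188,
    "wbc": 0.0554,
}
INTERCEPT = -19.907
AGE_COEFFICIENT = 0.0804
GAMMA = 0.0076927  # Gompertz shape parameter
HORIZON_MONTHS = 120  # 10-year mortality horizon


def phenotypic_age(age_years: float, biomarkers: dict) -> float:
    """Phenotypic Age in years from chronological age and nine biomarkers."""
    xb = INTERCEPT + AGE_COEFFICIENT * age_years
    for name, coefficient in COEFFICIENTS.items():
        value = biomarkers[name]
        if name == "crp":
            value = math.log(value)  # CRP enters the model log-transformed
        xb += coefficient * value
    # Log of the modelled 10-year survival probability, i.e. ln(1 - mortality score).
    log_survival = -math.exp(xb) * (math.exp(HORIZON_MONTHS * GAMMA) - 1) / GAMMA
    return 141.50225 + math.log(-0.00553 * log_survival) / 0.09165
```

Working with the log survival probability directly avoids taking `log(0)` when the mortality score rounds to 1 for extreme inputs.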
- NHANES 2017-2018 - CDC's National Health and Nutrition Examination Survey
- Population: Nationally representative US adults
- Sample Size: 9,000+ participants with complete biomarker panels
- Public Domain: De-identified, freely available for research
- MySQL 8.0 with normalized schema (BCNF compliant)
- 9 core tables with foreign key constraints and cascade rules
- 4 analytical views for complex queries
- Performance indexes optimized for biomarker trend analysis
- User Management - Create profiles, track demographics
- Biomarker Storage - Time-series measurement data
- Biological Age Calculation - Real-time computation using validated models
- Reference Ranges - Clinical and longevity-optimized comparisons
- Interactive Queries - Explore data without SQL knowledge
- Trend Visualization - Biomarker changes over time
- Range Comparisons - See how values compare to healthy populations
- Biological Age Dashboard - Track aging metrics
make test # Full test suite
pytest tests/test_api.py # API tests only
pytest tests/test_schema.py # Database tests only
make db # Start database
make db-reset # Reset schema + seed data
make seed-demo # Load demo users for testing
make lint # Run pre-commit hooks
pre-commit run --all-files # Manual linting
make etl # Full pipeline: download → transform → load
python etl/download_nhanes.py # Download only
jupyter notebook etl/transform.ipynb # Transform only
bash etl/load.sh # Load only
The system includes two types of reference ranges:
Clinical reference ranges:
- Standard laboratory reference values
- Used by healthcare providers for diagnosis
- Cover the central 95% of values in a healthy population
Longevity-optimized ranges:
- Optimized for healthy aging
- Derived from the healthiest 20-30 year olds in NHANES
- More restrictive than clinical ranges
- Research-based longevity targets
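The relationship between the two range types can be sketched as a simple classification. The names below are hypothetical and the numeric limits are illustrative placeholders, not the project's seed data. The sketch assumes the longevity range sits inside the clinical range, as described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReferenceRange:
    low: float
    high: float

    def contains(self, value: float) -> bool:
        return self.low <= value <= self.high


def classify(value: float, clinical: ReferenceRange, longevity: ReferenceRange) -> str:
    """Label a measurement against the nested longevity and clinical ranges."""
    if longevity.contains(value):
        return "optimal"       # inside the stricter longevity range
    if clinical.contains(value):
        return "normal"        # clinically normal, outside the longevity target
    return "out of range"      # outside the clinical reference range


# Placeholder limits for fasting glucose in mg/dL, for illustration only.
GLUCOSE_CLINICAL = ReferenceRange(70.0, 99.0)
GLUCOSE_LONGEVITY = ReferenceRange(75.0, 90.0)
```

For example, `classify(95.0, GLUCOSE_CLINICAL, GLUCOSE_LONGEVITY)` yields `"normal"`: the value is inside the clinical range but outside the narrower longevity target.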
The web interface supports multiple pre-built queries including:
- User Profiles - Demographics and latest biomarker values
- Biological Age Calculation - Current aging metrics vs chronological age
- Biomarker Trends - Time-series analysis of individual markers
- Range Comparisons - Clinical vs longevity reference ranges
- Population Analytics - Age distributions and biomarker statistics
- No PHI: Uses de-identified NHANES research data only
- Local Development: All data stays on your machine
- No External APIs: Completely self-contained system
- Research Use: Designed for educational and research purposes
This application was developed by:
- Randal Drew
- Sam Fine
- Kelly Luna Jimenez
- YD Song
The team is grateful for the support of Dr. Ahmed Khaled and his TA team for their advice and feedback.
- Levine, M. E., et al. (2018). An epigenetic biomarker of aging for lifespan and healthspan. Aging, 10(4), 573-591. PMID: 29676998
- Cohen, A. A., et al. (2013). A novel statistical approach shows evidence for multi-system physiological dysregulation during aging. Mechanisms of Ageing and Development, 134(3-4), 110-117. PMID: 23376244
- Belsky, D. W., et al. (2015). Quantification of biological aging in young adults. Proceedings of the National Academy of Sciences, 112(30). PMID: 26150497
- NHANES - National Health and Nutrition Examination Survey. Centers for Disease Control and Prevention. Available at: https://www.cdc.gov/nchs/nhanes/
MIT License - see LICENSE file for details.
- Research Tool Only: Not intended for clinical use
- Educational Purpose: Designed for learning about biomarker analysis
- Data Limitations: Based on population averages, not individual health status
- Professional Consultation: Always consult healthcare providers for health decisions
Built with scientific rigor using peer-reviewed algorithms and real health survey data. Perfect for researchers, students, and longevity enthusiasts interested in quantified aging.