This repository contains code and analysis for investigating the relationship between pain conditions, opioid prescriptions, and Parkinson's disease using UK Biobank data. The project examines temporal patterns, pain trajectories, and potential associations between chronic pain management and neurological outcomes.
├── README.md
├── requirements.txt
├── data_processing/
│   ├── icd10_processing/
│   │   ├── icd10_remapping_with_ukbb_grouping.ipynb
│   │   ├── icd10_pain_code_search.ipynb
│   │   └── ccsr_icd10_mapping.ipynb
│   ├── opioid_processing/
│   │   ├── opioid_active_ingredient_mapping.ipynb
│   │   ├── mme_calculation.ipynb
│   │   ├── first_phase_bnf_mapping.py
│   │   └── second_phase_bnf_mapping.py
│   └── dataset_creation/
│       ├── comprehensive_dataset_creation.ipynb
│       ├── opioid_demographic_research_analysis.ipynb
│       └── opioid_demographic_technical_merge.ipynb
├── analysis/
│   ├── pain_analysis/
│   │   └── pain_trajectory_analysis.ipynb
│   ├── parkinson_analysis/
│   │   └── parkinson_temporal_analysis.ipynb
│   └── disorder_classification/
│       ├── disorder_feature_analysis.ipynb
│       ├── disorder_classification_demography.ipynb
│       └── disorder_classification_pain_opioid.ipynb
└── docs/
    └── methodology.md
This project processes and analyzes UK Biobank data to create comprehensive research datasets including:
- Demographics: Age, sex, BMI, and other baseline characteristics
- Medical History: Complete ICD-10 coded diagnoses with CCSR classifications
- Pain Conditions: Systematically identified pain-related diagnoses
- Opioid Prescriptions: Detailed medication data with Morphine Milligram Equivalents (MME)
- Parkinson's Disease: Diagnosis codes and temporal relationships
- Temporal Analysis: Longitudinal tracking of pain, prescriptions, and outcomes
Data processing components:
- ICD-10 code processing and pain condition identification
- Opioid prescription mapping with active ingredient extraction
- BNF (British National Formulary) code mapping (a fast first phase, then a slower second phase for codes left unmapped)
- Morphine Milligram Equivalent (MME) calculations
- Demographic data integration
- Temporal sequence analysis
Analysis components:
- Pain Trajectory Analysis: Longitudinal patterns of pain conditions
- Opioid Risk Assessment: MME-based risk stratification
- Parkinson's Association Studies: Temporal relationships between pain/opioids and neurological outcomes
- Disorder Classification: Multi-label classification using machine learning
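One way to identify pain-related diagnoses systematically is a keyword search over ICD-10 code descriptions. The sketch below is a minimal illustration, not the repository's implementation: the keyword list `PAIN_KEYWORDS`, the sample table `ICD10_SAMPLE`, and the function `find_pain_codes` are all hypothetical names introduced here.

```python
import re

# Hypothetical keyword list; the repository's actual search terms may differ.
PAIN_KEYWORDS = ["pain", "neuralgia", "migraine"]

# Small hand-typed sample of ICD-10 codes and descriptions (illustration only).
ICD10_SAMPLE = {
    "M54.5": "Low back pain",
    "G50.0": "Trigeminal neuralgia",
    "G43.9": "Migraine, unspecified",
    "R52.2": "Other chronic pain",
    "I10": "Essential (primary) hypertension",
}


def find_pain_codes(descriptions, keywords):
    """Return codes whose description contains any keyword as a whole word."""
    # Word boundaries prevent substring hits such as "pain" inside "painting".
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(k) for k in keywords) + r")\b",
        re.IGNORECASE,
    )
    return sorted(code for code, text in descriptions.items() if pattern.search(text))


candidate_codes = find_pain_codes(ICD10_SAMPLE, PAIN_KEYWORDS)
print(candidate_codes)  # ['G43.9', 'G50.0', 'M54.5', 'R52.2']
```

A keyword match only produces candidate codes. They would still need clinical review before being treated as pain conditions.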
pip install -r requirements.txt
- pandas >= 1.5.0
- numpy >= 1.21.0
- scikit-learn >= 1.0.0
- matplotlib >= 3.5.0
- seaborn >= 0.11.0
- jupyter >= 1.0.0
- tqdm >= 4.64.0
- thefuzz >= 0.19.0 (for fuzzy string matching)
- ast (built-in)
- datetime (built-in)
Start with the data processing notebooks in order:
# ICD-10 processing
jupyter notebook data_processing/icd10_processing/icd10_remapping_with_ukbb_grouping.ipynb
# BNF code mapping: the first phase is a fast pass; the slower second phase maps the BNF codes left unmapped by the first
python data_processing/opioid_processing/first_phase_bnf_mapping.py # Fast first pass
python data_processing/opioid_processing/second_phase_bnf_mapping.py # Slower pass for the remaining codes
# Opioid data processing
jupyter notebook data_processing/opioid_processing/opioid_active_ingredient_mapping.ipynb
# MME calculations
jupyter notebook data_processing/opioid_processing/mme_calculation.ipynb
# Dataset creation and merging
jupyter notebook data_processing/dataset_creation/opioid_demographic_research_analysis.ipynb
jupyter notebook data_processing/dataset_creation/opioid_demographic_technical_merge.ipynb
jupyter notebook data_processing/dataset_creation/comprehensive_dataset_creation.ipynb
Run analysis notebooks based on your research questions:
# Pain pattern analysis
jupyter notebook analysis/pain_analysis/pain_trajectory_analysis.ipynb
# Parkinson's disease analysis
jupyter notebook analysis/parkinson_analysis/parkinson_temporal_analysis.ipynb
# Disorder classification
jupyter notebook analysis/disorder_classification/disorder_classification_pain_opioid.ipynb
jupyter notebook analysis/disorder_classification/disorder_feature_analysis.ipynb
jupyter notebook analysis/disorder_classification/disorder_classification_demography.ipynb
- Systematic search of ICD-10 codes using pain-related keywords
- Clinical validation of identified codes
- CCSR (Clinical Classifications Software Refined) mapping
- MME calculations based on prescription strength and quantity
- Risk stratification (low, moderate, high risk categories)
- Temporal exposure windows (12-month and 24-month lookbacks)
- Prospective cohort design
- Time-to-event analysis
- Survival analysis methods
- Multi-label classification for disorder prediction
- Feature engineering for temporal patterns
- Patient ID: Unique identifier (eid)
- Demographics: Age, sex, BMI, ethnicity
- ICD-10 Codes: Complete diagnostic history with dates
- Pain Features: Pain-specific diagnostic codes and patterns
- Opioid Data: Prescription details, MME calculations, risk categories
- Temporal Variables: Follow-up periods, sequence of events
- Parkinson's disease diagnosis
- Pain trajectory patterns
- Opioid exposure risk levels
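The opioid features above (MME from prescription strength and quantity, 12-month and 24-month lookback windows, and low/moderate/high risk categories) can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in `mme_calculation.ipynb`. The conversion factors are illustrative values for three oral opioids and should be checked against a clinical reference. The risk cutoffs are placeholders passed in by the caller. A 12-month window is approximated as 365 days. All function names are hypothetical.

```python
from datetime import date, timedelta

# Illustrative oral conversion factors (an assumption; check a clinical reference).
MME_FACTORS = {"morphine": 1.0, "codeine": 0.15, "oxycodone": 1.5}


def prescription_mme(ingredient, strength_mg, quantity):
    """MME for one prescription: strength per unit (mg) x units x conversion factor."""
    return strength_mg * quantity * MME_FACTORS[ingredient.lower()]


def mme_in_lookback(prescriptions, index_date, months):
    """Sum MME over prescriptions issued in the `months` before index_date.

    Months are approximated with 365-day years (12 months = 365 days).
    """
    window_start = index_date - timedelta(days=365 * months // 12)
    return sum(
        prescription_mme(ingredient, strength_mg, quantity)
        for issued, ingredient, strength_mg, quantity in prescriptions
        if window_start <= issued < index_date
    )


def risk_category(total_mme, moderate_cutoff, high_cutoff):
    """Map cumulative MME to low/moderate/high using caller-supplied cutoffs."""
    if total_mme >= high_cutoff:
        return "high"
    if total_mme >= moderate_cutoff:
        return "moderate"
    return "low"


# Made-up records: (issue date, active ingredient, strength in mg, quantity).
prescriptions = [
    (date(2015, 3, 1), "codeine", 30.0, 56),
    (date(2016, 1, 10), "oxycodone", 10.0, 28),
    (date(2013, 6, 1), "morphine", 10.0, 60),  # outside both windows
]
index_date = date(2016, 6, 1)  # e.g. a diagnosis or assessment date

mme_12m = mme_in_lookback(prescriptions, index_date, months=12)
mme_24m = mme_in_lookback(prescriptions, index_date, months=24)

# Placeholder cutoffs, not clinical thresholds.
print(round(mme_12m, 1), risk_category(mme_12m, moderate_cutoff=300, high_cutoff=600))
print(round(mme_24m, 1), risk_category(mme_24m, moderate_cutoff=300, high_cutoff=600))
```

Passing the cutoffs as arguments keeps the stratification rule separate from the MME arithmetic, so thresholds can be changed without touching the calculation.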
This project uses UK Biobank data under approved research applications. All analyses comply with UK Biobank access procedures and ethical guidelines.
Important Notes:
- Raw data files are not included in this repository
- Only analysis code and documentation are shared
- Researchers must obtain independent UK Biobank access for data
Key contributions of this analysis include:
- Systematic identification of pain-related diagnoses in UK Biobank
- Development of comprehensive opioid risk assessment methods
- Investigation of temporal relationships between pain, opioid use, and neurological outcomes
This repository represents a research internship project. For questions about methodologies or access to specific analysis results, please contact the research team.
If you use methods from this repository in your research, please cite:
Research conducted at the Sandor Lab, UK Dementia Research Institute at Imperial College London.
Lab website: https://www.ukdri.ac.uk/labs/sandor-lab
This code is provided for research purposes. Please refer to UK Biobank data usage policies for data access requirements.
For questions about this research methodology or potential collaboration opportunities, please contact:
- Sandor Lab: https://www.ukdri.ac.uk/labs/sandor-lab
Disclaimer: This repository contains analysis code only. Access to UK Biobank data requires separate application and approval through official channels.