Exploring neovascular Age-related Macular Degeneration (nAMD) Patient Characteristics (Moorfields Eye Hospital)
- Project Overview
- Data Source
- Tools
- Data Preparation and Cleaning
- Exploratory Data Analysis
- Data Analysis
- Insights
This R-based data analysis project explores the characteristics of patients with neovascular age-related macular degeneration (nAMD). The analysis uses an anonymized dataset from Moorfields Eye Hospital of patients who underwent intravitreal anti-VEGF therapy.
The objective is to uncover demographic patterns and clinical trends across three key areas: gender distribution, age group representation, and the relationship between age and visual acuity. By summarizing unique patient counts for demographic analysis and visualizing the distribution of visual acuity across different age groups, this project aims to provide insights into this specific patient population.
- Source: The `eyedata` R package
- Dataset: `amd2`
- Description: A dataset containing anonymized real-life human subjects data on eyes with treatment-naïve neovascular age-related macular degeneration (AMD), which underwent intravitreal anti-VEGF therapy with ranibizumab and/or aflibercept.
- Language: R
- Packages:
  - `dplyr`: data manipulation, cleaning, and summarization
  - `ggplot2`: static, high-quality data visualizations
  - `eyedata`: source of the `amd2` dataset
The following steps were performed in R:
- Handling Missing Values: The two missing entries in the `visual_acuity` column, which resulted from data entry errors in the source records, were filtered out before visualization.
- Column Renaming: Columns were renamed for better readability (e.g., `age0` → `baseline_age`, `va` → `visual_acuity`).
- Data Summarization: To analyze demographics accurately, the data was processed to count each patient only once.
- Data Transformation: A new `age_group` column was created by binning `baseline_age` into 10-year intervals (e.g., 60–69, 70–79) to facilitate age-based analysis.
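The renaming, missing-value, and binning steps above could look roughly like the sketch below. It runs on a small invented data frame rather than the real `amd2` data. The `patID` column name, the toy values, and the choice of bin edges (50 to 100) are assumptions for illustration, not the project's actual code.

```r
library(dplyr)

# Toy stand-in for amd2: all values are invented for illustration.
# `patID` is an assumed identifier column; `age0` and `va` are the
# original column names mentioned in the renaming step.
raw <- data.frame(
  patID = c("p1", "p1", "p2", "p3", "p4"),
  age0  = c(64, 64, 71, 88, 93),
  va    = c(55, 60, NA, 42, 30)
)

prepared <- raw %>%
  # Rename columns for readability (new_name = old_name).
  rename(baseline_age = age0, visual_acuity = va) %>%
  # Drop rows with a missing visual acuity value.
  filter(!is.na(visual_acuity)) %>%
  # Bin baseline age into 10-year intervals: [50,60), [60,70), ...
  # Ages outside 50-99 would become NA with these assumed breaks.
  mutate(
    age_group = cut(
      baseline_age,
      breaks = seq(50, 100, by = 10),
      right  = FALSE,
      labels = c("50-59", "60-69", "70-79", "80-89", "90-99")
    )
  )

print(prepared)
```

Setting `right = FALSE` makes each interval closed on the left, so a patient aged exactly 70 falls into the 70-79 group rather than 60-69.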
Guiding questions for the EDA included:
- What is the gender distribution of patients in the dataset?
- How are the patients distributed across different age groups?
- How does visual acuity, measured in ETDRS (Early Treatment Diabetic Retinopathy Study) letters, vary across these age groups?
The analysis was performed entirely within R. The dplyr package was used for all data wrangling tasks, including filtering, grouping, and summarizing the dataset to prepare it for visualization. Key steps involved isolating unique patients for demographic counts and categorizing patients into age groups.
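As a rough illustration of isolating unique patients before counting, the sketch below keeps one row per patient and then tallies by sex. The data frame is invented, and the `patID` and `sex` column names are assumptions; the real dataset may name or code these differently.

```r
library(dplyr)

# Toy visit-level data: several rows per patient, values invented.
# `patID` and `sex` are assumed column names.
visits <- data.frame(
  patID        = c("p1", "p1", "p2", "p3", "p3", "p3"),
  sex          = c("f", "f", "m", "f", "f", "f"),
  baseline_age = c(64, 64, 71, 88, 88, 88)
)

# Keep the first row for each patient so nobody is counted twice.
patients <- visits %>%
  distinct(patID, .keep_all = TRUE)

# Count unique patients per sex.
sex_counts <- patients %>%
  count(sex, name = "n_patients")

print(sex_counts)
```

Counting rows of `visits` directly would weight each patient by their number of recorded visits, which is why the deduplication step comes first.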
Following data preparation, the ggplot2 package was used to generate three plots that visually represent the findings related to patient gender, age, and visual acuity.
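A minimal sketch of three such plots is shown below, using invented data. The geoms chosen here (bar charts for the two counts, a boxplot for visual acuity), the `sex` column name, and the axis labels are assumptions rather than a record of the project's exact plotting code.

```r
library(ggplot2)

# Toy per-patient data: every value is invented for illustration.
# `sex` is an assumed column name; the others follow the renamed columns.
toy <- data.frame(
  sex           = c("f", "f", "m", "f", "m", "f", "f", "m"),
  age_group     = factor(c("60-69", "60-69", "70-79", "70-79",
                           "80-89", "80-89", "90-99", "90-99")),
  visual_acuity = c(58, 62, 55, 60, 50, 57, 30, 48)
)

# Plot 1: number of patients by sex.
p_sex <- ggplot(toy, aes(x = sex)) +
  geom_bar() +
  labs(x = "Sex", y = "Number of patients")

# Plot 2: number of patients by age group.
p_age <- ggplot(toy, aes(x = age_group)) +
  geom_bar() +
  labs(x = "Age group (years)", y = "Number of patients")

# Plot 3: distribution of visual acuity within each age group.
p_va <- ggplot(toy, aes(x = age_group, y = visual_acuity)) +
  geom_boxplot() +
  labs(x = "Age group (years)", y = "Visual acuity (ETDRS letters)")

print(p_va)
```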
Key insights drawn from the analysis:
- Gender Distribution: Female patients outnumber male patients in this cohort, suggesting that neovascular AMD is more common among women in this treated population.
- Age Group Distribution: As expected for an age-related condition, the patient population is concentrated in the older age brackets. The visualization shows that the 80–89 age group contains the highest number of patients.
- Visual Acuity and Age: The boxplot shows that visual acuity remains relatively stable across the 60–69, 70–79, and 80–89 age groups, with only small shifts in the median values. In the 90–99 group there is a clearer decline and greater variation, suggesting that advanced age is associated with lower visual acuity. The many outliers clustered at low visual-acuity values across all age groups indicate that severe vision impairment can occur at any age.
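The median and spread that a boxplot displays can also be tabulated directly. The sketch below does this per age group on invented values, so its numbers say nothing about the real cohort. The column names follow the renamed columns described earlier.

```r
library(dplyr)

# Toy data: values are invented and do not reflect the amd2 results.
toy <- data.frame(
  age_group     = rep(c("60-69", "90-99"), each = 3),
  visual_acuity = c(50, 60, 70, 20, 30, 55)
)

# Median and interquartile range of visual acuity per age group.
va_summary <- toy %>%
  group_by(age_group) %>%
  summarise(
    n         = n(),
    median_va = median(visual_acuity),
    iqr_va    = IQR(visual_acuity),
    .groups   = "drop"
  )

print(va_summary)
```

A table like this makes it easier to state how large the shift in the median is between groups, instead of reading it off the plot.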