The DataSetsVerse is a metapackage that brings together a curated collection of R packages containing domain-specific datasets. It includes time series data, educational metrics, crime records, medical datasets, and oncology research data.
Designed to give researchers, analysts, educators, and data scientists centralized access to structured, well-documented datasets, this metapackage facilitates the following across a wide range of domains:
- Reproducible research
- Data exploration
- Teaching applications
To install and activate the DataSetsVerse package, use the following:
install.packages("DataSetsVerse")
library(DataSetsVerse)
Once the package is loaded, you can call the DataSetsVerse() function to display the list of included dataset packages and their versions:
DataSetsVerse()
DataSetsVerse imports and depends on several subpackages. Therefore, you cannot detach an individual subpackage (like OncoDataSets) while DataSetsVerse is still loaded.
Example of an Error
# This will raise an error
detach("package:OncoDataSets", unload = TRUE)
# To properly unload a subpackage, you must first detach DataSetsVerse
detach("package:DataSetsVerse", unload = TRUE)
# Now you can safely detach the subpackage
detach("package:OncoDataSets", unload = TRUE)
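The detach order shown above can be wrapped in a small guard so that it does not fail when a package is not attached. This is a minimal sketch, not part of DataSetsVerse: `detach_if_attached()` is a hypothetical helper name, and it relies only on base R's `search()` and `detach()`.

```r
# Hypothetical helper (not provided by DataSetsVerse): detach a package only
# if it is currently on the search path, and report whether it was detached.
detach_if_attached <- function(pkg) {
  entry <- paste0("package:", pkg)
  if (entry %in% search()) {
    detach(entry, unload = TRUE, character.only = TRUE)
    return(TRUE)
  }
  FALSE
}

# Detach the metapackage first, then the subpackage it depends on.
detach_if_attached("DataSetsVerse")
detach_if_attached("OncoDataSets")
```

Each call returns `FALSE` and does nothing if the package is not attached, so the two lines can be run in any session.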
Loading DataSetsVerse with library(DataSetsVerse) attaches the following dataset packages to your R session:
- timeSeriesDataSets
- educationR
- crimedatasets
- MedDataSets
- OncoDataSets
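To confirm which of these packages are attached in the current session, one option is to compare their names against the search path. The sketch below uses only base R; the vector name `subpackages` is illustrative and simply repeats the package names listed above.

```r
# Subpackage names as listed above.
subpackages <- c("timeSeriesDataSets", "educationR", "crimedatasets",
                 "MedDataSets", "OncoDataSets")

# search() returns the attached environments; an attached package
# appears there as "package:<name>".
is_attached <- paste0("package:", subpackages) %in% search()
names(is_attached) <- subpackages
print(is_attached)
```

An entry is `TRUE` only if that package is attached at the moment the code runs.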
timeSeriesDataSets is a comprehensive collection of time series datasets from multiple domains, including:
- Economics
- Finance
- Energy
- Healthcare
Each dataset includes a suffix to denote its structure. Examples:
AirPassengers_ts: Monthly airline passengers (1949–1960)
taylor_30_min_df_ts: Half-hourly electricity demand
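Because the structure is encoded in the name, datasets can be selected by suffix. The following is a rough sketch using base R's `data()` and `endsWith()`. It assumes timeSeriesDataSets is installed and, if it is not, falls back to the two example names quoted above so that the code still runs.

```r
# Assumption: timeSeriesDataSets is installed. If it is not, fall back to
# the two example names from the text so the sketch still runs.
if (requireNamespace("timeSeriesDataSets", quietly = TRUE)) {
  # data(package = ...) returns a listing whose "Item" column holds dataset names.
  dataset_names <- data(package = "timeSeriesDataSets")$results[, "Item"]
} else {
  dataset_names <- c("AirPassengers_ts", "taylor_30_min_df_ts")
}

# Keep only the names that end in "_ts".
ts_names <- dataset_names[endsWith(dataset_names, "_ts")]
head(ts_names)
```

Once a dataset is available in the session, `class()` can be used to check that the object's structure matches what its suffix suggests.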
educationR contains datasets related to:
- Student performance
- Learning methods
- Test scores
- Absenteeism
Each dataset includes a suffix to denote its structure. Examples:
Develop_tbl_df: Dev Students: 2-Year & 4-Year College Demographics
Devmath_tbl_df: Fall '95 Developmental Math: Failed Student Scores
crimedatasets focuses exclusively on:
- Crimes and criminal activities
- Criminology
- Socio-economic analysis related to crime
Each dataset includes a suffix to denote its structure. Examples:
TerrorismGlobal_table: Global Terrorism Database (GTD) Yearly Summaries
USATerror_data_df: Terrorism Incidents in the USA (1968-1974)
MedDataSets provides medical datasets covering:
- Drug effectiveness
- Vaccine trials
- Survival rates
- Public health and treatments
Each dataset includes a suffix to denote its structure. Examples:
Aids2_df: Australian AIDS Survival Data
Cushings_df: Diagnostic Tests on Patients with Cushing's Syndrome
OncoDataSets provides rich datasets focused on cancer research, including:
- Survival rates
- Genetic studies
- Biomarkers
- Cancer types (melanoma, leukemia, breast, ovarian, lung, etc.)
Each dataset includes a suffix to denote its structure. Examples:
UKLungCancerDeaths_df: Lung Cancer Deaths among UK Physicians
USCancerStats_df: US Cancer Incidence, Mortality, and Survival Changes