DEcodeTree

DEcodeTree is a machine learning model designed to predict differential expression statistics from gene regulatory features.

📜 Link to the Paper

Read our full research paper for more details:
The YTHDF Proteins Shape the Brain Gene Signatures of Alzheimer's Disease

Installing

To install the DEcodeTree package, use the following commands:

install.packages("devtools")
devtools::install_github("stasaki/DEcodeTree")

Running the model

Quick Test Run

The following is a test run using only cell type information as predictors:

library(DEcodeTree)

# Load differential expression statistics for RNA-seq from DLPFC regions against cognitive decline in ROSMAP cohorts
deg_stat_data <- readRDS(system.file("data", "DLPFC_CogDec_DEG_stats.rds", package = "DEcodeTree"))

# Run the DEcodeTree model using cell type information as predictors
runDEcodeTreeModel(
  outcome = "tstat",
  deg_stat = deg_stat_data,
  feature_type = c("cell_type"),
  out_dir = "./results/",
  n_splits = 5,
  metric = c("COR", "RMSE"),
  num_samples = 10,
  num_cpus = 4
)
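Before a run, it can help to confirm that the column named in `outcome` is present and numeric in the table passed as `deg_stat`. The sketch below is a hypothetical helper, not part of DEcodeTree, and the toy table's column names (`gene`, `logFC`, `tstat`) are illustrative assumptions rather than the documented layout of the bundled data.

```r
# Hypothetical helper (not part of DEcodeTree): sanity-check a DE statistics table
check_deg_stat <- function(deg_stat, outcome = "tstat") {
  stopifnot(is.data.frame(deg_stat))
  if (!outcome %in% colnames(deg_stat)) {
    stop("Outcome column '", outcome, "' not found in deg_stat")
  }
  values <- deg_stat[[outcome]]
  if (!is.numeric(values)) {
    stop("Outcome column '", outcome, "' must be numeric")
  }
  n_missing <- sum(is.na(values))
  if (n_missing > 0) {
    warning(n_missing, " missing values in '", outcome, "'")
  }
  invisible(TRUE)
}

# Toy stand-in for a real table; column names are assumptions for illustration
toy_deg <- data.frame(
  gene  = c("GENE1", "GENE2", "GENE3"),
  logFC = c(0.4, -1.2, 0.1),
  tstat = c(2.1, -3.5, 0.3)
)
check_deg_stat(toy_deg, outcome = "tstat")
```

With the real data, the same call would be `check_deg_stat(deg_stat_data, outcome = "tstat")`.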

Full Feature Analysis

To run the complete analysis with all regulatory features, use:

⏱️ Runtime Warning: The following analysis uses 16 CPU cores and may take 1-4 hours to complete depending on your system's performance. The model performs extensive hyperparameter tuning (64 iterations) and cross-validation across multiple feature types.

runDEcodeTreeModel(
  outcome = "tstat",                           # Predict t-statistics
  deg_stat = deg_stat_data,                   # Our DE results
  feature_type = c("base", "RBP", "miRNA", "TF", "cell_type"), # All regulatory features
  out_dir = "./results/",                     # Output directory
  n_splits = 5,                               # Cross-validation splits
  metric = c("COR", "RMSE"),                  # Performance metrics
  num_samples = 64,                           # Hyperparameter tuning iterations
  num_cpus = 16                               # Parallel processing cores
)
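If the machine has fewer than 16 cores, the value passed to `num_cpus` can be capped at what is available. The following is a minimal sketch using `parallel::detectCores()`; the helper name `choose_num_cpus` and the choice to leave one core free are assumptions for illustration, not package behavior.

```r
# Hypothetical helper: cap the requested core count at what the machine offers
choose_num_cpus <- function(requested = 16, reserve = 1) {
  available <- parallel::detectCores()
  if (is.na(available)) {
    # detectCores() can return NA when the count is unknown; fall back to one core
    return(1L)
  }
  usable <- max(1L, available - reserve)
  as.integer(min(requested, usable))
}

num_cpus <- choose_num_cpus(16)
print(num_cpus)
```

The result can then be passed as `num_cpus = num_cpus` in the call to `runDEcodeTreeModel()`.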

Additional Examples

We also provide an example of running DEcodeTree for TDP43 knockdown RNA-seq data in TDP43_example.md.

Parameters

| Parameter | Description | Default |
|---|---|---|
| `outcome` | A character string specifying the outcome variable (e.g., "logFC", "tstat"). | `"tstat"` |
| `deg_stat` | A data frame containing differential expression statistics. | Required |
| `feature_type` | Features to include (`"base"`, `"RBP"`, `"miRNA"`, `"TF"`, `"cell_type"`). | `c("base")` |
| `out_dir` | Output directory for results. | `"./results/"` |
| `n_splits` | Number of data splits for cross-validation. | `5` |
| `metric` | Performance metrics (`"COR"`, `"RMSE"`). | `c("COR", "RMSE")` |
| `measures` | List of measures for performance evaluation compatible with `mlr`. | Default measures |
| `cv_folds` | Number of folds for cross-validation. | `5` |
| `num_samples` | Number of hyperparameter tuning iterations. | `64` |
| `num_cpus` | Number of CPUs for parallel processing. | `4` |
| `param_values` | Default hyperparameter values for XGBoost models. | Predefined list |
| `param_space` | Parameter set defining the hyperparameter optimization space. | Predefined set |
| `save_model` | Whether to save trained models. | `TRUE` |
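To make `n_splits` and `metric` concrete, here is a rough base-R sketch of evaluating a model over data splits with correlation and RMSE. It is not the package's implementation: `lm` stands in for the XGBoost model, the data are simulated, and `COR` is assumed to mean Pearson correlation.

```r
set.seed(1)

# Toy data: one predictor and a noisy outcome (stand-ins, not real DE statistics)
n <- 100
x <- rnorm(n)
y <- 2 * x + rnorm(n, sd = 0.5)
toy <- data.frame(x = x, y = y)

# Metric definitions (assumption: COR is Pearson correlation)
cor_metric  <- function(obs, pred) cor(obs, pred)
rmse_metric <- function(obs, pred) sqrt(mean((obs - pred)^2))

# Randomly assign each row to one of n_splits folds
n_splits <- 5
fold <- sample(rep(seq_len(n_splits), length.out = n))

# Fit on all folds but one, then evaluate on the held-out fold
results <- do.call(rbind, lapply(seq_len(n_splits), function(k) {
  train <- toy[fold != k, ]
  test  <- toy[fold == k, ]
  fit   <- lm(y ~ x, data = train)  # stand-in for the XGBoost model
  pred  <- predict(fit, newdata = test)
  data.frame(split = k,
             COR   = cor_metric(test$y, pred),
             RMSE  = rmse_metric(test$y, pred))
}))

print(results)
print(colMeans(results[, c("COR", "RMSE")]))
```

Higher `COR` and lower `RMSE` on the held-out splits both indicate better predictions; reporting the per-split values alongside the mean shows how stable the estimate is.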

Outputs

- Evaluation Results: Summary of model performance metrics (e.g., RMSE, correlation).
- SHAP Values: Feature importance values for interpretability.
- Plots: Visualizations of model performance and SHAP scores.
- Trained Models: Serialized XGBoost models for future use.
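One common way to turn per-gene SHAP values into a global feature ranking is the mean absolute SHAP value per feature. The sketch below uses a made-up matrix; the gene and feature names are hypothetical, and the actual layout of the SHAP output written by DEcodeTree is not described here, so treat the shape (genes in rows, features in columns) as an assumption.

```r
# Toy SHAP matrix: rows are genes, columns are features (all names are illustrative)
shap <- matrix(
  c( 0.5, -0.1, 0.0,
    -0.7,  0.2, 0.1,
     0.6, -0.3, 0.0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("GENE1", "GENE2", "GENE3"),
                  c("feat_A", "feat_B", "feat_C"))
)

# Global importance: mean absolute SHAP value per feature, largest first
importance <- sort(colMeans(abs(shap)), decreasing = TRUE)
print(importance)
```

The sign of an individual SHAP value indicates the direction in which a feature pushes the prediction for that gene; taking absolute values discards direction and keeps only magnitude.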

License

The code for this project is licensed under the BSD 3-Clause License. Gene regulatory features were prepared using the following databases. Please ensure compliance with the respective licenses for each data source:

- Transcription factor binding targets: GTRD
- RNA-binding protein binding targets: POSTAR2
- miRNA binding targets: TargetScan
