ImputationReport

What

This package provides a single function, imputation_test(), that produces a simple report comparing different imputation methods on your own data.

The function takes the complete data (i.e., features with no missing values) from your dataset and introduces random missing values at 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, and 90% missingness. The following imputation methods are then applied:

1/5th of the lowest detected value (5th)
left-censored missing data (lcmd)
k-nearest neighbours (KNN; does not work with >50% missingness, so the mean is used instead above 50%)
probabilistic PCA (PPCA; does not work when a sample is completely missing)
median
mean
random forest (RF)
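The masking step and the simplest method in the list can be sketched in base R as follows. This is a minimal illustration, not the package's own code: the helper names mask_random() and impute_fifth_min() are hypothetical, the toy data is simulated, and taking the minimum per feature (rather than over the whole dataset) is an assumption.

```r
# Hypothetical helpers for illustration; not part of ImputationReport.
set.seed(1)

# Toy complete data: 20 samples x 5 features, strictly positive values
complete <- matrix(
  rlnorm(100, meanlog = 5),
  nrow = 20,
  dimnames = list(NULL, paste0("feature_", 1:5))
)

# Replace a given proportion of cells with NA, chosen uniformly at random
mask_random <- function(x, prop) {
  n_missing <- round(prop * length(x))
  idx <- sample(length(x), n_missing)
  x[idx] <- NA
  x
}

# Impute each feature's NAs with 1/5 of that feature's lowest observed value
# (assumption: the minimum is taken per feature)
impute_fifth_min <- function(x) {
  apply(x, 2, function(col) {
    col[is.na(col)] <- min(col, na.rm = TRUE) / 5
    col
  })
}

masked  <- mask_random(complete, prop = 0.2)  # 20% missingness
imputed <- impute_fifth_min(masked)
```

Because the original values in complete are retained, every imputed cell can later be compared against the value it replaced.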

After imputation, the actual values are compared with the imputed values using root-mean-square error (RMSE) and R². NOTE: R² does not reflect prediction accuracy, but it can indicate how strongly the actual and imputed values are correlated. RMSE is calculated as:

$\sqrt{\text{mean}\left((\text{actual} - \text{predicted})^2\right)}$

where $\text{actual}$ is the value from the complete data before it was replaced with NA and $\text{predicted}$ is the imputed value for that missing entry. The figure below gives values for all models at all % missing, and the table shows the most accurate model for each % missing tested. The lower the RMSE, the better the model fit; the higher the R², the more correlated the actual and imputed values are.
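As a small worked example, the RMSE formula above can be written directly in base R. The function names rmse() and r_squared() are hypothetical, and computing R² as the squared Pearson correlation is an assumption rather than something confirmed by the package. The second comparison illustrates the note above: a constant offset leaves R² at 1 while RMSE grows.

```r
# Hypothetical helpers for illustration; not part of ImputationReport.

# Root-mean-square error between actual and imputed values
rmse <- function(actual, predicted) {
  sqrt(mean((actual - predicted)^2))
}

# R² taken here as the squared Pearson correlation (an assumption)
r_squared <- function(actual, predicted) {
  cor(actual, predicted)^2
}

actual    <- c(10, 12, 15, 20)
predicted <- c(11, 12, 13, 22)

# Squared errors are 1, 0, 4, 4; their mean is 2.25, so RMSE is 1.5
rmse(actual, predicted)

# Imputed values shifted by a constant: perfectly correlated, badly wrong
shifted <- actual + 100
r_squared(actual, shifted)  # approximately 1
rmse(actual, shifted)       # 100
```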

How

A single function takes a dataframe, the location where you want to save the report, and a label (subtitle) to attach to the report. The report and a cache will be saved in the location provided. A simulated dataframe (data(data_features)) and a report generated from this are provided as an example. A preview of the report can be seen here

ImputationReport::imputation_test(
  data = ImputationReport::data_features, 
  output_dir = "inst/test/", 
  knit_root_dir = here::here(), 
  subtitle = "test")

References

left-censored missing data imputation performed using {imputeLCMD}
k-nearest neighbours imputation performed using {impute}
probabilistic PCA performed using {pcaMethods}
median imputation performed using {missMethods}
mean imputation performed using {missMethods}
random forest imputation performed using {missForest}
