HiRID data pre-processing

Repository for the HiRID data pre-processing (for irregular setting = imputation was not conducted)

Setup

conda env & code

# conda env
conda create --name ehr_preproc python=3.10.18
conda activate ehr_preproc
pip install jupyter
pip install pandas 
pip install numpy 
pip install matplotlib 
pip install scikit-learn

# code
git clone https://github.com/Jeong-Eul/HiRID_PREPROC
cd HiRID_PREPROC/hirid

Fast Run

This repository preprocesses data in a way that maximally preserves the underlying causal structure. When constructing multivariate time series across all patients, we adhere to the following principles:

$X_t$ is observed, and $A_t$ (the treatment or action) is determined based on it.
$Y_t$ is determined using $X_t$ and the previous action $A_{t-1}$.

Following these causal consistency rules, our code performs shock early prediction labeling within an 8-hour horizon, using the same annotation and labeling criteria as defined in the original paper.

python processing_hirid.py 
    --observation_path [your path] 
    --pharma_path [your path] 
    --general_path [your path] 
    --obs_feature_path [your path] 
    --pharma_feature_path [your path]
    --save_pth [your path]
    --labeled_save_pth [your path] 
    --Action default=True  
    --min_seq_len [seq length minimum] 
    --max_seq_len [seq length maximum] 
    --prediction_window [prediction window (hour)]
    --iqr_factor 1.5 
    --start_los [los min (hour)] 
    --end_los [los max (hour)]

When you download the HiRID dataset, set the path corresponding to raw_stage/observation_tables as observation_path, and the path corresponding to raw_stage/pharma_records as pharma_path.;
Prepare a list of the variable names and IDs you wish to include, and place it under obs_feature_path (do the same for the pharma variables under pharma_feature_path). ;
Specify the directory where part-wise aggregation results will be saved (save_pth). ;
Specify the directory where part-wise annotation and labeling results will be saved (labeled_save_pth). ;
You can define the length of stay range for patients to be included in the analysis using start_los and end_los. ;
The argument Action is treatment behavior. for example, if vasopressor volumn is higher than previous time point, action is "up" (Different option is concluded: "maintain", "down", "stop"). if Action is False, treatment parameter is saved as binary indicator ;

if you start with default setting, use bellow command:

python processing_hirid.py

Data preprocessing step

This code performs preprocessing in the following order:

Step 1 – Aggregation: Vital and laboratory indicators are integrated with pharmaceutical information to construct multivariate time series. During this process, data preprocessing is carried out while preserving the underlying causal structure.
Step 2 – Annotation/Labeling: Based on the aggregated results, target labels for prediction are generated. This step includes setting the minimum and maximum sequence lengths and applying masking operations.

Citing

Original paper:

@article{hyland2020early,
  title={Early prediction of circulatory failure in the intensive care unit using machine learning},
  author={Hyland, Stephanie L and Faltys, Martin and H{\"u}ser, Matthias and Lyu, Xinrui and Gumbsch, Thomas and Esteban, Crist{\'o}bal and Bock, Christian and Horn, Max and Moor, Michael and Rieck, Bastian and others},
  journal={Nature medicine},
  volume={26},
  number={3},
  pages={364--373},
  year={2020},
  publisher={Nature Publishing Group US New York}
}

Name		Name	Last commit message	Last commit date
Latest commit History 30 Commits
hirid		hirid
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

HiRID data pre-processing

Setup

conda env & code

Fast Run

Data preprocessing step

Citing

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

HiRID data pre-processing

Setup

conda env & code

Fast Run

Data preprocessing step

Citing

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages