HSTE21/ECG-Adversarial-Robustness

ECG Adversarial Robustness — PGD Attack & Retraining

"Small Changes, Big Errors" — An experimental study of adversarial attacks and adaptive defenses on a deep learning ECG arrhythmia classifier.

This repository contains the complete pipeline for evaluating and improving the adversarial robustness of a ResNet-based ECG classifier using Projected Gradient Descent (PGD) attacks and adversarial retraining. The main entry point is the Jupyter notebook Adversarial Robustness and Retraining V5.ipynb.


Project Overview

Deep learning models for ECG analysis are increasingly deployed in clinical decision support systems. However, they are vulnerable to adversarial perturbations — subtle, carefully crafted modifications to an input signal that are nearly indistinguishable to a human expert, yet reliably fool the model into producing an incorrect prediction.

This project investigates:

  1. The vulnerability of a pre-trained 1D ResNet ECG classifier to a targeted white-box PGD attack that forces Atrial Fibrillation (AF) samples to be misclassified as Normal Sinus Rhythm.
  2. Static Adversarial Training (Defense 1) — retraining on a fixed set of pre-generated adversarial examples.
  3. Adaptive Adversarial Training (Defense 2) — retraining with freshly generated adversarial examples at every epoch, resulting in generalized robustness.

The four rhythm classes used throughout this project are:

Symbol   Class                    Integer Label
A        Atrial Fibrillation      0
N        Normal Sinus Rhythm      1
O        Other Rhythm             2
~        Noisy / Unclassifiable   3

Repository Structure

ECG-Adversarial-Robustness/
├── Adversarial Robustness and Retraining V5.ipynb  # Main script — full pipeline
├── ResNet_30s_34lay_16conv.hdf5                    # Pre-trained baseline ResNet model
├── ResNet_ROBUST_AF_PGD.hdf5                       # Model after Static Adversarial Training
├── ResNet_ROBUST_ADAPTIVE.hdf5                     # Model after Adaptive Adversarial Training
├── REFERENCE_validation_data_v3.csv                # Validation set labels (record, label)
├── figures/
│   ├── baseline_clean.png                          # Confusion matrix — baseline on clean data
│   ├── baseline_attacked.png                       # Confusion matrix — baseline under PGD attack
│   ├── attack_visual.png                           # Original ECG vs adversarial example visualization
│   ├── static_learning_curves.png                  # Training curves — Static Defense (75 epochs)
│   ├── confusion_matrix_static.png                 # Confusion matrix — Static Defense on clean data
│   ├── static_matrix_attacked.png                  # Confusion matrix — Static Defense under fresh attack
│   ├── static_stress_test.png                      # Epsilon sweep stress test — Static Defense
│   ├── adaptive_learning_curves.png                # Training curves — Adaptive Defense (150 epochs)
│   ├── final_clean_matrix.png                      # Confusion matrix — Adaptive Defense on clean data
│   └── final_robustness_curve.png                  # Robustness curve across epsilon values
├── training_data/                                  # 8,528 ECG recordings (.mat + .hea pairs)
└── validation_data/                                # 300 ECG recordings (.mat + .hea pairs)

Note: The training_data/ directory contains 8,528 paired .mat and .hea recordings from the PhysioNet 2017 challenge. The validation_data/ directory contains 300 validation samples in the same format. Download and unzip the datasets directly into these folders (see Dataset section for instructions).


Dataset

PhysioNet/CinC Challenge 2017 — AF Classification from a Short Single Lead ECG Recording

  • 8,528 single-lead ECG recordings labeled into 4 rhythm classes.
  • Recording duration: 9 to 61 seconds, sampled at 300 Hz.
  • Acquired with a handheld AliveCor device.
Class    Label   Count   Percentage
Normal   N       5,154   60.4%
AF       A       771     9.0%
Other    O       2,557   30.0%
Noisy    ~       46      0.5%

The dataset is heavily imbalanced. The pipeline applies class weights and uses a balanced sampling strategy (maximum 300 samples per class) during all training phases to prevent the model from defaulting to the majority class.
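The per-class cap can be sketched as follows. This is an illustrative helper, not the notebook's actual function; `balanced_subset` and its parameters are assumed names:

```python
import random
from collections import defaultdict

def balanced_subset(records, labels, cap=300, seed=42):
    """Group records by label and keep at most `cap` randomly chosen
    samples per class, so no single class dominates a training epoch."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for rec, lab in zip(records, labels):
        by_class[lab].append(rec)
    subset = []
    for lab, recs in by_class.items():
        rng.shuffle(recs)
        subset.extend((r, lab) for r in recs[:cap])
    return subset
```

Classes with fewer than 300 samples (e.g. the 46 Noisy recordings) are kept in full; only the majority classes are downsampled.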

Download

  1. Go to https://physionet.org/content/challenge-2017/1.0.0/
  2. Download the training set and unzip into training_data/
  3. Download the validation set and unzip into validation_data/

The expected structure inside each folder:

  • .mat and .hea file pairs for each recording (e.g., A00001.mat, A00001.hea)
  • REFERENCE.csv contains the correct rhythm label for each recording
  • RECORDS.txt lists all recording filenames and their identifiers
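Loading a recording and its label might look like the sketch below. The function names are illustrative; the `'val'` key is the standard PhysioNet 2017 .mat layout, and the label map follows the class table above:

```python
import numpy as np
import pandas as pd
from scipy.io import loadmat

def load_recording(mat_path):
    """PhysioNet 2017 .mat files store the waveform under the 'val' key."""
    return loadmat(mat_path)["val"].squeeze().astype(np.float32)

def load_labels(csv_path):
    """REFERENCE.csv has two columns: record name and rhythm symbol."""
    df = pd.read_csv(csv_path, header=None, names=["record", "symbol"])
    label_map = {"A": 0, "N": 1, "O": 2, "~": 3}
    df["label"] = df["symbol"].map(label_map)
    return df
```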

Model

The baseline model is the 1D ResNet architecture by Andreotti et al. (2017), adapted from the 34-layer ResNet proposed by Rajpurkar et al. for arrhythmia detection:

  • Architecture: 34-layer 1D ResNet with residual skip connections.
  • Filters: 16 convolutional filters per layer.
  • Input: Raw Z-score normalized ECG segment, fixed length of 9,000 samples (30 seconds at 300 Hz), zero-padded or truncated as needed.
  • Output: Softmax over 4 rhythm classes.
  • Pre-trained weights: Provided as ResNet_30s_34lay_16conv.hdf5.

The model operates directly on raw ECG data without hand-crafted features such as HRV. Pre-trained weights are sourced from the original open-source release, allowing this project to focus entirely on adversarial attacks and defenses.
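A minimal preprocessing sketch, assuming Z-score normalization is applied before padding/truncation (the notebook may order these steps differently):

```python
import numpy as np

TARGET_LEN = 9000  # 30 s at 300 Hz

def preprocess(sig, target_len=TARGET_LEN):
    """Z-score normalize a raw ECG segment, then zero-pad or truncate
    it to the fixed input length expected by the ResNet."""
    sig = np.asarray(sig, dtype=np.float32)
    sig = (sig - sig.mean()) / (sig.std() + 1e-8)
    if len(sig) >= target_len:
        return sig[:target_len]
    out = np.zeros(target_len, dtype=np.float32)
    out[: len(sig)] = sig
    return out
```

Normalizing before padding keeps the zero padding neutral with respect to the normalized signal.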


Pipeline

The notebook is organized into the following sequential steps:

Step   Description
 1     Configuration — file paths, preprocessing parameters, PGD hyperparameters
 2     Imports & GPU verification — TensorFlow 2.14.0, MirroredStrategy multi-GPU setup
 3     Helper functions — .mat ECG loader, Z-score normalization, label mapping
 4     Metrics & visualization — per-class F1-score tables and confusion matrix heatmaps
 5     Data loading — balanced validation set + balanced training subset (max 300/class)
 6     Model loading — baseline ResNet compiled with Legacy Adam optimizer
 7     PGD attack function — targeted iterative gradient ascent with ε-ball projection
 8     Phase 1 — Baseline evaluation — clean and adversarial performance of the original model
 9     Phase 2 — Static Adversarial Training — fine-tune for 75 epochs on a fixed adversarial dataset → ResNet_ROBUST_AF_PGD.hdf5
10     Static defense evaluation — confusion matrix + epsilon sweep stress test
11     Phase 3 — Adaptive Adversarial Training — 150 epochs with fresh attacks generated per epoch → ResNet_ROBUST_ADAPTIVE.hdf5
12     Final evaluation — confusion matrix on clean data + robustness curve across epsilon values
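The epsilon sweep stress test (steps 10 and 12) can be sketched framework-agnostically. Here `model_predict` and `attack_fn` are hypothetical stand-ins for the notebook's model inference and PGD routine:

```python
import numpy as np

def epsilon_sweep(model_predict, attack_fn, x_af,
                  epsilons=(0.05, 0.1, 0.2, 0.3)):
    """Stress test: rerun the attack at several perturbation budgets and
    record how often adversarial AF samples are still classified as AF."""
    recalls = {}
    for eps in epsilons:
        adv = [attack_fn(x, eps) for x in x_af]
        preds = [model_predict(a) for a in adv]
        recalls[eps] = float(np.mean([p == 0 for p in preds]))  # label 0 = AF
    return recalls
```

Plotting the resulting recall values against epsilon yields a robustness curve like the one in figures/final_robustness_curve.png.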

Attack Configuration

A targeted white-box PGD attack is implemented. The adversary has full access to the model's gradients and explicitly optimizes the input perturbation to force AF samples to be predicted as Normal Sinus Rhythm.

Parameter      Value    Description
EPSILON        0.2      Maximum perturbation budget (L∞ norm per time step)
EPS_STEP       0.02     Gradient step size per PGD iteration
MAX_ITER       20       Number of PGD iterations (10 during adaptive training)
Target class   1 (N)    Model is forced to predict Normal Sinus Rhythm

The perturbation magnitude (ε = 0.2) is intentionally chosen to mimic plausible high-frequency sensor artifacts or electromyographic (EMG) interference, preserving the global signal morphology and clinical plausibility to a human observer.
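A framework-agnostic sketch of the targeted PGD loop. `grad_fn` stands in for the gradient of the target-class loss with respect to the input, which the notebook would obtain via tf.GradientTape; because the attack is targeted, each step descends that loss:

```python
import numpy as np

def targeted_pgd(x, grad_fn, epsilon=0.2, eps_step=0.02, max_iter=20):
    """Targeted PGD: step toward the target class by descending its loss,
    projecting the perturbation back into the L-infinity epsilon-ball
    around the original signal after every iteration."""
    x = np.asarray(x, dtype=np.float32)
    x_adv = x.copy()
    for _ in range(max_iter):
        g = grad_fn(x_adv)                     # d(target-class loss)/d(input)
        x_adv = x_adv - eps_step * np.sign(g)  # signed gradient descent step
        x_adv = np.clip(x_adv, x - epsilon, x + epsilon)  # ε-ball projection
    return x_adv
```

With EPS_STEP = 0.02 and MAX_ITER = 20, the cumulative step budget (0.4) exceeds ε = 0.2, so the projection is what actually bounds the perturbation.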


Defense Strategies

Defense 1 — Static Adversarial Training

Adversarial examples are generated once using the baseline model and stored as a fixed training set. The model is then fine-tuned on a balanced mixture of clean and adversarial samples.

  • Training data per epoch: 300 Normal + 300 Other + 46 Noisy + 150 clean AF + 150 adversarial AF = 946 samples
  • Epochs: 75
  • Optimizer: Adam (Legacy), lr = 1e-4
  • Class weights: Applied to penalize misclassification of minority classes
  • Output model: ResNet_ROBUST_AF_PGD.hdf5

Defense 2 — Adaptive Adversarial Training

Adversarial examples are regenerated at the start of every epoch using the current model weights. This forces the model to learn generalized robustness against the PGD attack process rather than memorizing specific noise patterns.

  • Same balanced composition as Defense 1
  • Epochs: 150
  • Robust AF recall on the validation set is monitored every epoch as the primary robustness indicator
  • Output model: ResNet_ROBUST_ADAPTIVE.hdf5
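The per-epoch regeneration logic can be sketched as follows; `attack_fn` and `train_step` are hypothetical stand-ins for the notebook's PGD routine and training step:

```python
def adaptive_adversarial_training(model, clean_af, other_classes,
                                  attack_fn, train_step, epochs=150):
    """Regenerate adversarial AF examples from the *current* weights at
    the start of every epoch, then train on the balanced mix of the
    other classes, clean AF, and freshly attacked AF samples."""
    history = []
    for epoch in range(epochs):
        adv_af = [attack_fn(model, x) for x in clean_af]  # fresh attack per epoch
        batch = other_classes + clean_af + adv_af
        history.append(train_step(model, batch))
    return history
```

Because the attack is re-run against a moving target, the model cannot simply memorize one fixed set of perturbations, which is exactly the failure mode of the static defense.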

Key Results

Phase                           AF F1 — Clean data   AF F1 — Adversarial (ε = 0.2)
Baseline (original model)       0.7342               0.1818 (collapse)
Static Defense (75 epochs)      0.8333               0.9111
Adaptive Defense (150 epochs)   0.8667               robust across all ε

Key observations:

  • The baseline model's AF F1-score collapses from 0.73 to 0.18 under the PGD attack, with the majority of AF samples misclassified as Normal.
  • The static defense achieves an AF F1 of 0.91 on known attacks but fails against fresh adversarial examples generated from its own updated weights, demonstrating that it memorizes rather than generalizes.
  • The adaptive defense achieves the highest clean-data performance (0.87), surpassing both the baseline and the static defense, while maintaining high robustness across all tested epsilon values. Adversarial training acted as a powerful regularizer that smoothed decision boundaries and improved performance on ambiguous AF samples.

Requirements

pip install tensorflow==2.14.0 numpy pandas scipy matplotlib seaborn scikit-learn

Important: TensorFlow 2.14.0 is required for CUDA 11 compatibility. Newer TensorFlow versions (≥ 2.15) require CUDA 12+.

Recommended setup using Conda:

conda create -n ecg-adv python=3.10
conda activate ecg-adv
pip install tensorflow==2.14.0 numpy pandas scipy matplotlib seaborn scikit-learn jupyter

Or using Micromamba:

micromamba create -n ecg-adv python=3.10
micromamba activate ecg-adv
pip install tensorflow==2.14.0 numpy pandas scipy matplotlib seaborn scikit-learn jupyter

How to Run

  1. Clone the repository:

    git clone https://github.com/HSTE21/ECG-Adversarial-Robustness.git
    cd ECG-Adversarial-Robustness
  2. Install dependencies (see Requirements).

  3. Download the dataset and place the files in training_data/ and validation_data/ (see Dataset).

  4. Launch the notebook:

    jupyter notebook "Adversarial Robustness and Retraining V5.ipynb"
  5. Run all cells sequentially. Each phase builds on the previous one.

To skip retraining and use the pre-trained defense models directly, ensure ResNet_ROBUST_AF_PGD.hdf5 and ResNet_ROBUST_ADAPTIVE.hdf5 are present in the root directory and navigate to the relevant evaluation cells.


Hardware Notes

Experiments were conducted on a Dell R750 cluster node equipped with dual NVIDIA A10 GPUs (24 GB VRAM each, Compute Capability 8.6). The notebook uses tf.distribute.MirroredStrategy for parallelized gradient computation across both GPUs.

Component          Specification
GPUs               2× NVIDIA A10 (24 GB each)
TensorFlow         2.14.0
CUDA               11.x
Optimizer          Adam (Legacy, for multi-GPU stability)
Adaptive runtime   ~10 minutes (150 epochs, ~2 s/epoch)

CPU-only execution is supported but will be significantly slower, particularly during the adaptive training phase due to repeated on-the-fly adversarial example generation.


References

This research builds upon the following foundational works:

  1. He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep Residual Learning for Image Recognition. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

  2. Rajpurkar, P., Hannun, A. Y., Haghpanahi, M., Bourn, C., & Ng, A. Y. (2017). Cardiologist-Level Arrhythmia Detection with Convolutional Neural Networks. arXiv preprint arXiv:1707.01836.

  3. Andreotti, F., Carr, O., Pimentel, M. A. F., Mahdi, A., & De Vos, M. (2017). Comparing Feature-Based Classifiers and Convolutional Neural Networks to Detect Arrhythmia from Short Segments of ECG. Computing in Cardiology (CinC).

  4. Goodfellow, I. J., Shlens, J., & Szegedy, C. (2015). Explaining and Harnessing Adversarial Examples. In International Conference on Learning Representations (ICLR).

  5. Madry, A., Makelov, A., Schmidt, L., Tsipras, D., & Vladu, A. (2018). Towards Deep Learning Models Resistant to Adversarial Attacks. In International Conference on Learning Representations (ICLR).

  6. Goldberger, A. L., Amaral, L. A., Glass, L., & Hausdorff, J. M. (2000). PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation, 101(23), e215–e220.

  7. Clifford, G. D., Liu, C., Moody, B., Lehman, L. H., Silva, I., Li, Q., Johnson, A. E. W., & Mark, R. G. (2017). AF Classification from a Short Single Lead ECG Recording: The PhysioNet/Computing in Cardiology Challenge 2017. Computing in Cardiology (CinC).


License

This project is licensed under the GNU General Public License v3.0 (GPLv3). See LICENSE for details.

Summary: You are free to use, modify, and distribute this code, provided that:

  • Any derivative work is also released under GPLv3
  • The original copyright and license are preserved
  • Changes are documented

For more information, visit GNU GPLv3.
