Repository for "XAI in Action for Effective Data Reduction on Time Series"

mlgig/drXAI

XAI in Action for Effective Data Reduction on Time Series

This repository contains the code and experimental results for our work "XAI in Action for Effective Data Reduction on Time Series" (under review).

Abstract

Explainable AI (XAI) for time series is an important topic and recent efforts in this area have resulted in many new explanation methods. Additionally, new approaches were proposed to evaluate XAI methods, both quantitatively and qualitatively through domain expertise or user studies. Despite progress in XAI for time series, its practical application for achieving measurable benefits remains unclear. This paper bridges this gap by proposing and evaluating a novel, general methodology for XAI-based data reduction on time series (drXAI). We analyse XAI in action on the task of data reduction for time series classification. We first study channel selection for multivariate time series classification (MTSC). We show that drXAI can reduce the data by up to 88% on big datasets (>50 channels), significantly outperforming state-of-the-art channel selection methods. Next, we apply the drXAI algorithm for time point selection in univariate time series classification (UTSC). We show that our XAI-based algorithm can reduce the data by up to 95% in some cases (>4k length), while preserving or improving the classification accuracy. Finally, we show that XAI-selection can be done using fast classifiers (e.g., Minirocket) and fast XAI methods (e.g., Feature Ablation): this enables the use of more complex and accurate algorithms such as ConvTran, which require significant memory resources for long time series. Our experiments show that XAI can be used effectively for data reduction on the TSC task, with significant and measurable benefits, opening a path to applying XAI to other time series tasks.
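As a rough illustration of the kind of fast explainer the abstract mentions, the sketch below scores each channel by ablation: one channel at a time is replaced with a baseline value and the change in the model output is recorded. This is a minimal sketch, not the repository's implementation. The names `ablate_channels` and `toy_predict` are hypothetical, and the constant baseline of 0.0 is an assumption.

```python
from typing import Callable, List, Sequence

Instance = Sequence[Sequence[float]]  # instance[c][t]: value of channel c at time t


def ablate_channels(
    predict: Callable[[Instance], float],
    instance: Instance,
    baseline: float = 0.0,  # assumption: a constant replacement value
) -> List[float]:
    """Score each channel by the output drop when it is replaced by `baseline`."""
    reference = predict(instance)
    scores = []
    for c in range(len(instance)):
        # Copy the instance and overwrite only channel c with the baseline.
        perturbed = [list(channel) for channel in instance]
        perturbed[c] = [baseline] * len(instance[c])
        scores.append(reference - predict(perturbed))
    return scores


def toy_predict(instance: Instance) -> float:
    """Hypothetical stand-in for a classifier score: channel 0 dominates,
    channel 1 contributes a little, channel 2 is ignored."""
    return sum(instance[0]) + 0.1 * sum(instance[1])


example = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
scores = ablate_channels(toy_predict, example)
```

In this toy setting the ignored channel receives a score of zero, which is the signal a selection step can later use to drop it.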

Results

Accuracy, hmean, and the selected features for each classifier and drXAI configuration, on both univariate and multivariate datasets, are available as CSV files in the results directory.

Data

The datasets used, trained models, XAI results, feature selections, and metric results are shared through the Zenodo link here.

Plots

image

drXAI: Channel selection for MTSC using XAI scores computed from time series attributions.

image

An example of elbow cut for channel selection.

Accuracy vs. percentage of data used for the original data D, the SOTA channel selection methods ECP and ECS, and two configurations of drXAI. All results here are for MTSC datasets and the MiniRocket classifier.

image

(a) MTSC datasets having > 50 channels.

image

(b) MTSC datasets where drXAI improves the original Acc.

image

Accuracy vs. percentage of data used for the original data and four drXAI configurations. All results here are for UTSC datasets: three examples use the MiniRocket classifier and one uses ConvTran.

MTSC datasets: acc and hmean grouped by background and explainer, sorted by median value.

image

drXAI variants: acc for MTSC data

image

drXAI variants: hmean for MTSC data

image

Running time (log scale, mins) of FA vs SVS explainers for MTSC data.

UTSC datasets: acc and hmean results grouped by background and explainer.

image

acc by Background.

image

acc by Explainer.

image

hmean by Background.

image

hmean by Explainer.

Code

The code was developed with Python 3.11.13; all required libraries are listed in requirements.txt.

The two executable files are:

get_selection.py

Trains the models, explains them to obtain saliency maps, and derives the corresponding selection.

Arguments:

dataset_dir : folder where datasets are stored

saved_models_path : folder where the models are saved

explainer_results_dir : directory where classifier and attribution information, including the resulting selection, is saved (one file per dataset)

classifier : classifier name, one of hydra, miniRocket or ConvTran

batch_size: batch size for training and explaining

random_seed : random seed to be used for reproducibility

--channel_selection : whether to perform channel selection (if not, time point selection is performed)

--time_points_selection: whether to perform time point selection
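The selection step can be pictured as two stages: aggregate the saliency maps into one score per channel, then cut the ranked scores at an elbow. The sketch below is only an illustration under stated assumptions. The function names are hypothetical, mean absolute attribution is an assumed aggregation, and the elbow rule (largest gap below the chord joining the first and last ranked scores) is a common heuristic, not necessarily the criterion implemented in get_selection.py.

```python
from typing import List, Sequence


def channel_scores(attributions: Sequence[Sequence[Sequence[float]]]) -> List[float]:
    """Mean absolute attribution per channel.

    attributions[i][c][t] is the saliency of time point t in channel c of instance i.
    """
    n_channels = len(attributions[0])
    totals = [0.0] * n_channels
    counts = [0] * n_channels
    for instance in attributions:
        for c, channel in enumerate(instance):
            totals[c] += sum(abs(v) for v in channel)
            counts[c] += len(channel)
    return [total / count for total, count in zip(totals, counts)]


def elbow_cut(scores: Sequence[float], tol: float = 1e-12) -> List[int]:
    """Return the indices kept by an elbow cut on the descending score curve."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranked = [scores[i] for i in order]
    n = len(ranked)
    if n < 3 or ranked[0] == ranked[-1]:
        return sorted(order)  # too few or identical scores: keep everything
    span = ranked[0] - ranked[-1]
    best_k, best_gap = 0, 0.0
    for k, score in enumerate(ranked):
        x = k / (n - 1)                # rank rescaled to [0, 1]
        y = (score - ranked[-1]) / span  # score rescaled to [0, 1]
        gap = (1.0 - x) - y            # how far the curve lies below the chord y = 1 - x
        if gap > best_gap:
            best_k, best_gap = k, gap
    if best_gap <= tol:
        return sorted(order)  # no point lies below the chord: no elbow, keep everything
    return sorted(order[:best_k])  # keep the items ranked before the elbow


toy_attributions = [
    [[1.0, -1.0], [0.1, 0.1]],
    [[2.0, -2.0], [0.0, 0.2]],
]
toy_scores = channel_scores(toy_attributions)
kept = elbow_cut([0.05, 0.9, 0.1, 0.85, 0.02])
```

The same two functions apply to time point selection for univariate series if scores are computed per time point instead of per channel.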

compute_metrics.py

Trains new model(s) on the reduced data, as described in the paper, and computes their metrics.

Arguments:

explanation_dir : dir where explanation results are stored

saved_models_path : folder where models will be saved

dataset_dir : directory where datasets are located

classifier : classifier name, one of hydra, miniRocket or ConvTran

batch_size: batch size for training and explaining

result_path : path where to store new accuracies

--elbow_selections_path : path of the file where the elbow selections are saved (this implicitly determines whether channel or time point selection is performed)
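Before retraining, a saved selection has to be applied to the data. The sketch below shows one way to do that with plain nested lists, along with the "percentage of data used" quantity that appears on the plot axes. It is a minimal sketch: the helper names are hypothetical, the data layout is an assumption, and the percentage is assumed to be the share of channels or time points kept.

```python
from typing import List, Sequence


def select_channels(
    dataset: Sequence[Sequence[Sequence[float]]], keep: Sequence[int]
) -> List[List[List[float]]]:
    """Keep only the listed channels; dataset[i][c][t] is assumed multivariate."""
    return [[list(instance[c]) for c in keep] for instance in dataset]


def select_time_points(
    dataset: Sequence[Sequence[float]], keep: Sequence[int]
) -> List[List[float]]:
    """Keep only the listed time points; dataset[i][t] is assumed univariate."""
    return [[series[t] for t in keep] for series in dataset]


def percentage_used(n_kept: int, n_total: int) -> float:
    """Share of channels (or time points) retained, in percent."""
    return 100.0 * n_kept / n_total


mtsc = [[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]]  # one instance, three channels
reduced_mtsc = select_channels(mtsc, keep=[0, 2])

utsc = [[10.0, 11.0, 12.0, 13.0]]              # one univariate series
reduced_utsc = select_time_points(utsc, keep=[1])
used = percentage_used(n_kept=1, n_total=4)
```

The reduced arrays can then be passed to any of the listed classifiers in place of the original data.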
