This is an implementation of the paper "On Concept-Based Explanations in Deep Neural Networks" https://arxiv.org/abs/1910.07969. This specific implementation applies the ConceptSHAP technique to BERT and other transformer-based language models via the Huggingface Transformers library. This implementation was developed by members of Machine Learning @ Berkeley for Intuit's Machine Learning Futures Group in Spring 2020.
## Setup

```bash
git clone https://github.com/arnav-gudibande/intuit-project.git
pip3 install -r requirements.txt
```
## Repository Structure

### `data`

- `data/imdb-dataloader.py` -- dataloader for the IMDB Movie Sentiment Dataset; contains options to format test/train data
- `data/20news-dataloader.py` -- dataloader for the 20NewsGroups dataset
### `model`

- `bert-20news.py` and `bert-imdb.py` -- training scripts for the Huggingface BERT language model
- `bert_inference.py` -- outputs embeddings generated from a trained transformer model for a target dataset
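A script like `bert_inference.py` typically has to pool token-level transformer outputs into a single vector per example before they can be clustered or fed to ConceptSHAP. A minimal sketch of masked mean pooling (function name, shapes, and the pooling choice are assumptions for illustration, not the repo's actual code):

```python
import numpy as np

def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token embeddings over the sequence, ignoring padded positions.

    token_embeddings: (batch, seq_len, hidden) array of transformer outputs
    attention_mask:   (batch, seq_len) array of 1s for real tokens, 0s for padding
    """
    mask = attention_mask[:, :, None].astype(token_embeddings.dtype)  # (batch, seq_len, 1)
    summed = (token_embeddings * mask).sum(axis=1)                    # (batch, hidden)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)                    # avoid divide-by-zero
    return summed / counts

# Example: one sentence, 3 token slots (last is padding), hidden size 2
emb = np.array([[[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]])
mask = np.array([[1, 1, 0]])
print(mean_pool(emb, mask))  # → [[2. 3.]]
```

The padded position contributes nothing to the average because it is zeroed out by the mask before summing.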
### `clustering`

- `generateClusters.py` -- k-means clustering of output embeddings
  - Note: this was discarded from the initial ConceptSHAP paper, but can still be used to test classical unsupervised methods against ConceptSHAP
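As a baseline of the kind `generateClusters.py` provides, k-means over saved embeddings can be sketched as follows (using scikit-learn; the helper name and parameters are illustrative, and the repo's script may differ):

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_embeddings(embeddings: np.ndarray, n_clusters: int = 5, seed: int = 0):
    """Group sentence embeddings into n_clusters; returns labels and centroids."""
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = km.fit_predict(embeddings)
    return labels, km.cluster_centers_

# Toy example: two well-separated blobs standing in for two "concepts"
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.1, (20, 2)), rng.normal(5.0, 0.1, (20, 2))])
labels, centers = cluster_embeddings(X, n_clusters=2)
```

Each cluster centroid can then be inspected by looking at the sentences whose embeddings fall nearest to it, which is the classical unsupervised counterpart to a learned concept.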
### `conceptSHAP`

- `conceptNet.py` -- trainable subclass that learns concepts
- `train_eval.py` -- training script for `conceptNet.py`
- `interpretConcepts.py` -- post-training concept analysis and tensorboard plotting
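ConceptSHAP scores each learned concept by its Shapley value with respect to a completeness-style value function over subsets of concepts. A minimal exact-Shapley sketch over a generic value function (the repo computes this on top of the trained `conceptNet.py`; the names and the toy value function here are illustrative only):

```python
from itertools import combinations
from math import factorial

def shapley_values(n_concepts, value_fn):
    """Exact Shapley values for n_concepts players under value_fn(subset) -> float.

    Enumerates every subset, so this is only feasible for small n_concepts;
    the paper uses the same weighting but over sampled/approximated subsets.
    """
    n = n_concepts
    phi = [0.0] * n
    for i in range(n):
        others = [p for p in range(n) if p != i]
        for k in range(len(others) + 1):
            for S in combinations(others, k):
                # Standard Shapley weight |S|! (n - |S| - 1)! / n!
                weight = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                phi[i] += weight * (value_fn(frozenset(S) | {i}) - value_fn(frozenset(S)))
    return phi

# Additive toy value function: each concept contributes a fixed amount,
# so its Shapley value equals exactly that contribution.
contrib = {0: 0.5, 1: 0.3, 2: 0.2}
v = lambda S: sum(contrib[c] for c in S)
print(shapley_values(3, v))  # ≈ [0.5, 0.3, 0.2]
```

In the actual method, `value_fn` would measure how much of the model's accuracy is recovered when predictions are reconstructed from only the given subset of concept directions.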
## IMDB Workflow

- Download and format the IMDB dataset: `sh data/imdb-dataloader.sh`
- Train the BERT model on IMDB: `sh model/bert-imdb.sh`
- Generate and save BERT embeddings: `sh model/bert-inference_imdb.sh`
- Run ConceptSHAP: `sh conceptSHAP/train_eval_imdb.sh`
## 20NewsGroups Workflow

- Download and format 20News: `sh data/20news-dataloader.sh`
- Train the BERT model on 20News: `python3 model/bert-20news.py`
- Generate and save BERT embeddings: `sh model/bert-inference_20news.sh`
- Run ConceptSHAP: `sh conceptSHAP/train_eval_20news.sh`
To view training curves and concept plots:

```bash
tensorboard --logdir=runs --port=6006
```