NLP ROAR Interpretability
Official code for: Evaluating the Faithfulness of Importance Measures in NLP by Recursively Masking Allegedly Important Tokens and Retraining
Install
git clone https://github.com/AndreasMadsen/nlp-roar-interpretability.git
cd nlp-roar-interpretability
python -m pip install -e .
Experiments
Tasks
There are scripts for each dataset. Note that some tasks share a dataset. Use this list to identify how to train a model for each task.
- SST: python experiments/stanford_sentiment.py
- SNLI: python experiments/stanford_nli.py
- IMDB: python experiments/imdb.py
- MIMIC (Diabetes): python experiments/mimic.py --subset diabetes
- MIMIC (Anemia): python experiments/mimic.py --subset anemia
- bABI-1: python experiments/babi.py --task 1
- bABI-2: python experiments/babi.py --task 2
- bABI-3: python experiments/babi.py --task 3
Parameters
The scripts above (stanford_sentiment, stanford_nli, imdb, mimic, and babi) all take the same set of CLI arguments. Run any of them with --help to see every argument. The most important arguments for reproducing the experiments presented in the paper are:
- --importance-measure: which importance measure to use. It can be either random, mutual-information, attention, gradient, or integrated-gradient.
- --seed: the seed used to initialize the model.
- --roar-strategy: whether ROAR masking is done in absolute terms (count) or relative terms (quantile).
- --k: the proportion of tokens to mask, in %, if --roar-strategy quantile is used; the number of tokens to mask if --roar-strategy count is used.
- --recursive: indicates that the model used for computing the importance measure has --k set to --k - --recursive-step-size, instead of 0 as in classic ROAR.
Note: for --k > 0, the reference model must already be trained. For example, in the non-recursive case, this means that a model trained with --k 0 must already be available.
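The ordering constraint above can be sketched in Python. This is a minimal illustration, not part of the repository: the helper names reference_k and roar_commands are hypothetical, and it assumes that --recursive is a plain flag, that --recursive-step-size takes an integer, that the recursive flags are only needed for --k > 0, and that in the recursive case every --k is a multiple of the step size. Check --help before relying on any of this.

```python
from typing import List, Optional


def reference_k(k: int, recursive: bool, step_size: int) -> Optional[int]:
    """Return the --k of the model that must already be trained.

    Hypothetical helper. Returns None for the --k 0 baseline,
    which needs no reference model.
    """
    if k <= 0:
        return None
    if recursive:
        # Recursive ROAR: the reference model is the previous step.
        # Assumes k is a multiple of step_size.
        return k - step_size
    # Classic ROAR: the reference model is always the baseline.
    return 0


def roar_commands(
    script: str,
    measure: str,
    seed: int,
    strategy: str,
    ks: List[int],
    recursive: bool = False,
    step_size: int = 10,
) -> List[List[str]]:
    """Build one command per --k value, in an order where each
    reference model is trained before the run that needs it."""
    commands = []
    # Always include the baseline (--k 0) and run in ascending order.
    for k in sorted(set(ks) | {0}):
        cmd = [
            "python", script,
            "--importance-measure", measure,
            "--seed", str(seed),
            "--roar-strategy", strategy,
            "--k", str(k),
        ]
        if recursive and k > 0:
            # Assumption: the flag form shown here matches the CLI.
            cmd += ["--recursive", "--recursive-step-size", str(step_size)]
        commands.append(cmd)
    return commands


if __name__ == "__main__":
    for cmd in roar_commands(
        "experiments/imdb.py", "gradient", 0, "quantile",
        [10, 20], recursive=True, step_size=10,
    ):
        print(" ".join(cmd))
```

Running the printed commands from top to bottom trains the baseline first, so each later run finds the checkpoint it depends on.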
Running on an HPC setup
For downloading dataset dependencies we provide a download.sh script.
Additionally, we provide scripts for submitting all jobs to a Slurm queue in batch_jobs/. Note again that the ROAR scripts assume there are checkpoints for the baseline --k 0 models.
The jobs automatically use $SCRATCH/nlproar as the persistent dir.
MIMIC
See https://mimic.physionet.org/gettingstarted/access/ for how to access MIMIC. You will need to download DIAGNOSES_ICD.csv.gz and NOTEEVENTS.csv.gz and place them in mimic/ relative to your persistent dir.
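Before starting a MIMIC run, it may help to confirm that both files are where the scripts expect them. The following is a small sketch, not part of the repository: missing_mimic_files is a hypothetical helper, and the demo at the bottom assumes the persistent dir is $SCRATCH/nlproar as in the batch jobs.

```python
import os
from pathlib import Path
from typing import List

# The two files named above, expected under <persistent dir>/mimic/.
MIMIC_FILES = ("DIAGNOSES_ICD.csv.gz", "NOTEEVENTS.csv.gz")


def missing_mimic_files(persistent_dir: Path) -> List[str]:
    """Return the names of required MIMIC files that are not present
    in the mimic/ directory under persistent_dir."""
    mimic_dir = Path(persistent_dir) / "mimic"
    return [name for name in MIMIC_FILES if not (mimic_dir / name).is_file()]


if __name__ == "__main__":
    scratch = os.environ.get("SCRATCH")
    if scratch is None:
        print("SCRATCH is not set; pass your persistent dir explicitly.")
    else:
        missing = missing_mimic_files(Path(scratch) / "nlproar")
        print("missing:", missing if missing else "none")
```

An empty list means both files were found.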
