SeqAttack: a framework for adversarial attacks on token classification models

Last update: Nov 25, 2022

Related tags

Overview

SeqAttack: a framework for adversarial attacks on token classification models

SeqAttack is a framework for conducting adversarial attacks against Named Entity Recognition (NER) models and for data augmentation. This library is heavily based on the popular TextAttack framework, and can similarly be used for:

Understanding models by running adversarial attacks against them and observing their shortcomings
Develop new attack strategies
Guided data augmentation, generating additional training samples that can be used to fix a model's shortcomings

The SeqAttack paper is available here.

Setup

Run pip install -r requirements.txt and you're good to go! If you want to run experiments on a fresh virtual machine, check out scripts/gcp.sh which installs all system dependencies for running the code.

The code was tested with python 3.7, if you're using a different version your mileage may vary.

Usage

The main features of the framework are available via the command line interface, wrapped by cli.py. The following subsections describe the usage of the various commands.

Attack

Attacks are executed via the python cli.py attack subcommand. Attack commands are split in two parts:

General setup: options common to all adversarial attacks (e.g. model, dataset...)
Attack specific setup: options specific to a particular attack strategy

Thus, a typical attack command might look like the following:

python cli.py attack [general-options] attack-recipe [recipe-options]

For example, if we want to attack dslim/bert-base-NER, a NER model trained on CoNLL2003 using deepwordbug as the attack strategy we might run:

python cli.py attack                                            \
       --model-name dslim/bert-base-NER                         \
       --output-path output-dataset.json                        \
       --cache                                                  \
       --dataset-config configs/conll2003-config.json           \
       deepwordbug

The dataset configuration file, configs/conll2003-config.json defines:

The dataset path or name (in the latter case it will be downloaded from HuggingFace)
The split (e.g. train, test). Only for HuggingFace datasets
The human-readable names (a mapping between numerical labels and textual labels), given as a list
A labels map, used to remap the dataset's ground truth to align it with the model output as needed. This field can be null if no remapping is needed

In the example above, labels_map is used to align the dataset labels to the output from dslim/bert-base-NER. The dataset labels are the following:

O (0), B-PER (1), I-PER (2), B-ORG (3), I-ORG (4) B-LOC (5), I-LOC (6) B-MISC (7), I-MISC (8)

while the model labels are:

O (0), B-MISC (1), I-MISC (2), B-PER (3), I-PER (4) B-ORG (5), I-ORG (6) B-LOC (7), I-LOC (8)

Thus a remapping is needed and labels_map takes care of it.

The available attack strategies are the following:

Attack Strategy	Transformation	Constraints	Paper
BAE	word swap	USE sentence cosine similarity	https://arxiv.org/abs/2004.01970
BERT-Attack	word swap	USE sentence cosine similarity, Maximum words perturbed	https://arxiv.org/abs/2004.09984
CLARE	word swap and insertion	USE sentence cosine similarity	https://arxiv.org/abs/2009.07502
DeepWordBug	character insertion, deletion, swap (ab --> ba) and substitution	Levenshtein edit distance	https://arxiv.org/abs/1801.04354
Morpheus	inflection word swap		https://www.aclweb.org/anthology/2020.acl-main.263.pdf
SCPN	paraphrasing		https://www.aclweb.org/anthology/N18-1170
TextFooler	word swap	USE sentence cosine similarity, POS match, word-embedding distance	https://arxiv.org/abs/1907.11932

The table above is based on this table. In addition to the constraints shown above the attack strategies are also forbidden from modifying and inserting named entities by default.

Evaluation

To evaluate a model against a standard dataset run:

python cli.py evaluate                  \
       --model dslim/bert-base-NER      \
       --dataset conll2003              \
       --split test                     \
       --mode strict                    \

To evaluate the effectivenes of an attack run the following command:

python cli.py evaluate                                  \
       --model dslim/bert-base-NER                      \
       --attacked-dataset experiments/deepwordbug.json  \
       --mode strict                                    \

The above command will compute and display the metrics for the original predictions and their adversarial counterparts.

The evaluation is based on seqeval

Dataset selection

Given a dataset, our victim model may be able to predict some dataset samples perfectly, but it may produce significant errors on others. To evaluate an attack's effectiveness we may want to select samples with a small initial misprediction score. This can be done via the following command:

python cli.py pick-samples                              \
       --model dslim/bert-base-NER                      \
       --dataset-config configs/conll2003-config.json   \
        --max-samples 256                               \
       --max-initial-score 0.5                          \ # The maximum initial misprediction score
       --output-filename cherry-picked.json             \
       --goal-function untargeted

Tests

Tests can be run with pytest

Adversarial examples visualization

The output datasets can be visualized with SeqAttack-Visualization

SeqAttack: a framework for adversarial attacks on token classification models

Related tags

Overview

SeqAttack: a framework for adversarial attacks on token classification models

Setup

Usage

Attack

Evaluation

Dataset selection

Tests

Adversarial examples visualization

Owner

Walter

Code corresponding to The Introspective Agent: Interdependence of Strategy, Physiology, and Sensing for Embodied Agents

Efficiently Disentangle Causal Representations

PyTorch implementation of Memory-based semantic segmentation for off-road unstructured natural environments.

Deep Image Matting implementation in PyTorch

A stock generator that assess a list of stocks and returns the best stocks for investing and money allocations based on users choices of volatility, duration and number of stocks

[ICCV'21] Neural Radiance Flow for 4D View Synthesis and Video Processing

WSDM‘2022: Knowledge Enhanced Sports Game Summarization

Code for ICMI2020 and ICMI2021 papers: "Studying Person-Specific Pointing and Gaze Behavior for Multimodal Referencing of Outside Objects from a Moving Vehicle" and "ML-PersRef: A Machine Learning-based Personalized Multimodal Fusion Approach for Referencing Outside Objects From a Moving Vehicle"

Source Code and data for my paper titled Linguistic Knowledge in Data Augmentation for Natural Language Processing: An Example on Chinese Question Matching

python debugger and anti-vm that checks if you're in a virtual machine or if someones trying to debug your file

Implementation of Hire-MLP: Vision MLP via Hierarchical Rearrangement and An Image Patch is a Wave: Phase-Aware Vision MLP.

【steal piano】GitHub偷情分析工具！

Source code for the NeurIPS 2021 paper "On the Second-order Convergence Properties of Random Search Methods"

This is an example implementation of the paper "Cross Domain Robot Imitation with Invariant Representation".

ADB-IP-ROTATION - Use your mobile phone to gain a temporary IP address using ADB and data tethering

PyTorch implementation of UPFlow (unsupervised optical flow learning)

Complete* list of autonomous driving related datasets

Finite-temperature variational Monte Carlo calculation of uniform electron gas using neural canonical transformation.

RGBD-Net - This repository contains a pytorch lightning implementation for the 3DV 2021 RGBD-Net paper.

Using deep learning to predict gene structures of the coding genes in DNA sequences of Arabidopsis thaliana