code for the ICLR'22 paper: On Robust Prefix-Tuning for Text Classification

Overview

On Robust Prefix-Tuning for Text Classification

Prefix-tuning has drawed much attention as it is a parameter-efficient and modular alternative to adapting pretrained language models to downstream tasks. However, we find that prefix-tuning suffers from adversarial attacks. While, unfortunately, current robust NLP methods are unsuitable for prefix-tuning as they will inevitably hamper the modularity of prefix-tuning. In our ICLR'22 paper, we propose robust prefix-tuning for text classification. Our method leverages the idea of test-time tuning, which preserves the strengths of prefix-tuning and improves its robustness at the same time. This repository contains the code for the proposed robust prefix-tuning method.

Prerequisite

PyTorch>=1.2.0, pytorch-transformers==1.2.0, OpenAttack==2.0.1, and GPUtil==1.4.0.

Train the original prefix P_θ

For the training phase of standard prefix-tuning, the command is:

  source train.sh --preseqlen [A] --learning_rate [B] --tasks [C] --n_train_epochs [D] --device [E]

where

  • [A]: The length of the prefix P_θ.
  • [B]: The (initial) learning rate.
  • [C]: The benchmark. Default: sst.
  • [D]: The total epochs during training.
  • [E]: The id of the GPU to be used.

We can also use adversarial training to improve the robustness of the prefix. For the training phase of adversarial prefix-tuning, the command is:

  source train_adv.sh --preseqlen [A] --learning_rate [B] --tasks [C] --n_train_epochs [D] --device [E] --pgd_ball [F]

where

  • [A]~[E] have the same meanings with above.
  • [F]: where norm ball is word-wise or sentence-wise.

Note that the DATA_DIR and MODEL_DIR in train_adv.sh are different from those in train.sh. When experimenting with the adversarially trained prefix P_θ's in the following steps, remember to switch the DATA_DIR and MODEL_DIR in the corresponding scripts as well.

Generate Adversarial Examples

We use the OpenAttack package to generate in-sentence adversaries. The command is:

  source generate_adv_insent.sh --preseqlen [A] --learning_rate [B] --tasks [C] --device [E] --test_ep [G] --attack [H]

where

  • [A],[B],[C],[E] have the same meanings with above.
  • [G]: Load the prefix P_θ parameters trained for [G] epochs for testing. We set G=D.
  • [H]: Generate adversarial examples based on clean test set with the in-sentence attack [H].

We also implement the Universal Adversarial Trigger attack. The command is:

  source generate_adv_uat.sh --preseqlen [A] --learning_rate [B] --tasks [C] --device [E] --test_ep [G] --attack clean-[H2] --uat_len [I] --uat_epoch [J]

where

  • [A],[B],[C],[E],[G] have the same meanings with above.
  • [H2]: We should search for UATs for each class in the benchmark, and H2 indicates the class id. H2=0/1 for SST, 0/1/2/3 for AG News, and 0/1/2 for SNLI.
  • [I]: The length of the UAT.
  • [J]: The epochs for exploiting UAT.

Test the performance of P_θ

The command for performance testing of P_θ under clean data and in-sentence attacks is:

  source test_prefix_theta_insent.sh --preseqlen [A] --learning_rate [B] --tasks [C] --device [E] --test_ep [G] --attack [H] --test_batch_size [K]

Under UAT attack, the test command is:

  source test_prefix_theta_uat.sh --preseqlen [A] --learning_rate [B] --tasks [C] --device [E] --test_ep [G] --attack clean --uat_len [I] --test_batch_size [K]

where

  • [A]~[I] have the same meanings with above.
  • [K]: The test batch size. when K=0, the batch size is adaptive (determined by GPU memory); when K>0, the batch size is fixed.

Robust Prefix P'_ψ: Constructing the canonical manifolds

By constructing the canonical manifolds with PCA, we get the projection matrices. The command is:

  source get_proj.sh --preseqlen [A] --learning_rate [B] --tasks [C] --device [E] --test_ep [G]

where [A]~[G] have the same meanings with above.

Robust Prefix P'_ψ: Test its performance

Under clean data and in-sentence attacks, the command is:

  source test_robust_prefix_psi_insent.sh --preseqlen [A] --learning_rate [B] --tasks [C] --device [E] --test_ep [G] --attack [H] --test_batch_size [K] --PMP_lr [L] --PMP_iter [M]

Under UAT attack, the test command is:

  source test_robust_prefix_psi_uat.sh --preseqlen [A] --learning_rate [B] --tasks [C] --device [E] --test_ep [G] --attack clean --uat_len [I] --test_batch_size [K] --PMP_lr [L] --PMP_iter [M]

where

  • [A]~[K] have the same meanings with above.
  • [L]: The learning rate for test-time P'_ψ tuning.
  • [M]: The iterations for test-time P'_ψ tuning.

Running Example

# Train the original prefix P_θ
source train.sh --tasks sst --n_train_epochs 100 --device 0
source train_adv.sh --tasks sst --n_train_epochs 100 --device 1 --pgd_ball word

# Generate Adversarial Examples
source generate_adv_insent.sh --tasks sst --device 0 --test_ep 100 --attack bug
source generate_adv_uat.sh --tasks sst --device 0 --test_ep 100 --attack clean-0 --uat_len 3 --uat_epoch 10
source generate_adv_uat.sh --tasks sst --device 0 --test_ep 100 --attack clean-1 --uat_len 3 --uat_epoch 10

# Test the performance of P_θ
source test_prefix_theta_insent.sh --tasks sst --device 0 --test_ep 100 --attack bug --test_batch_size 0
source test_prefix_theta_uat.sh --tasks sst --device 0 --test_ep 100 --attack clean --uat_len 3 --test_batch_size 0

# Robust Prefix P'_ψ: Constructing the canonical manifolds
source get_proj.sh --tasks sst --device 0 --test_ep 100

# Robust Prefix P'_ψ: Test its performance
source test_robust_prefix_psi_insent.sh --tasks sst --device 0 --test_ep 100 --attack bug --test_batch_size 0 --PMP_lr 0.15 --PMP_iter 10
source test_robust_prefix_psi_uat.sh --tasks sst --device 0 --test_ep 100 --attack clean --uat_len 3 --test_batch_size 0 --PMP_lr 0.05 --PMP_iter 10

Released Data & Models

The training the original prefix P_θ and the process of generating adversarial examples can be time-consuming. As shown in our paper, the adversarial prefix-tuning is particularly slow. Efforts need to be paid on generating adversaries as well, since different attacks are to be performed on the test set based on each trained prefix. We also found that OpenAttack is now upgraded to v2.1.1, which causes compatibility issues in our codes (test_prefix_theta_insent.py).

In order to facilitate research on the robustness of prefix-tuning, we release the prefix checkpoints P_θ (with both std. and adv. training), the processed test sets that are perturbed by in-sentence attacks (including PWWS and TextBugger), as well as the generated projection matrices of the canonical manifolds in our runs for reproducibility and further enhancement. We have also hard-coded the exploited UAT tokens in test_prefix_theta_uat.py and test_robust_prefix_psi_uat.py. All the materials can be found here.

Acknowledgements:

The implementation of robust prefix tuning is based on the LAMOL repo, which is the code of LAMOL: LAnguage MOdeling for Lifelong Language Learning that studies NLP lifelong learning with GPT-style pretrained language models.

Bibtex

If you find this repository useful for your research, please consider citing our work:

@inproceedings{
  yang2022on,
  title={On Robust Prefix-Tuning for Text Classification},
  author={Zonghan Yang and Yang Liu},
  booktitle={International Conference on Learning Representations},
  year={2022},
  url={https://openreview.net/forum?id=eBCmOocUejf}
}
Owner
Zonghan Yang
Graduate student in Tsinghua University. Two drifters, off to see the world - there's such a lot of world to see...
Zonghan Yang
A memory-efficient implementation of DenseNets

efficient_densenet_pytorch A PyTorch =1.0 implementation of DenseNets, optimized to save GPU memory. Recent updates Now works on PyTorch 1.0! It uses

Geoff Pleiss 1.4k Dec 25, 2022
An efficient framework for reinforcement learning.

rl: An efficient framework for reinforcement learning Requirements Introduction PPO Test Requirements name version Python =3.7 numpy =1.19 torch =1

16 Nov 30, 2022
This is the source code for our ICLR2021 paper: Adaptive Universal Generalized PageRank Graph Neural Network.

GPRGNN This is the source code for our ICLR2021 paper: Adaptive Universal Generalized PageRank Graph Neural Network. Hidden state feature extraction i

Jianhao 92 Jan 03, 2023
Transformers are Graph Neural Networks!

🚀 Gated Graph Transformers Gated Graph Transformers for graph-level property prediction, i.e. graph classification and regression. Associated article

Chaitanya Joshi 46 Jun 30, 2022
The repository forked from NVlabs uses our data. (Differentiable rasterization applied to 3D model simplification tasks)

nvdiffmodeling [origin_code] Differentiable rasterization applied to 3D model simplification tasks, as described in the paper: Appearance-Driven Autom

Qiujie (Jay) Dong 2 Oct 31, 2022
CellRank's reproducibility repository.

CellRank's reproducibility repository We believe that reproducibility is key and have made it as simple as possible to reproduce our results. Please e

Theis Lab 8 Oct 08, 2022
Predicting Semantic Map Representations from Images with Pyramid Occupancy Networks

This is the code associated with the paper Predicting Semantic Map Representations from Images with Pyramid Occupancy Networks, published at CVPR 2020.

Thomas Roddick 219 Dec 20, 2022
This repo provides a demo for the CVPR 2021 paper "A Fourier-based Framework for Domain Generalization" on the PACS dataset.

FACT This repo provides a demo for the CVPR 2021 paper "A Fourier-based Framework for Domain Generalization" on the PACS dataset. To cite, please use:

105 Dec 17, 2022
SPRING is a seq2seq model for Text-to-AMR and AMR-to-Text (AAAI2021).

SPRING This is the repo for SPRING (Symmetric ParsIng aNd Generation), a novel approach to semantic parsing and generation, presented at AAAI 2021. Wi

Sapienza NLP group 98 Dec 21, 2022
Language-Agnostic Website Embedding and Classification

Homepage2Vec Language-Agnostic Website Embedding and Classification based on Curlie labels https://arxiv.org/pdf/2201.03677.pdf Homepage2Vec is a pre-

25 Dec 27, 2022
Repository for paper "Non-intrusive speech intelligibility prediction from discrete latent representations"

Non-Intrusive Speech Intelligibility Prediction from Discrete Latent Representations Official repository for paper "Non-Intrusive Speech Intelligibili

Alex McKinney 5 Oct 25, 2022
WSDM2022 Challenge - Large scale temporal graph link prediction

WSDM 2022 Large-scale Temporal Graph Link Prediction - Baseline and Initial Test Set WSDM Cup Website link Link to this challenge This branch offers A

Deep Graph Library 34 Dec 29, 2022
Chunkmogrify: Real image inversion via Segments

Chunkmogrify: Real image inversion via Segments Teaser video with live editing sessions can be found here This code demonstrates the ideas discussed i

David Futschik 112 Jan 04, 2023
Discord bot for notifying on github events

Git-Observer Discord bot for notifying on github events ⚠️ This bot is meant to write messages to only one channel (implementing this for multiple pro

ilu_vatar_ 0 Apr 19, 2022
Neural network pruning for finding a sparse computational model for controlling a biological motor task.

MothPruning Scientific Overview Originally inspired by biological nervous systems, deep neural networks (DNNs) are powerful computational tools for mo

Olivia Thomas 0 Dec 14, 2022
🌾 PASTIS 🌾 Panoptic Agricultural Satellite TIme Series

🌾 PASTIS 🌾 Panoptic Agricultural Satellite TIme Series (optical and radar) The PASTIS Dataset Dataset presentation PASTIS is a benchmark dataset for

86 Jan 04, 2023
Real-time object detection on Android using the YOLO network with TensorFlow

TensorFlow YOLO object detection on Android Source project android-yolo is the first implementation of YOLO for TensorFlow on an Android device. It is

Nataniel Ruiz 624 Jan 03, 2023
HSC4D: Human-centered 4D Scene Capture in Large-scale Indoor-outdoor Space Using Wearable IMUs and LiDAR. CVPR 2022

HSC4D: Human-centered 4D Scene Capture in Large-scale Indoor-outdoor Space Using Wearable IMUs and LiDAR. CVPR 2022 [Project page | Video] Getting sta

51 Nov 29, 2022
This repository contains the DendroMap implementation for scalable and interactive exploration of image datasets in machine learning.

DendroMap DendroMap is an interactive tool to explore large-scale image datasets used for machine learning. A deep understanding of your data can be v

DIV Lab 33 Dec 30, 2022
A curated list of awesome Machine Learning frameworks, libraries and software.

Awesome Machine Learning A curated list of awesome machine learning frameworks, libraries and software (by language). Inspired by awesome-php. If you

Joseph Misiti 57.1k Jan 03, 2023