TransPrompt - Towards an Automatic Transferable Prompting Framework for Few-shot Text Classification

Overview

TransPrompt

This code is implement for our EMNLP 2021's paper 《TransPrompt:Towards an Automatic Transferable Prompting Framework for Few-shot Text Classification》.

Our proposed TransPrompt is motivated by the join of prompt-tuning and cross-task transfer learning. The aim is to explore and exploit the transferable knowledge from similar tasks in the few-shot scenario, and make the Pre-trained Language Model (PLM) better few-shot transfer learner. Our proposed framework is accepted by the main conference (long paper track) in EMNLP-2021. This code is the default multi-GPU version. We will teach you how to use our code in the following parts.

Ps: We also commit the same code in Alibaba EasyTransfer.

1. Data Preparation

We follow PET to use the same dataset. Please run the scripts to download the data:

sh data/download_data.sh

or manually download the dataset from https://nlp.cs.princeton.edu/projects/lm-bff/datasets.tar.

Then you will obtain a new director data/original

Our work has two kind of scenario, such as single-task and cross-task. Different kind scenario has corresponding splited examples. Defaultly, we generate few-shot learning examples, you can also generate full data by edit the parameter (-scene=full). We only demostrate the few-shot data generation.

1.1 Single-task Few-shot

Please run the scripts to obtain the single-task few-shot examples:

python3 data_utils/generate_k_shot_data.py --scene few-shot --k 16

Then you will obtain a new folder data/k-shot-single

1.2 Cross-task Few-shot

Run the scripts

python3 data_utils/generate_k_shot_cross_task_data.py --scene few-shot --k 16

and you will obtain a new folder data/k-shot-cross

After the generation, the similar tasks will be divided into the same group. We have three groups:

  • Group1 (Sentiment Analysis): SST-2, MR, CR
  • Group2 (Natural Language Inference): MNLI, SNLI
  • Group3 (Paraphrasing): MRPC, QQP

2. Have a Training Games

Please follow our papers, we have mask following experiments:

  • Single-task few-shot learning: It is the same as LM-BFF and P-tuning, we prompt-tune the PLM only on one task.
  • Cross-task few-shot learning: We mix up the similar task in group. At first, we prompt-tune the PLM on cross-task data, then we prompt-tune on each task again. For the Cross-task Learning, we have two cross-task method:
  • (Cross-)Task Adaptation: In one group, we prompt-tune on all the tasks, and then evaluate on each task both in few-shot scenario.
  • (Cross-)Task Generalization: In one group, we randomly choose one task for few-shot evaluation (do not used for training), others are used for prompt-tuning.

2.1 Single-task few-shot learning

Take MRPC as an example, please run:

CUDA_VISIBLE_DEVICES=0 sh scripts/run_single_task.sh

figure1.png

2.2 Cross-task few-shot Learning (Task Adaptaion)

Take Group1 as an example, please run the scripts:

CUDA_VISIBLE_DEVICES=0 sh scripts/run_cross_task_adaptation.sh

figure2.png

2.3 Cross-task few-shot Learning (Task Generalization)

Also take Group1 as an example, please run the scripts: Ps: the unseen task is SST-2.

CUDA_VISIBLE_DEVICES=0 sh scripts/run_cross_task_generalization.sh

figure3.png

Citation

Our paper citation is:

@inproceedings{DBLP:conf/emnlp/0001WQH021,
  author    = {Chengyu Wang and
               Jianing Wang and
               Minghui Qiu and
               Jun Huang and
               Ming Gao},
  editor    = {Marie{-}Francine Moens and
               Xuanjing Huang and
               Lucia Specia and
               Scott Wen{-}tau Yih},
  title     = {TransPrompt: Towards an Automatic Transferable Prompting Framework
               for Few-shot Text Classification},
  booktitle = {Proceedings of the 2021 Conference on Empirical Methods in Natural
               Language Processing, {EMNLP} 2021, Virtual Event / Punta Cana, Dominican
               Republic, 7-11 November, 2021},
  pages     = {2792--2802},
  publisher = {Association for Computational Linguistics},
  year      = {2021},
  url       = {https://aclanthology.org/2021.emnlp-main.221},
  timestamp = {Tue, 09 Nov 2021 13:51:50 +0100},
  biburl    = {https://dblp.org/rec/conf/emnlp/0001WQH021.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

Acknowledgement

The code is developed based on pet. We appreciate all the authors who made their code public, which greatly facilitates this project. This repository would be continuously updated.

Owner
WangJianing
My name is Wang Jianing.Nowadays I am a postgraduate of East China Normal University in Shanghai.My research field is Machine Learning;Deep Learning and NLP
WangJianing
This repository contains the source code and data for reproducing results of Deep Continuous Clustering paper

Deep Continuous Clustering Introduction This is a Pytorch implementation of the DCC algorithms presented in the following paper (paper): Sohil Atul Sh

Sohil Shah 197 Nov 29, 2022
This repository contains the map content ontology used in narrative cartography

Narrative-cartography-ontology This repository contains the map content ontology used in narrative cartography, which is associated with a submission

Weiming Huang 0 Oct 31, 2021
This repository contains the DendroMap implementation for scalable and interactive exploration of image datasets in machine learning.

DendroMap DendroMap is an interactive tool to explore large-scale image datasets used for machine learning. A deep understanding of your data can be v

DIV Lab 33 Dec 30, 2022
This is the code for Compressing BERT: Studying the Effects of Weight Pruning on Transfer Learning

This is the code for Compressing BERT: Studying the Effects of Weight Pruning on Transfer Learning It includes /bert, which is the original BERT repos

Mitchell Gordon 11 Nov 15, 2022
Highway networks implemented in PyTorch.

PyTorch Highway Networks Highway networks implemented in PyTorch. Just the MNIST example from PyTorch hacked to work with Highway layers. Todo Make th

Conner Vercellino 56 Dec 14, 2022
Code for the Higgs Boson Machine Learning Challenge organised by CERN & EPFL

A method to solve the Higgs boson challenge using Least Squares - Novae This project is the Project 1 of EPFL CS-433 Machine Learning. The project is

Giacomo Orsi 1 Nov 09, 2021
This repository contains the source codes for the paper AtlasNet V2 - Learning Elementary Structures.

AtlasNet V2 - Learning Elementary Structures This work was build upon Thibault Groueix's AtlasNet and 3D-CODED projects. (you might want to have a loo

Théo Deprelle 123 Nov 11, 2022
PyTorch implementation of DirectCLR from paper Understanding Dimensional Collapse in Contrastive Self-supervised Learning

DirectCLR DirectCLR is a simple contrastive learning model for visual representation learning. It does not require a trainable projector as SimCLR. It

Meta Research 49 Dec 21, 2022
Retinal vessel segmentation based on GT-UNet

Retinal vessel segmentation based on GT-UNet Introduction This project is a retinal blood vessel segmentation code based on UNet-like Group Transforme

Kent0n 27 Dec 18, 2022
Towards Improving Embedding Based Models of Social Network Alignment via Pseudo Anchors

PSML paper: Towards Improving Embedding Based Models of Social Network Alignment via Pseudo Anchors PSML_IONE,PSML_ABNE,PSML_DEEPLINK,PSML_SNNA: numpy

13 Nov 27, 2022
Source code for CIKM 2021 paper for Relation-aware Heterogeneous Graph for User Profiling

RHGN Source code for CIKM 2021 paper for Relation-aware Heterogeneous Graph for User Profiling Dependencies torch==1.6.0 torchvision==0.7.0 dgl==0.7.1

Big Data and Multi-modal Computing Group, CRIPAC 6 Nov 29, 2022
Modular Probabilistic Programming on MXNet

MXFusion | | | | Tutorials | Documentation | Contribution Guide MXFusion is a modular deep probabilistic programming library. With MXFusion Modules yo

Amazon 100 Dec 10, 2022
A Fast Sequence Transducer Implementation with PyTorch Bindings

transducer A Fast Sequence Transducer Implementation with PyTorch Bindings. The corresponding publication is Sequence Transduction with Recurrent Neur

Awni Hannun 184 Dec 18, 2022
Runtime type annotations for the shape, dtype etc. of PyTorch Tensors.

torchtyping Type annotations for a tensor's shape, dtype, names, ... Turn this: def batch_outer_product(x: torch.Tensor, y: torch.Tensor) - torch.Ten

Patrick Kidger 1.2k Jan 03, 2023
Replication attempt for the Protein Folding Model

RGN2-Replica (WIP) To eventually become an unofficial working Pytorch implementation of RGN2, an state of the art model for MSA-less Protein Folding f

Eric Alcaide 36 Nov 29, 2022
Enigma-Plus - Python based Enigma machine simulator with some extra features

Enigma-Plus Python based Enigma machine simulator with some extra features Examp

1 Jan 05, 2022
Python port of R's Comprehensive Dynamic Time Warp algorithm package

Welcome to the dtw-python package Comprehensive implementation of Dynamic Time Warping algorithms. DTW is a family of algorithms which compute the loc

Dynamic Time Warping algorithms 154 Dec 26, 2022
3rd place solution for the Weather4cast 2021 Stage 1 Challenge

weather4cast2021_Stage1 3rd place solution for the Weather4cast 2021 Stage 1 Challenge Dependencies The code can be executed from a fresh environment

5 Aug 14, 2022
My Body is a Cage: the Role of Morphology in Graph-Based Incompatible Control

My Body is a Cage: the Role of Morphology in Graph-Based Incompatible Control

yobi byte 29 Oct 09, 2022