Research code for "What to Pre-Train on? Efficient Intermediate Task Selection", EMNLP 2021

Overview

efficient-task-transfer

This repository contains code for the experiments in our paper "What to Pre-Train on? Efficient Intermediate Task Selection". Most importantly, this includes scripts for easy training of Transformers and Adapters across a wide range of NLU tasks.

Overview

The repository is structured as follows:

  • itrain holds the itrain package which allows easy setup, training and evaluation of Transformers and Adapters
  • run_configs provides default training configuration of all tasks currently supported by itrain
  • training_scripts provides scripts for sequential adapter fine-tuning and adapter fusion as used in the paper
  • task_selection provides scripts used for intermediate task selection in the paper

Setup & Requirements

The code in this repository was developed using Python v3.6.8, PyTorch v1.7.1 and adapter-transformers v1.1.1, which is based on HuggingFace Transformers v3.5.1. Using version different from the ones specified might not work.

After setting up Python and PyTorch (ideally in a virtual environment), all additional requirements together with the itrain package can be installed using:

pip install -e .

Additional setup steps required for running some scripts are detailed below locations.

Transformer & Adapter Training

The itrain package provides a simple interface for configuring Transformer and Adapter training runs. itrain provides tools for:

  • downloading and preprocessing datasets via HuggingFace datasets
  • setting up Transformers and Adapter training
  • training and evaluating on different tasks
  • notifying on training start and results via mail or Telegram

itrain can be invoked from the command line by passing a run configuration file in json format. Example configurations for all currently supported tasks can be found in the run_configs folder. All supported configuration keys are defined in arguments.py.

Running a setup from the command line can look like this:

itrain --id 42 run_configs/sst2.json

This will train an adapter on the SST-2 task using robert-base as the base model (as specified in the config file).

Besides modifying configuration keys directly in the json file, they can be overriden using command line parameters. E.g., we can modify the previous training run to fully fine-tune a bert-base-uncased model:

itrain --id <run_id> \
    --model_name_or_path bert-base-uncased \
    --train_adapter false \
    --learning_rate 3e-5 \
    --num_train_epochs 3 \
    --patience 0 \
    run_configs/<task>.json

Alternatively, training setups can be configured directly in Python by using the Setup class of itrain. An example for this is given in example.py.

Intermediate Task Transfer & Task Selection Experiments

Some scripts that helped running experiments presented in "What to Pre-Train on? Efficient Intermediate Task Selection" are provided:

  • See training_scripts for details on intermediate task transfer using sequential fine-tuning or adapter fusion
  • See task_selection for details on intermediate task selection methods.

All these scripts rely on pre-trained models/ adapters as described above and the following additional setup.

Setup

We used a configuration file to specify the pre-trained models/ adapters and tasks to be used as transfer sources and transfer targets for different task transfer strategies and task selection methods. The full configuration as used in the paper is given in task_map.json. It has to be modified to use self-trained models/ adapters:

  • from and to specify which tasks are used as transfer source and transfer targets (names as defined in run_configs)
  • source_path_format and target_path_format specify templates for the locations of pre-trained models/ adapters
  • adapters provides a mapping from pre-trained (source) models/ adapters to run ids

Finally, the path to this task map and the folder holding the run configurations have to be made available to the scripts:

export RUN_CONFIG_DIR="/path/to/run_configs"
export DEFAULT_TASK_MAP="/path/to/task_map.json"

Credits

Citation

If you find this repository helpful, please cite our paper "What to Pre-Train on? Efficient Intermediate Task Selection":

@inproceedings{poth-etal-2021-what-to-pre-train-on,
    title={What to Pre-Train on? Efficient Intermediate Task Selection},
    author={Clifton Poth and Jonas Pfeiffer and Andreas Rücklé and Iryna Gurevych},
    booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP)",
    month = nov,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/2104.08247",
    pages = "to appear",
}
Owner
AdapterHub
AdapterHub
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022

DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism This repository is the official PyTorch implementation of our AAAI-2022 paper, in

Jinglin Liu 829 Jan 07, 2023
The swas programming language

The Swas programming language This is a language that was made for fun. Installation Step 0: Make sure you have python installed Step 1. Clone this re

Swas.py 19 Jul 18, 2022
Code for EMNLP'21 paper "Types of Out-of-Distribution Texts and How to Detect Them"

Code for EMNLP'21 paper "Types of Out-of-Distribution Texts and How to Detect Them"

Udit Arora 19 Oct 28, 2022
Application to help find best train itinerary, uses speech to text, has a spam filter to segregate invalid inputs, NLP and Pathfinding algos.

T-IAI-901-MSC2022 - GROUP 18 Gestion de projet Notre travail a été organisé et réparti dans un Trello. https://trello.com/b/X3s2fpPJ/ia-projet Install

1 Feb 05, 2022
Chinese NewsTitle Generation Project by GPT2.带有超级详细注释的中文GPT2新闻标题生成项目。

GPT2-NewsTitle 带有超详细注释的GPT2新闻标题生成项目 UpDate 01.02.2021 从网上收集数据,将清华新闻数据、搜狗新闻数据等新闻数据集,以及开源的一些摘要数据进行整理清洗,构建一个较完善的中文摘要数据集。 数据集清洗时,仅进行了简单地规则清洗。

logCong 785 Dec 29, 2022
Text preprocessing, representation and visualization from zero to hero.

Text preprocessing, representation and visualization from zero to hero. From zero to hero • Installation • Getting Started • Examples • API • FAQ • Co

Jonathan Besomi 2.7k Jan 08, 2023
Python-zhuyin - An open source Python library that provides a unified interface for converting between Chinese pinyin and Zhuyin (bopomofo)

Python-zhuyin - An open source Python library that provides a unified interface for converting between Chinese pinyin and Zhuyin (bopomofo)

2 Dec 29, 2022
Uses Google's gTTS module to easily create robo text readin' on command.

Tool to convert text to speech, creating files for later use. TTRS uses Google's gTTS module to easily create robo text readin' on command.

0 Jun 20, 2021
CDLA: A Chinese document layout analysis (CDLA) dataset

CDLA: A Chinese document layout analysis (CDLA) dataset 介绍 CDLA是一个中文文档版面分析数据集,面向中文文献类(论文)场景。包含以下10个label: 正文 标题 图片 图片标题 表格 表格标题 页眉 页脚 注释 公式 Text Title

buptlihang 84 Dec 28, 2022
Easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code.

textgenrnn Easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code, or quickly tr

Max Woolf 4.8k Dec 30, 2022
This is the source code of RPG (Reward-Randomized Policy Gradient)

RPG (Reward-Randomized Policy Gradient) Zhenggang Tang*, Chao Yu*, Boyuan Chen, Huazhe Xu, Xiaolong Wang, Fei Fang, Simon Shaolei Du, Yu Wang, Yi Wu (

40 Nov 25, 2022
Implementation of Fast Transformer in Pytorch

Fast Transformer - Pytorch Implementation of Fast Transformer in Pytorch. This only work as an encoder. Yannic video AI Epiphany Install $ pip install

Phil Wang 167 Dec 27, 2022
scikit-learn wrappers for Python fastText.

skift scikit-learn wrappers for Python fastText. from skift import FirstColFtClassifier df = pandas.DataFrame([['woof', 0], ['meow', 1]], colu

Shay Palachy 233 Sep 09, 2022
Awesome-NLP-Research (ANLP)

Awesome-NLP-Research (ANLP)

Language, Information, and Learning at Yale 72 Dec 19, 2022
A fast hierarchical dimensionality reduction algorithm.

h-NNE: Hierarchical Nearest Neighbor Embedding A fast hierarchical dimensionality reduction algorithm. h-NNE is a general purpose dimensionality reduc

Marios Koulakis 35 Dec 12, 2022
Opal-lang - A WIP programming language based on Python

thanks to aphitorite for the beautiful logo! opal opal is a WIP transcompiled pr

3 Nov 04, 2022
Accurately generate all possible forms of an English word e.g "election" --> "elect", "electoral", "electorate" etc.

Accurately generate all possible forms of an English word Word forms can accurately generate all possible forms of an English word. It can conjugate v

Dibya Chakravorty 570 Dec 31, 2022
Constituency Tree Labeling Tool

Constituency Tree Labeling Tool The purpose of this package is to solve the constituency tree labeling problem. Look from the dataset labeled by NLTK,

张宇 6 Dec 20, 2022
Multi-Task Pre-Training for Plug-and-Play Task-Oriented Dialogue System

Multi-Task Pre-Training for Plug-and-Play Task-Oriented Dialogue System Authors: Yixuan Su, Lei Shu, Elman Mansimov, Arshit Gupta, Deng Cai, Yi-An Lai

Amazon Web Services - Labs 124 Jan 03, 2023
Türkçe küfürlü içerikleri bulan bir yapay zeka kütüphanesi / An ML library for profanity detection in Turkish sentences

"Kötü söz sahibine aittir." -Anonim Nedir? sinkaf uygunsuz yorumların bulunmasını sağlayan bir python kütüphanesidir. Farkı nedir? Diğer algoritmalard

KaraGoz 4 Feb 18, 2022