Research code for "What to Pre-Train on? Efficient Intermediate Task Selection", EMNLP 2021

efficient-task-transfer

This repository contains code for the experiments in our paper "What to Pre-Train on? Efficient Intermediate Task Selection". Most importantly, this includes scripts for easy training of Transformers and Adapters across a wide range of NLU tasks.

Overview

The repository is structured as follows:

  • itrain holds the itrain package, which allows easy setup, training and evaluation of Transformers and Adapters
  • run_configs provides default training configurations for all tasks currently supported by itrain
  • training_scripts provides scripts for sequential adapter fine-tuning and adapter fusion as used in the paper
  • task_selection provides scripts used for intermediate task selection in the paper

Setup & Requirements

The code in this repository was developed using Python v3.6.8, PyTorch v1.7.1 and adapter-transformers v1.1.1, which is based on HuggingFace Transformers v3.5.1. Using versions different from the ones specified might not work.

After setting up Python and PyTorch (ideally in a virtual environment), all additional requirements together with the itrain package can be installed using:

pip install -e .

Additional setup steps required for running some scripts are detailed in the respective locations below.
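
To check whether an environment matches the versions listed above, a small script along the following lines may help. This is a minimal sketch and not part of itrain: the helper names are hypothetical, the distribution names "torch" and "adapter-transformers" are assumed to match the names used on PyPI, and on Python 3.6 the importlib_metadata backport is assumed to be installed.

```python
try:
    from importlib import metadata  # standard library from Python 3.8 on
except ImportError:
    import importlib_metadata as metadata  # backport, assumed installed on 3.6/3.7

# Versions stated in this README. Distribution names are an assumption.
PINNED = {"torch": "1.7.1", "adapter-transformers": "1.1.1"}


def installed_versions(names):
    """Return {name: version or None} for each distribution name."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def find_mismatches(pinned, installed):
    """Return {name: (expected, actual)} for packages that are missing or differ."""
    mismatches = {}
    for name, expected in pinned.items():
        actual = installed.get(name)
        # Ignore local build suffixes such as "+cu110" when comparing.
        if actual is None or actual.split("+")[0] != expected:
            mismatches[name] = (expected, actual)
    return mismatches


if __name__ == "__main__":
    result = find_mismatches(PINNED, installed_versions(PINNED))
    for name, (expected, actual) in result.items():
        print(f"{name}: expected {expected}, found {actual}")
```

An empty result only means that the pinned versions are installed; it does not guarantee that every script runs.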

Transformer & Adapter Training

The itrain package provides a simple interface for configuring Transformer and Adapter training runs. itrain provides tools for:

  • downloading and preprocessing datasets via HuggingFace datasets
  • setting up Transformer and Adapter training
  • training and evaluating on different tasks
  • notifying on training start and results via mail or Telegram

itrain can be invoked from the command line by passing a run configuration file in JSON format. Example configurations for all currently supported tasks can be found in the run_configs folder. All supported configuration keys are defined in arguments.py.

Running a setup from the command line can look like this:

itrain --id 42 run_configs/sst2.json

This will train an adapter on the SST-2 task using roberta-base as the base model (as specified in the config file).

Besides modifying configuration keys directly in the JSON file, they can be overridden using command line parameters. For example, we can modify the previous training run to fully fine-tune a bert-base-uncased model:

itrain --id <run_id> \
    --model_name_or_path bert-base-uncased \
    --train_adapter false \
    --learning_rate 3e-5 \
    --num_train_epochs 3 \
    --patience 0 \
    run_configs/<task>.json
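
When the same overrides have to be applied to several tasks, the command above can also be assembled programmatically. The following is a minimal sketch and not part of itrain: build_itrain_command and run_itrain are hypothetical helpers, and they only rely on the command line shape shown above (--id, --<key> <value> pairs, then the config file).

```python
import subprocess


def build_itrain_command(run_id, config_path, overrides=None):
    """Build the argument list for one itrain call (hypothetical helper)."""
    cmd = ["itrain", "--id", str(run_id)]
    for key, value in (overrides or {}).items():
        # Booleans are written in lowercase, as in "--train_adapter false".
        if isinstance(value, bool):
            value = "true" if value else "false"
        cmd += [f"--{key}", str(value)]
    # The run configuration file comes last, as in the examples above.
    cmd.append(config_path)
    return cmd


def run_itrain(run_id, config_path, overrides=None):
    """Run itrain as a subprocess and raise if it exits with an error."""
    subprocess.run(build_itrain_command(run_id, config_path, overrides), check=True)


if __name__ == "__main__":
    full_fine_tuning = {
        "model_name_or_path": "bert-base-uncased",
        "train_adapter": False,
        "learning_rate": 3e-5,
        "num_train_epochs": 3,
        "patience": 0,
    }
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(build_itrain_command(42, "run_configs/sst2.json", full_fine_tuning)))
```

Passing the arguments as a list avoids shell quoting issues when paths contain spaces.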

Alternatively, training setups can be configured directly in Python by using the Setup class of itrain. An example of this is given in example.py.

Intermediate Task Transfer & Task Selection Experiments

Some of the scripts used to run the experiments presented in "What to Pre-Train on? Efficient Intermediate Task Selection" are provided:

  • See training_scripts for details on intermediate task transfer using sequential fine-tuning or adapter fusion.
  • See task_selection for details on intermediate task selection methods.

All these scripts rely on pre-trained models/adapters as described above and on the following additional setup.

Setup

We used a configuration file to specify the pre-trained models/adapters and tasks to be used as transfer sources and transfer targets for different task transfer strategies and task selection methods. The full configuration as used in the paper is given in task_map.json. To use self-trained models/adapters, the following keys have to be modified:

  • from and to specify which tasks are used as transfer sources and transfer targets (names as defined in run_configs)
  • source_path_format and target_path_format specify templates for the locations of pre-trained models/adapters
  • adapters provides a mapping from pre-trained (source) models/adapters to run ids

Finally, the path to this task map and the folder holding the run configurations have to be made available to the scripts:

export RUN_CONFIG_DIR="/path/to/run_configs"
export DEFAULT_TASK_MAP="/path/to/task_map.json"
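
Before starting longer experiments, it can be useful to verify that both variables are set and point to existing locations. The following is a minimal sketch and not part of the repository: check_environment and run_config_path are hypothetical helpers, and the <task>.json naming is assumed from the examples above.

```python
import os
from pathlib import Path

# Environment variables named in this README.
REQUIRED_VARS = ("RUN_CONFIG_DIR", "DEFAULT_TASK_MAP")


def check_environment(env=None):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    env = os.environ if env is None else env
    problems = [f"{name} is not set" for name in REQUIRED_VARS if not env.get(name)]
    if env.get("RUN_CONFIG_DIR") and not Path(env["RUN_CONFIG_DIR"]).is_dir():
        problems.append("RUN_CONFIG_DIR does not point to a directory")
    if env.get("DEFAULT_TASK_MAP") and not Path(env["DEFAULT_TASK_MAP"]).is_file():
        problems.append("DEFAULT_TASK_MAP does not point to a file")
    return problems


def run_config_path(task, env=None):
    """Return the expected run configuration path for a task name (assumed <task>.json)."""
    env = os.environ if env is None else env
    return Path(env["RUN_CONFIG_DIR"]) / f"{task}.json"


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```

The check only covers the environment; it does not validate the contents of the task map.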

Credits

Citation

If you find this repository helpful, please cite our paper "What to Pre-Train on? Efficient Intermediate Task Selection":

@inproceedings{poth-etal-2021-what-to-pre-train-on,
    title = "What to Pre-Train on? Efficient Intermediate Task Selection",
    author = "Clifton Poth and Jonas Pfeiffer and Andreas Rücklé and Iryna Gurevych",
    booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP)",
    month = nov,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/2104.08247",
    pages = "to appear",
}