efficient-task-transfer

Research code for "What to Pre-Train on? Efficient Intermediate Task Selection", EMNLP 2021

This repository contains code for the experiments in our paper "What to Pre-Train on? Efficient Intermediate Task Selection". Most importantly, this includes scripts for easy training of Transformers and Adapters across a wide range of NLU tasks.

Overview

The repository is structured as follows:

  • itrain holds the itrain package which allows easy setup, training and evaluation of Transformers and Adapters
  • run_configs provides default training configurations for all tasks currently supported by itrain
  • training_scripts provides scripts for sequential adapter fine-tuning and adapter fusion as used in the paper
  • task_selection provides scripts used for intermediate task selection in the paper

Setup & Requirements

The code in this repository was developed using Python v3.6.8, PyTorch v1.7.1 and adapter-transformers v1.1.1, which is based on HuggingFace Transformers v3.5.1. Using versions different from the ones specified might not work.

After setting up Python and PyTorch (ideally in a virtual environment), all additional requirements together with the itrain package can be installed using:

pip install -e .

Additional setup steps required for running some scripts are detailed below at the respective locations.

Transformer & Adapter Training

The itrain package provides a simple interface for configuring Transformer and Adapter training runs. Specifically, it provides tools for:

  • downloading and preprocessing datasets via HuggingFace datasets
  • setting up Transformer and Adapter training
  • training and evaluating on different tasks
  • notifying on training start and results via mail or Telegram

itrain can be invoked from the command line by passing a run configuration file in JSON format. Example configurations for all currently supported tasks can be found in the run_configs folder. All supported configuration keys are defined in arguments.py.
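To give an idea of what such a file contains, the minimal Python sketch below writes a run configuration to disk. Only the keys shown here are taken from the command-line overrides used later in this README; the values and the file name are placeholders, and the authoritative schema is defined in arguments.py.

import json

# Illustrative sketch only: the keys mirror the command-line overrides shown
# later in this README (model_name_or_path, train_adapter, learning_rate,
# num_train_epochs, patience); the values are placeholders. The full set of
# supported keys is defined in arguments.py.
run_config = {
    "model_name_or_path": "roberta-base",
    "train_adapter": True,
    "learning_rate": 1e-4,
    "num_train_epochs": 15,
    "patience": 4,
}

with open("my_run_config.json", "w") as f:
    json.dump(run_config, f, indent=4)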

Running a setup from the command line can look like this:

itrain --id 42 run_configs/sst2.json

This will train an adapter on the SST-2 task using roberta-base as the base model (as specified in the config file).

Besides modifying configuration keys directly in the JSON file, they can be overridden using command line parameters. E.g., we can modify the previous training run to fully fine-tune a bert-base-uncased model:

itrain --id <run_id> \
    --model_name_or_path bert-base-uncased \
    --train_adapter false \
    --learning_rate 3e-5 \
    --num_train_epochs 3 \
    --patience 0 \
    run_configs/<task>.json

Alternatively, training setups can be configured directly in Python by using the Setup class of itrain. An example for this is given in example.py.
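As a rough illustration only, a programmatic setup might look like the sketch below; the Setup class is named in this README, but the constructor argument and method calls shown here are hypothetical placeholders, so please refer to example.py for the actual interface.

from itrain import Setup

# Hypothetical sketch: the argument and method names below are placeholders,
# not the documented itrain API -- example.py shows the real usage.
setup = Setup(id=42)
# ... attach dataset, model and training arguments here ...
# setup.start()  # placeholder for whatever call launches training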

Intermediate Task Transfer & Task Selection Experiments

Scripts that were used to run the experiments presented in "What to Pre-Train on? Efficient Intermediate Task Selection" are provided:

  • See training_scripts for details on intermediate task transfer using sequential fine-tuning or adapter fusion.
  • See task_selection for details on intermediate task selection methods.

All these scripts rely on pre-trained models/adapters as described above and on the additional setup described below.

Setup

We used a configuration file to specify the pre-trained models/adapters and the tasks to be used as transfer sources and transfer targets for the different task transfer strategies and task selection methods. The full configuration as used in the paper is given in task_map.json. It has to be modified to use self-trained models/adapters (a minimal sketch of its structure follows the list below):

  • from and to specify which tasks are used as transfer sources and transfer targets (names as defined in run_configs)
  • source_path_format and target_path_format specify templates for the locations of pre-trained models/adapters
  • adapters provides a mapping from pre-trained (source) models/adapters to run ids
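The sketch below, written as a Python dict for readability, only illustrates the shape implied by the keys above; all task names, path templates and run ids are placeholders, so consult task_map.json for the exact format expected by the scripts.

# Illustrative sketch of a task map; only the top-level keys come from this
# README, all values are placeholders -- see task_map.json for the real format.
task_map = {
    "from": ["sst2"],                                    # transfer source tasks
    "to": ["<target_task>"],                             # transfer target tasks
    "source_path_format": "/path/to/source/<template>",  # placeholder template
    "target_path_format": "/path/to/target/<template>",  # placeholder template
    "adapters": {"sst2": "<run_id>"},                    # source task -> run id
}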

Finally, the path to this task map and the folder holding the run configurations have to be made available to the scripts:

export RUN_CONFIG_DIR="/path/to/run_configs"
export DEFAULT_TASK_MAP="/path/to/task_map.json"

Credits

Citation

If you find this repository helpful, please cite our paper "What to Pre-Train on? Efficient Intermediate Task Selection":

@inproceedings{poth-etal-2021-what-to-pre-train-on,
    title = "What to Pre-Train on? Efficient Intermediate Task Selection",
    author = "Poth, Clifton and Pfeiffer, Jonas and R{\"u}ckl{\'e}, Andreas and Gurevych, Iryna",
    booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP)",
    month = nov,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/2104.08247",
    pages = "to appear",
}