PyTorch implementation of the ACL 2021 paper Parameter-efficient Multi-task Fine-tuning for Transformers via Shared Hypernetworks.

Last update: Dec 28, 2022

Parameter-efficient Multi-task Fine-tuning for Transformers via Shared Hypernetworks

This repo contains the PyTorch implementation of the ACL 2021 paper Parameter-efficient Multi-task Fine-tuning for Transformers via Shared Hypernetworks.

Installation

```
python setup.py install
```

How to run the models

We provide an example script for each model in the `hyperformer/scripts/` folder, with the corresponding config files in `hyperformer/configs`. To run the models, first `cd hyperformer`, then use one of the commands below.

To run the hyperformer++ model (this model generates the task-specific adapters with a single hypernetwork that is shared across both the tasks and the layers of a transformer):
```
bash scripts/hyperformer++.sh
```
To run the hyperformer model (this model generates the task-specific adapters with a hypernetwork that is shared across the tasks but is specific to each layer of a transformer, which makes it less efficient than hyperformer++):
```
bash scripts/hyperformer.sh
```
To run the adapter† model (this model shares the layer normalization between adapters across the tasks and trains the adapters in a multi-task setting):
```
bash scripts/adapters_dagger.sh   
```
To run the adapter model (this model trains one adapter per task in a single-task learning setup):
```
bash scripts/adapters.sh 
```
To run the T5 fine-tuning model in a multi-task learning setup:
```
bash scripts/finetune.sh
```
To run the T5 fine-tuning model in a single-task learning setup:
```
bash scripts/finetune_single_task.sh
```
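The efficiency gap between hyperformer and hyperformer++ described above can be illustrated with a rough parameter count. This is only a back-of-the-envelope sketch, not code from this repo: it assumes a bias-free linear hypernetwork that maps an embedding to the down- and up-projection weights of one adapter, and every function name and dimension below is a hypothetical choice made for illustration.

```python
def adapter_weights(d_model: int, bottleneck: int) -> int:
    """Weights in one adapter: down- plus up-projection, biases ignored (assumption)."""
    return 2 * d_model * bottleneck


def per_layer_hypernets(num_tasks, num_layers, d_model, bottleneck, embed_dim):
    """hyperformer-style count: one hypernetwork per layer, shared across tasks."""
    hypernet = embed_dim * adapter_weights(d_model, bottleneck)
    task_embeddings = num_tasks * embed_dim
    return num_layers * hypernet + task_embeddings


def shared_hypernet(num_tasks, num_layers, d_model, bottleneck, embed_dim):
    """hyperformer++-style count: one hypernetwork shared across tasks and layers."""
    hypernet = embed_dim * adapter_weights(d_model, bottleneck)
    task_embeddings = num_tasks * embed_dim
    # Assumed: each layer gets its own embedding so the shared hypernetwork
    # can still produce different adapters for different layers.
    layer_embeddings = num_layers * embed_dim
    return hypernet + task_embeddings + layer_embeddings


if __name__ == "__main__":
    # Hypothetical sizes, chosen only to make the comparison concrete.
    sizes = dict(num_tasks=8, num_layers=12, d_model=512, bottleneck=24, embed_dim=64)
    print("per-layer hypernetworks:", per_layer_hypernets(**sizes))
    print("single shared hypernetwork:", shared_hypernet(**sizes))
```

Under these assumptions the per-layer variant grows linearly with the number of layers, while the shared variant only adds one small embedding per layer.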

We ran all the models on 4 GPUs, but this is not necessary: the models can also be run on 1 GPU. To run on one GPU, remove the `-m torch.distributed.launch --nproc_per_node=4` part from all the scripts.
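As a small illustration of the single-GPU edit described above, the sketch below strips the distributed-launch arguments from a command string. The helper name `to_single_gpu` and the example command (`train.py`, `config.json`) are hypothetical; the actual commands are the ones inside the scripts in `scripts/`, and editing those by hand works just as well.

```python
import re


def to_single_gpu(command: str) -> str:
    """Remove the `-m torch.distributed.launch --nproc_per_node=N` part of a command."""
    pattern = r"\s*-m\s+torch\.distributed\.launch\s+--nproc_per_node=\d+"
    return re.sub(pattern, "", command)


if __name__ == "__main__":
    # Hypothetical multi-GPU command, shaped like the launch line the text refers to.
    multi_gpu = "python -m torch.distributed.launch --nproc_per_node=4 train.py config.json"
    print(to_single_gpu(multi_gpu))
```

Commands that do not contain the launcher arguments are returned unchanged.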

Bibliography

If you find this repo useful, please cite our paper.

@inproceedings{karimi2021parameterefficient,
  title={Parameter-efficient Multi-task Fine-tuning for Transformers via Shared Hypernetworks},
  author={Karimi Mahabadi, Rabeeh and Ruder, Sebastian and Dehghani, Mostafa and Henderson, James},
  booktitle={Annual Meeting of the Association for Computational Linguistics},
  year={2021}
}

Final words

I hope this repo is useful for your research. For any questions, please create an issue or email [email protected], and I will get back to you as soon as possible.

Owner

Rabeeh Karimi Mahabadi
