Adapter-BERT: Parameter-Efficient Transfer Learning for NLP.

Overview

Adapter-BERT

Introduction

This repository contains a version of BERT that can be trained using adapters. Our ICML 2019 paper contains a full description of this technique: Parameter-Efficient Transfer Learning for NLP.

Adapters allow one to train a model to solve new tasks, but adjust only a few parameters per task. This technique yields compact models that share many parameters across tasks, whilst performing similarly to fine-tuning the entire model independently for every task.

The code here is forked from the original BERT repo. It provides our version of BERT with adapters, and the capability to train it on the GLUE tasks. For additional details on BERT, and support for additional tasks, see the original repo.

Tuning BERT with Adapters

The following command provides an example of tuning with adapters on GLUE.

Fine-tuning may be run on a GPU with at least 12GB of RAM, or a Cloud TPU. The same constraints apply as for full fine-tuning of BERT. For additional details, and instructions on downloading a pre-trained checkpoint and the GLUE tasks, see https://github.com/google-research/bert.

export BERT_BASE_DIR=/path/to/bert/uncased_L-12_H-768_A-12
export GLUE_DIR=/path/to/glue

python run_classifier.py \
  --task_name=MRPC \
  --do_train=true \
  --do_eval=true \
  --data_dir=$GLUE_DIR/MRPC \
  --vocab_file=$BERT_BASE_DIR/vocab.txt \
  --bert_config_file=$BERT_BASE_DIR/bert_config.json \
  --init_checkpoint=$BERT_BASE_DIR/bert_model.ckpt \
  --max_seq_length=128 \
  --train_batch_size=32 \
  --learning_rate=3e-4 \
  --num_train_epochs=5.0 \
  --output_dir=/tmp/adapter_bert_mrpc/

You should see an output like this:

***** Eval results *****
  eval_accuracy = 0.85784316
  eval_loss = 0.48347527
  global_step = 573
  loss = 0.48347527

This means that the Dev set accuracy was 85.78%. Small sets like MRPC have a high variance in the Dev set accuracy, even when starting from the same pre-training checkpoint. Therefore results may deviate from this by 2%.

Citation

Please use the following citation for this work:

@inproceedings{houlsby2019parameter,
  title = {Parameter-Efficient Transfer Learning for {NLP}},
  author = {Houlsby, Neil and Giurgiu, Andrei and Jastrzebski, Stanislaw and Morrone, Bruna and De Laroussilhe, Quentin and Gesmundo, Andrea and Attariyan, Mona and Gelly, Sylvain},
  booktitle = {Proceedings of the 36th International Conference on Machine Learning},
  year = {2019},
}

The paper is uploaded to ArXiv.

Disclaimer

This is not an official Google product.

Contact information

For personal communication, please contact Neil Houlsby ([email protected]).

Owner
Google Research
Google Research
Model Zoo of BDD100K Dataset

Model Zoo of BDD100K Dataset

ETH VIS Group 200 Dec 27, 2022
Simulation of the solar system using various nummerical methods

solar-system Simulation of the solar system using various nummerical methods Download the repo Make shure matplotlib, scipy etc. are installed execute

Caspar 7 Jul 15, 2022
Segmentation-Aware Convolutional Networks Using Local Attention Masks

Segmentation-Aware Convolutional Networks Using Local Attention Masks [Project Page] [Paper] Segmentation-aware convolution filters are invariant to b

144 Jun 29, 2022
IEGAN — Official PyTorch Implementation Independent Encoder for Deep Hierarchical Unsupervised Image-to-Image Translation

IEGAN — Official PyTorch Implementation Independent Encoder for Deep Hierarchical Unsupervised Image-to-Image Translation Independent Encoder for Deep

30 Nov 05, 2022
Deep Occlusion-Aware Instance Segmentation with Overlapping BiLayers [CVPR 2021]

Deep Occlusion-Aware Instance Segmentation with Overlapping BiLayers [BCNet, CVPR 2021] This is the official pytorch implementation of BCNet built on

Lei Ke 434 Dec 01, 2022
Nicely is a real-time Feedback and Intervention Program Depression is a prevalent issue across all age groups, socioeconomic classes, and cultural identities.

Nicely is a real-time Feedback and Intervention Program Depression is a prevalent issue across all age groups, socioeconomic classes, and cultural identities.

1 Jan 16, 2022
Scalable implementation of Lee / Mykland (2012) and Ait-Sahalia / Jacod (2012) Jump tests for noisy high frequency data

JumpDetectR Name of QuantLet : JumpDetectR Published in : 'To be published as "Jump dynamics in high frequency crypto markets"' Description : 'Scala

LvB 12 Jan 01, 2023
Extreme Lightwegith Portrait Segmentation

Extreme Lightwegith Portrait Segmentation Please go to this link to download code Requirements python 3 pytorch = 0.4.1 torchvision==0.2.1 opencv-pyt

HYOJINPARK 59 Dec 16, 2022
《Rethinking Sptil Dimensions of Vision Trnsformers》(2021)

Rethinking Spatial Dimensions of Vision Transformers Byeongho Heo, Sangdoo Yun, Dongyoon Han, Sanghyuk Chun, Junsuk Choe, Seong Joon Oh | Paper NAVER

NAVER AI 224 Dec 27, 2022
Official implementation of "Generating 3D Molecules for Target Protein Binding"

Generating 3D Molecules for Target Protein Binding This is the official implementation of the GraphBP method proposed in the following paper. Meng Liu

DIVE Lab, Texas A&M University 74 Dec 07, 2022
Code for ICLR 2020 paper "VL-BERT: Pre-training of Generic Visual-Linguistic Representations".

VL-BERT By Weijie Su, Xizhou Zhu, Yue Cao, Bin Li, Lewei Lu, Furu Wei, Jifeng Dai. This repository is an official implementation of the paper VL-BERT:

Weijie Su 698 Dec 18, 2022
[BMVC 2021] Official PyTorch Implementation of Self-supervised learning of Image Scale and Orientation Estimation

Self-Supervised Learning of Image Scale and Orientation Estimation (BMVC 2021) This is the official implementation of the paper "Self-Supervised Learn

Jongmin Lee 17 Nov 10, 2022
Official implementation for the paper "SAPE: Spatially-Adaptive Progressive Encoding for Neural Optimization".

SAPE Project page Paper Official implementation for the paper "SAPE: Spatially-Adaptive Progressive Encoding for Neural Optimization". Environment Cre

36 Dec 09, 2022
Doosan robotic arm, simulation, control, visualization in Gazebo and ROS2 for Reinforcement Learning.

Robotic Arm Simulation in ROS2 and Gazebo General Overview This repository includes: First, how to simulate a 6DoF Robotic Arm from scratch using GAZE

David Valencia 12 Jan 02, 2023
Re-implement CycleGAN in Tensorlayer

CycleGAN_Tensorlayer Re-implement CycleGAN in TensorLayer Original CycleGAN Improved CycleGAN with resize-convolution Prerequisites: TensorLayer Tenso

89 Aug 15, 2022
Multiwavelets-based operator model

Multiwavelet model for Operator maps Gaurav Gupta, Xiongye Xiao, and Paul Bogdan Multiwavelet-based Operator Learning for Differential Equations In Ne

Gaurav 33 Dec 04, 2022
A general framework for inferring CNNs efficiently. Reduce the inference latency of MobileNet-V3 by 1.3x on an iPhone XS Max without sacrificing accuracy.

GFNet-Pytorch (NeurIPS 2020) This repo contains the official code and pre-trained models for the glance and focus network (GFNet). Glance and Focus: a

Rainforest Wang 169 Oct 28, 2022
PyTorch implementation of VAGAN: Visual Feature Attribution Using Wasserstein GANs

Prototypical Networks for Few shot Learning in PyTorch Simple alternative Implementation of Prototypical Networks for Few Shot Learning (paper, code)

Orobix 93 Aug 17, 2022
Adds timm pretrained backbone to pytorch's FasterRcnn model

Operating Systems Lab (ETCS-352) Experiments for Operating Systems Lab (ETCS-352) performed by me in 2021 at uni. All codes are written by me except t

Mriganka Nath 12 Dec 03, 2022
An Open-Source Tool for Automatic Disease Diagnosis..

OpenMedicalChatbox An Open-Source Package for Automatic Disease Diagnosis. Overview Due to the lack of open source for existing RL-base automated diag

8 Nov 08, 2022