Source code for "FastBERT: a Self-distilling BERT with Adaptive Inference Time".

Overview

FastBERT

Source code for "FastBERT: a Self-distilling BERT with Adaptive Inference Time".

Good News

2021/10/29 - Code: Code of FastPLM is released on both Pypi and Github.

2021/09/08 - Paper: Journal version of FastBERT (FastPLM) is accepted by IEEE TNNLS. "An Empirical Study on Adaptive Inference for Pretrained Language Model".

2020/07/05 - Update: Pypi version of FastBERT has been launched. Please see fastbert-pypi.

Install fastbert with pip

$ pip install fastbert

Requirements

python >= 3.4.0, Install all the requirements with pip.

$ pip install -r requirements.txt

Quick start on the Chinese Book review dataset

Download the pre-trained Chinese BERT parameters from here, and save it to the models directory with the name of "Chinese_base_model.bin".

Run the following command to validate our FastBERT with Speed=0.5 on the Book review datasets.

$ CUDA_VISIBLE_DEVICES="0" python3 -u run_fastbert.py \
        --pretrained_model_path ./models/Chinese_base_model.bin \
        --vocab_path ./models/google_zh_vocab.txt \
        --train_path ./datasets/douban_book_review/train.tsv \
        --dev_path ./datasets/douban_book_review/dev.tsv \
        --test_path ./datasets/douban_book_review/test.tsv \
        --epochs_num 3 --batch_size 32 --distill_epochs_num 5 \
        --encoder bert --fast_mode --speed 0.5 \
        --output_model_path  ./models/douban_fastbert.bin

Meaning of each option.

usage: --pretrained_model_path Path to initialize model parameters.
       --vocab_path Path to the vocabulary.
       --train_path Path to the training dataset.
       --dev_path Path to the validating dataset.
       --test_path Path to the testing dataset.
       --epochs_num The epoch numbers of fine-tuning.
       --batch_size Batch size.
       --distill_epochs_num The epoch numbers of the self-distillation.
       --encoder The type of encoder.
       --fast_mode Whether to enable the fast mode of FastBERT.
       --speed The Speed value in the paper.
       --output_model_path Path to the output model parameters.

Test results on the Book review dataset.

Test results at fine-tuning epoch 3 (Baseline): Acc.=0.8688;  FLOPs=21785247744;
Test results at self-distillation epoch 1     : Acc.=0.8698;  FLOPs=6300902177;
Test results at self-distillation epoch 2     : Acc.=0.8691;  FLOPs=5844839008;
Test results at self-distillation epoch 3     : Acc.=0.8664;  FLOPs=5170940850;
Test results at self-distillation epoch 4     : Acc.=0.8664;  FLOPs=5170940327;
Test results at self-distillation epoch 5     : Acc.=0.8664;  FLOPs=5170940327;

Quick start on the English Ag.news dataset

Download the pre-trained English BERT parameters from here, and save it to the models directory with the name of "English_uncased_base_model.bin".

Download the ag_news.zip from here, and then unzip it to the datasets directory.

Run the following command to validate our FastBERT with Speed=0.5 on the Ag.news datasets.

$ CUDA_VISIBLE_DEVICES="0" python3 -u run_fastbert.py \
        --pretrained_model_path ./models/English_uncased_base_model.bin \
        --vocab_path ./models/google_uncased_en_vocab.txt \
        --train_path ./datasets/ag_news/train.tsv \
        --dev_path ./datasets/ag_news/test.tsv \
        --test_path ./datasets/ag_news/test.tsv \
        --epochs_num 3 --batch_size 32 --distill_epochs_num 5 \
        --encoder bert --fast_mode --speed 0.5 \
        --output_model_path  ./models/ag_news_fastbert.bin

Test results on the Ag.news dataset.

Test results at fine-tuning epoch 3 (Baseline): Acc.=0.9447;  FLOPs=21785247744;
Test results at self-distillation epoch 1     : Acc.=0.9308;  FLOPs=2172009009;
Test results at self-distillation epoch 2     : Acc.=0.9311;  FLOPs=2163471246;
Test results at self-distillation epoch 3     : Acc.=0.9314;  FLOPs=2108341649;
Test results at self-distillation epoch 4     : Acc.=0.9314;  FLOPs=2108341649;
Test results at self-distillation epoch 5     : Acc.=0.9314;  FLOPs=2108341649;

Datasets

More datasets can be downloaded from here.

Other implementations

There are some other excellent implementations of FastBERT.

Acknowledgement

This work is funded by 2019 Tencent Rhino-Bird Elite Training Program. Work done while this author was an intern at Tencent.

If you use this code, please cite this paper:

@inproceedings{weijie2020fastbert,
  title={{FastBERT}: a Self-distilling BERT with Adaptive Inference Time},
  author={Weijie Liu, Peng Zhou, Zhe Zhao, Zhiruo Wang, Haotang Deng, Qi Ju},
  booktitle={Proceedings of ACL 2020},
  year={2020}
}
Using fully convolutional networks for semantic segmentation with caffe for the cityscapes dataset

Using fully convolutional networks for semantic segmentation (Shelhamer et al.) with caffe for the cityscapes dataset How to get started Download the

Simon Guist 27 Jun 06, 2022
UAV-Networks-Routing is a Python simulator for experimenting routing algorithms and mac protocols on unmanned aerial vehicle networks.

UAV-Networks Simulator - Autonomous Networking - A.A. 20/21 UAV-Networks-Routing is a Python simulator for experimenting routing algorithms and mac pr

0 Nov 13, 2021
Pytorch version of SfmLearner from Tinghui Zhou et al.

SfMLearner Pytorch version This codebase implements the system described in the paper: Unsupervised Learning of Depth and Ego-Motion from Video Tinghu

Clément Pinard 909 Dec 22, 2022
This script runs neural style transfer against the provided content image.

Neural Style Transfer Content Style Output Description: This script runs neural style transfer against the provided content image. The content image m

Martynas Subonis 0 Nov 25, 2021
Face Transformer for Recognition

Face-Transformer This is the code of Face Transformer for Recognition (https://arxiv.org/abs/2103.14803v2). Recently there has been great interests of

Zhong Yaoyao 153 Nov 30, 2022
A denoising diffusion probabilistic model synthesises galaxies that are qualitatively and physically indistinguishable from the real thing.

Realistic galaxy simulation via score-based generative models Official code for 'Realistic galaxy simulation via score-based generative models'. We us

Michael Smith 32 Dec 20, 2022
Stochastic Tensor Optimization for Robot Motion - A GPU Robot Motion Toolkit

STORM Stochastic Tensor Optimization for Robot Motion - A GPU Robot Motion Toolkit [Install Instructions] [Paper] [Website] This package contains code

NVIDIA Research Projects 101 Dec 12, 2022
DEEPAGÉ: Answering Questions in Portuguese about the Brazilian Environment

DEEPAGÉ: Answering Questions in Portuguese about the Brazilian Environment This repository is related to the paper DEEPAGÉ: Answering Questions in Por

0 Dec 10, 2021
A Library for Modelling Probabilistic Hierarchical Graphical Models in PyTorch

A Library for Modelling Probabilistic Hierarchical Graphical Models in PyTorch

Korbinian Pöppel 47 Nov 28, 2022
Non-Imaging Transient Reconstruction And TEmporal Search (NITRATES)

Non-Imaging Transient Reconstruction And TEmporal Search (NITRATES) This repo contains the full NITRATES pipeline for maximum likelihood-driven discov

13 Nov 08, 2022
This is the repository of our article published on MDPI Entropy "Feature Selection for Recommender Systems with Quantum Computing".

Collaborative-driven Quantum Feature Selection This repository was developed by Riccardo Nembrini, PhD student at Politecnico di Milano. See the websi

Quantum Computing Lab @ Politecnico di Milano 10 Apr 21, 2022
Code for the paper "Implicit Representations of Meaning in Neural Language Models"

Implicit Representations of Meaning in Neural Language Models Preliminaries Create and set up a conda environment as follows: conda create -n state-pr

Belinda Li 39 Nov 03, 2022
Torchlight2 lan game server tool - A message forwarding tool for Torchlight 2 lan game

Torchlight 2 Lan Game Server Tool A message forwarding tool for Torchlight 2 lan

Huaijun Jiang 3 Nov 01, 2022
⚾🤖⚾ Automatic baseball pitching overlay in realtime

⚾ Automatically overlaying pitch motion and trajectory with machine learning! This project takes your baseball pitching clips and automatically genera

Tony Chou 240 Dec 05, 2022
[Machine Learning Engineer Basic Guide] 부스트캠프 AI Tech - Product Serving 자료

Boostcamp-AI-Tech-Product-Serving 부스트캠프 AI Tech - Product Serving 자료 Repository 구조 part1(MLOps 개론, Model Serving, 머신러닝 프로젝트 라이프 사이클은 별도의 코드가 없으며, part

Sung Yun Byeon 269 Dec 21, 2022
U-Time: A Fully Convolutional Network for Time Series Segmentation

U-Time & U-Sleep Official implementation of The U-Time [1] model for general-purpose time-series segmentation. The U-Sleep [2] model for resilient hig

Mathias Perslev 176 Dec 19, 2022
Pytorch code for "State-only Imitation with Transition Dynamics Mismatch" (ICLR 2020)

This repo contains code for our paper State-only Imitation with Transition Dynamics Mismatch published at ICLR 2020. The code heavily uses the RL mach

20 Sep 08, 2022
A python library to artfully visualize Factorio Blueprints and an interactive web demo for using it.

Factorio Blueprint Visualizer I love the game Factorio and I really like the look of factories after growing for many hours or blueprints after tweaki

Piet Brömmel 124 Jan 07, 2023
RealFormer-Pytorch Implementation of RealFormer using pytorch

RealFormer-Pytorch Implementation of RealFormer using pytorch. Includes comparison with classical Transformer on image classification task (ViT) wrt C

Simo Ryu 90 Dec 08, 2022
Pytorch implementation of paper "Efficient Nearest Neighbor Language Models" (EMNLP 2021)

Pytorch implementation of paper "Efficient Nearest Neighbor Language Models" (EMNLP 2021)

Junxian He 57 Jan 01, 2023