The official implementation of You Only Compress Once: Towards Effective and Elastic BERT Compression via Exploit-Explore Stochastic Nature Gradient.

Last update: Dec 07, 2022

You Only Compress Once: Towards Effective and Elastic BERT Compression via Exploit-Explore Stochastic Nature Gradient (paper)

@misc{zhang2021compress,
      title={You Only Compress Once: Towards Effective and Elastic BERT Compression via Exploit-Explore Stochastic Nature Gradient}, 
      author={Shaokun Zhang and Xiawu Zheng and Chenyi Yang and Yuchao Li and Yan Wang and Fei Chao and Mengdi Wang and Shen Li and Jun Yang and Rongrong Ji},
      year={2021},
      eprint={2106.02435},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
      }

Overview

This repository is the official implementation of You Only Compress Once: Towards Effective and Elastic BERT Compression via Exploit-Explore Stochastic Nature Gradient.

📋 We propose a novel approach, YOCO-BERT, that compresses once and deploys everywhere. Compared with state-of-the-art algorithms, YOCO-BERT provides more compact models while achieving a higher average accuracy on GLUE.

Requirements

Python > 3.6
PyTorch = 1.7.0
transformers = 3.5.0
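
Since two of the three requirements are exact pins, it can help to check an environment before training. The following is a minimal sketch, not part of the repository: the helper names `parse_version`, `find_mismatches`, and `installed_versions` are hypothetical, and it reads "Python > 3.6" as "any version tuple greater than (3, 6)", which is an assumption about the intended meaning.

```python
import re

# Pins copied from the Requirements list above.
MIN_PYTHON = (3, 6)
PINNED = {"torch": (1, 7, 0), "transformers": (3, 5, 0)}


def parse_version(text):
    """Turn '1.7.0+cu110' into (1, 7, 0); local build suffixes are ignored."""
    public = text.split("+")[0]
    return tuple(int(n) for n in re.findall(r"\d+", public)[:3])


def find_mismatches(python_version, installed):
    """Return human-readable problems; an empty list means the pins are met.

    `installed` maps a package name to its version string, or None if absent.
    """
    problems = []
    # Assumption: (3, 6, x) counts as "> 3.6" because the tuple is longer.
    if not tuple(python_version) > MIN_PYTHON:
        problems.append("Python must be newer than 3.6")
    for name, wanted in PINNED.items():
        found = installed.get(name)
        if found is None:
            problems.append(f"{name} is not installed")
        elif parse_version(found) != wanted:
            expected = ".".join(str(n) for n in wanted)
            problems.append(f"{name} is {found}, expected {expected}")
    return problems


def installed_versions():
    """Look up versions by importing each pinned package (hypothetical helper)."""
    versions = {}
    for name in PINNED:
        try:
            versions[name] = __import__(name).__version__
        except ImportError:
            versions[name] = None
    return versions
```

Calling `find_mismatches(sys.version_info[:3], installed_versions())` after `import sys` returns an empty list when the environment matches the pins.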

Training

To train the super-BERTs in the paper, run this command:

python train_superbert.py --cfg /path_to_superbert_training_config/config.yaml

Searching

To search for the optimal sub-BERTs under any of the constraints in the paper, run this command:

python search_subbert.py --cfg /path_to_subbert_searching_config/config.yaml
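
To make the idea of a constraint concrete, the sketch below counts the parameters of a standard BERT encoder and tests a candidate against a budget. It is only an illustration: the repository's search space and its way of counting parameters are not described here, so the function names, the candidate dictionaries, and the reading of "66M" as 66,000,000 parameters are assumptions. It is a plain feasibility filter, not the exploit-explore stochastic natural gradient search from the paper.

```python
def bert_param_count(num_layers, hidden, intermediate,
                     vocab_size=30522, max_positions=512, type_vocab=2):
    """Count weights and biases of a standard BERT encoder with a pooler."""
    layer_norm = 2 * hidden  # scale and shift
    embeddings = (vocab_size + max_positions + type_vocab) * hidden + layer_norm
    # Query, key, value, and output projections, followed by a LayerNorm.
    attention = 4 * (hidden * hidden + hidden) + layer_norm
    # Two dense layers with biases, followed by a LayerNorm.
    feed_forward = (hidden * intermediate + intermediate
                    + intermediate * hidden + hidden + layer_norm)
    pooler = hidden * hidden + hidden
    return embeddings + num_layers * (attention + feed_forward) + pooler


def within_budget(candidate, max_params):
    """True if the candidate architecture fits the parameter budget."""
    return bert_param_count(**candidate) <= max_params


# Hypothetical candidates; the real search space may differ.
candidates = [
    {"num_layers": 6, "hidden": 768, "intermediate": 3072},
    {"num_layers": 6, "hidden": 768, "intermediate": 2560},
]
feasible = [c for c in candidates if within_budget(c, 66_000_000)]
```

With the default BERT-base vocabulary, `bert_param_count(12, 768, 3072)` gives 109,482,240, which matches the usual figure for a BERT-base encoder, so the formula is at least consistent with the uncompressed model.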

Evaluation

The evaluation results are reported at the end of the search process.
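
The training and searching commands can be chained from one driver script. This is a minimal sketch under the assumption that both scripts are run from the repository root; `build_commands` and `run_pipeline` are hypothetical helpers, and the config paths are whatever you pass in.

```python
import subprocess
import sys


def build_commands(train_cfg, search_cfg, python=sys.executable):
    """Return the two command lines from this README, in execution order."""
    return [
        [python, "train_superbert.py", "--cfg", train_cfg],
        [python, "search_subbert.py", "--cfg", search_cfg],
    ]


def run_pipeline(train_cfg, search_cfg):
    """Run training, then searching; stop at the first failing step."""
    for command in build_commands(train_cfg, search_cfg):
        print("running:", " ".join(command))
        # check=True raises CalledProcessError if the script exits non-zero.
        subprocess.run(command, check=True)
```

Because searching depends on a trained super-BERT, stopping at the first failure avoids launching a search against a missing or partial checkpoint.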

Config

We release all the training and searching configs in config.

Results

Our model achieves the following performance on:

GLUE

Results given various FLOPs and parameter counts.

Results under common constraints (compressed to no more than 66M parameters)

Datasets	SST-2	MRPC	CoLA	RTE	MNLI	QQP	QNLI
Results	92.8	90.3	59.8	72.9	82.6	90.5	87.2

📋 The detailed metrics used in this code are reported in the paper.
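
For a single summary number, the seven scores in the table can be averaged. This is a rough sketch: the tasks use different metrics, so the unweighted mean is only an informal summary and may not match any average reported in the paper.

```python
# Scores copied from the results table above.
glue_scores = {
    "SST-2": 92.8,
    "MRPC": 90.3,
    "CoLA": 59.8,
    "RTE": 72.9,
    "MNLI": 82.6,
    "QQP": 90.5,
    "QNLI": 87.2,
}


def mean_score(scores):
    """Unweighted mean over all tasks."""
    return sum(scores.values()) / len(scores)


print(f"mean over {len(glue_scores)} tasks: {mean_score(glue_scores):.1f}")
# prints "mean over 7 tasks: 82.3"
```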

License

This repository is released under the MIT license. See LICENSE for more information.

Contact

For any problem regarding this code, feel free to contact the first author: [email protected]
