DSEE: Dually Sparsity-embedded Efficient Tuning of Pre-trained Language Models

Last update: Dec 27, 2021

DSEE

Code for the preprint "DSEE: Dually Sparsity-embedded Efficient Tuning of Pre-trained Language Models".

Xuxi Chen, Tianlong Chen, Yu Cheng, Weizhu Chen, Zhangyang Wang, Ahmed Hassan Awadallah

Overview

TBD

Requirements

We use conda to create virtual environments.

conda env create -f environment.yml
conda activate dsee
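
Before running the later steps, it can help to confirm that the environment is actually active. The following is a minimal sketch, not part of the repository: `check_active_env` is a hypothetical helper, and it assumes the environment defined in environment.yml is named `dsee`, as the `conda activate dsee` line implies. It relies on the `CONDA_DEFAULT_ENV` variable that conda sets on activation.

```shell
# Hypothetical helper (not part of the DSEE repository).
# Assumption: the environment created from environment.yml is named "dsee".
check_active_env() {
    # conda exports CONDA_DEFAULT_ENV with the name of the active environment
    local expected="$1"
    if [ "${CONDA_DEFAULT_ENV:-}" = "$expected" ]; then
        echo "ok: '$expected' is active"
    else
        echo "not active: run 'conda activate $expected' first"
        return 1
    fi
}

# Report the status without aborting the shell session
check_active_env dsee || true
```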

Command

Unstructured DSEE

Step 0.

cd non-GPT-2
pip install -e .
cd ..

Step 1. Pre-training

Take SST-2 as an example:

OUTPUT_DIR='./sst2_rank16_s1_64'
num_gpus=4
python -m torch.distributed.launch \
    --nproc_per_node=$num_gpus \
    --master_port=12345 \
    non-GPT-2/examples/pytorch/text-classification/run_glue.py \
    --save_total_limit 10 \
    --model_name_or_path bert-base-uncased \
    --task_name sst2 \
    --output_dir ${OUTPUT_DIR} \
    --do_train \
    --do_eval \
    --num_train_epochs 3 \
    --save_steps 50 \
    --seed 1 \
    --per_device_train_batch_size 8 \
    --per_device_eval_batch_size 8 \
    --max_seq_length 128 \
    --overwrite_output_dir \
    --logging_steps 50 \
    --load_best_model_at_end True \
    --metric_for_best_model eval_accuracy \
    --apply_lora \
    --lora_r 16 \
    --apply_sparse \
    --num_sparse 64 \
    --learning_rate 2e-4 \
    --evaluation_strategy steps
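
The command above launches one process per GPU, so the global batch size depends on `num_gpus`. The sketch below works through that arithmetic for readers with a different GPU count. It assumes the modified `run_glue.py` keeps the stock Trainer behaviour, where each process consumes `per_device_train_batch_size` samples per step, and that gradient accumulation stays at its default of 1 because the flag is not passed above. `per_device_batch` is a hypothetical helper, not something the repository provides.

```shell
num_gpus=4                      # value passed to --nproc_per_node above
per_device_train_batch_size=8   # value passed to --per_device_train_batch_size above

# Assumption: gradient accumulation is 1 (the flag is not set in the command)
effective_batch_size=$((num_gpus * per_device_train_batch_size))
echo "effective batch size: ${effective_batch_size}"

# Hypothetical helper: per-device batch size that keeps a target global
# batch size on a different number of GPUs (integer division).
per_device_batch() {
    local target_global="$1" gpus="$2"
    echo $((target_global / gpus))
}

# For example, matching the same global batch size on 2 GPUs
per_device_batch "$effective_batch_size" 2
```

Whether results are sensitive to the global batch size is not stated in this README, so treat this only as a way to keep the setup comparable.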

Step 2. Pruning & Fine-tuning

OUTPUT_DIR='./sst2_rank16_s1_64_prune_0.5'
num_gpus=4
python -m torch.distributed.launch \
    --nproc_per_node=$num_gpus \
    --master_port=12335 \
    non-GPT-2/examples/pytorch/text-classification/run_glue_prune_tune.py \
    --save_total_limit 10 \
    --model_name_or_path sst2_rank16_s1_64 \
    --task_name sst2 \
    --output_dir ${OUTPUT_DIR} \
    --do_train \
    --do_eval \
    --num_train_epochs 3 \
    --save_steps 50 \
    --seed 1 \
    --per_device_train_batch_size 8 \
    --per_device_eval_batch_size 8 \
    --max_seq_length 128 \
    --overwrite_output_dir \
    --logging_steps 50 \
    --load_best_model_at_end True \
    --metric_for_best_model eval_accuracy \
    --apply_lora \
    --lora_r 16 \
    --apply_sparse \
    --num_sparse 64 \
    --learning_rate 2e-4 \
    --pruning_ratio 0.5 \
    --evaluation_strategy steps
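
Step 2 loads the model from the Step 1 output directory (`--model_name_or_path sst2_rank16_s1_64`), so the two steps are coupled through that path. A small sketch of how the directories could be chained and checked before launching Step 2 follows. The variable names `PRETRAINED_DIR` and `PRUNING_RATIO` and the `require_dir` helper are illustrative assumptions; only the directory names and the ratio come from the commands above.

```shell
PRETRAINED_DIR='./sst2_rank16_s1_64'   # OUTPUT_DIR used in Step 1
PRUNING_RATIO=0.5                      # value passed to --pruning_ratio
OUTPUT_DIR="${PRETRAINED_DIR}_prune_${PRUNING_RATIO}"

# Hypothetical guard: fail early if the Step 1 output is missing
require_dir() {
    if [ ! -d "$1" ]; then
        echo "missing: $1 (run Step 1 first)" >&2
        return 1
    fi
}

require_dir "$PRETRAINED_DIR" || echo "Step 1 output not found"
echo "Step 2 will write to: ${OUTPUT_DIR}"
```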

TODO

Code for Unstructured DSEE on GPT-2
Code for Structured DSEE

Acknowledgement

Hugging Face's Transformers (https://github.com/huggingface/transformers)

Owner

VITA
