Code for T-Few from "Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning"

Related tags

Deep Learningt-few
Overview

T-Few

This repository contains the official code for the paper: "Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning".

This method outperforms in-context learning with GPT-3 and achieves state-of-the-art on "RAFT".

Setup

First, create a virtual environment for the project and install all the requirments. (We use conda to manage environments. Be sure to install and initialize conda first.)

  1. Create a virtual environment with python 3.7 conda create -n tfew python==3.7, then activate the environment conda activate tfew.
  2. Install other dependencies. pip install -r requirements.txt -f https://download.pytorch.org/whl/cu113/torch_stable.html
  3. If you plan to run SAID, then install dependencies with python src/intrinsic_said_setup.py develop. Otherwise, skip this step.

The steps above only needs to be done once. In addition, every time you start a new session, you will need to run . bin/start.sh

Run your first experiment

Once you finished setting up the environment, you can try running CUDA_VISIBLE_DEVICES=3 python -m src.pl_train -c t0.json+rte.json -k save_model=False exp_name=first_exp The outputs of this run will be saved to ${OUTPUT_PATH}/first_exp/, which is usually /t-few/exp_out/first_exp/. Here, first_exp is the experiment name, you can run more experiments with different expeirment names. The code will automatically skip finished experiments. (However, if you wish to rerun a finished experiment under the same experiment name, you will need to manually remove the corresponding files in the output directory.)

There are two ways to control an experiment.

  1. You can specify config files with -c. Multiple config files can be combined with +. (When there are conflits, config terms from the config file on the right will have greater power.) This will be convinient when you have multiple terms that forms a fixed group.
  2. You can override values with -k. This will be convinient when you need to change a small number of terms.

It is recommended to use GPUs with 40GB to train T0(3B) and 80GB to train T0

Run an array of experiments

In this project, we often need to run a large number of experiments. Here is an example bash script bin/few-shot-pretrained-3b-100k.sh to fine-tune 3B pre-trained (IA)3 on all datasets.

This should take a few hours. After that, you can use scripts/get_results_table.py to generate a csv summary.

Citation

If you find this repo helpful, welcome to cite our work:

@article{liu2020tfew,
  title={Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning},
  author={Liu, Haokun and Tam, Derek and Muqeeth, Mohammed and Mohta, Jay and Huang, Tenghao and Bansal, Mohit and Raffel, Colin},
  journal={arXiv preprint arXiv:2205.05638},
  year={2022}
}

We use the following code in our works:

@article{mahabadi2021compacter,
  title={Compacter: Efficient low-rank hypercomplex adapter layers},
  author={Mahabadi, Rabeeh Karimi and Henderson, James and Ruder, Sebastian},
  journal={arXiv preprint arXiv:2106.04647},
  year={2021}
}

@article{sung2021training,
  title={Training Neural Networks with Fixed Sparse Masks},
  author={Sung, Yi-Lin and Nair, Varun and Raffel, Colin},
  journal={arXiv preprint arXiv:2111.09839},
  year={2021}
}

@article{aghajanyan2020intrinsic,
  title={Intrinsic dimensionality explains the effectiveness of language model fine-tuning},
  author={Aghajanyan, Armen and Zettlemoyer, Luke and Gupta, Sonal},
  journal={arXiv preprint arXiv:2012.13255},
  year={2020}
}
Greedy Gaussian Segmentation

GGS Greedy Gaussian Segmentation (GGS) is a Python solver for efficiently segmenting multivariate time series data. For implementation details, please

Stanford University Convex Optimization Group 72 Dec 07, 2022
This toolkit provides codes to download and pre-process the SLUE datasets, train the baseline models, and evaluate SLUE tasks.

slue-toolkit We introduce Spoken Language Understanding Evaluation (SLUE) benchmark. This toolkit provides codes to download and pre-process the SLUE

ASAPP Research 39 Sep 21, 2022
Official Pytorch implementation of "Learning to Estimate Robust 3D Human Mesh from In-the-Wild Crowded Scenes", CVPR 2022

Learning to Estimate Robust 3D Human Mesh from In-the-Wild Crowded Scenes / 3DCrowdNet News 💪 3DCrowdNet achieves the state-of-the-art accuracy on 3D

Hongsuk Choi 113 Dec 21, 2022
Complete the code of prefix-tuning in low data setting

Prefix Tuning Note: 作者在论文中提到使用真实的word去初始化prefix的操作(Initializing the prefix with activations of real words,significantly improves generation)。我在使用作者提供的

Andrew Zeng 4 Jul 11, 2022
Leveraging Two Types of Global Graph for Sequential Fashion Recommendation, ICMR 2021

This is the repo for the paper: Leveraging Two Types of Global Graph for Sequential Fashion Recommendation Requirements OS: Ubuntu 16.04 or higher ver

Yujuan Ding 10 Oct 10, 2022
Unofficial PyTorch Implementation for HifiFace (https://arxiv.org/abs/2106.09965)

HifiFace — Unofficial Pytorch Implementation Image source: HifiFace: 3D Shape and Semantic Prior Guided High Fidelity Face Swapping (figure 1, pg. 1)

MINDs Lab 218 Jan 04, 2023
University of Rochester 2021 Summer REU focusing on music sentiment transfer using CycleGAN

Music-Sentiment-Transfer University of Rochester 2021 Summer REU focusing on music sentiment transfer using CycleGAN Poster: Music Sentiment Transfer

Miles Sigel 2 Jan 24, 2022
A fast and easy to use, moddable, Python based Minecraft server!

PyMine PyMine - The fastest, easiest to use, Python-based Minecraft Server! Features Note: This list is not always up to date, and doesn't contain all

PyMine 144 Dec 30, 2022
Official Code Release for Container : Context Aggregation Network

Container: Context Aggregation Network Official Code Release for Container : Context Aggregation Network Comparion between CNN, MLP-Mixer and Transfor

peng gao 42 Nov 17, 2021
Distributionally robust neural networks for group shifts

Distributionally Robust Neural Networks for Group Shifts: On the Importance of Regularization for Worst-Case Generalization This code implements the g

151 Dec 25, 2022
Official code for "Towards An End-to-End Framework for Flow-Guided Video Inpainting" (CVPR2022)

E2FGVI (CVPR 2022) English | 简体中文 This repository contains the official implementation of the following paper: Towards An End-to-End Framework for Flo

Media Computing Group @ Nankai University 537 Jan 07, 2023
The repo contains the code of the ACL2020 paper `Dice Loss for Data-imbalanced NLP Tasks`

Dice Loss for NLP Tasks This repository contains code for Dice Loss for Data-imbalanced NLP Tasks at ACL2020. Setup Install Package Dependencies The c

223 Dec 17, 2022
A python-image-classification web application project, written in Python and served through the Flask Microframework

A python-image-classification web application project, written in Python and served through the Flask Microframework. This Project implements the VGG16 covolutional neural network, through Keras and

Gerald Maduabuchi 19 Dec 12, 2022
TensorFlow tutorials and best practices.

Effective TensorFlow 2 Table of Contents Part I: TensorFlow 2 Fundamentals TensorFlow 2 Basics Broadcasting the good and the ugly Take advantage of th

Vahid Kazemi 8.7k Dec 31, 2022
Autoregressive Predictive Coding: An unsupervised autoregressive model for speech representation learning

Autoregressive Predictive Coding This repository contains the official implementation (in PyTorch) of Autoregressive Predictive Coding (APC) proposed

iamyuanchung 173 Dec 18, 2022
Progressive Coordinate Transforms for Monocular 3D Object Detection

Progressive Coordinate Transforms for Monocular 3D Object Detection This repository is the official implementation of PCT. Introduction In this paper,

58 Nov 06, 2022
SEJE Pytorch implementation

SEJE is a prototype for the paper Learning Text-Image Joint Embedding for Efficient Cross-Modal Retrieval with Deep Feature Engineering. Contents Inst

0 Oct 21, 2021
Official codebase for Legged Robots that Keep on Learning: Fine-Tuning Locomotion Policies in the Real World

Legged Robots that Keep on Learning Official codebase for Legged Robots that Keep on Learning: Fine-Tuning Locomotion Policies in the Real World, whic

Laura Smith 70 Dec 07, 2022
Building blocks for uncertainty-aware cycle consistency presented at NeurIPS'21.

UncertaintyAwareCycleConsistency This repository provides the building blocks and the API for the work presented in the NeurIPS'21 paper Robustness vi

EML Tübingen 19 Dec 12, 2022
VLG-Net: Video-Language Graph Matching Networks for Video Grounding

VLG-Net: Video-Language Graph Matching Networks for Video Grounding Introduction Official repository for VLG-Net: Video-Language Graph Matching Networ

Mattia Soldan 25 Dec 04, 2022