Revisiting Self-Training for Few-Shot Learning of Language Model.

Last update: Nov 19, 2022

Related tags

Overview

SFLM

This is the implementation of the paper Revisiting Self-Training for Few-Shot Learning of Language Model. SFLM is short for self-training for few-shot learning of language model.

Requirements

To run our code, please install all the dependency packages by using the following command:

pip install -r requirements.txt

Preprocess

The original data can be found from LM-BFF. To generate data for the few-shot experiments, please run the below command:

python tools/generate_data.py

The original data shall be in ./data/original, and the sampled data will be in ./data/few-shot/$K-$MU-$SEED. Please refer to ./tools/generate_data.py for more options.

Train

Our code can be run as the below example:

python3 run.py \
  --task_name SST-2 \
  --data_dir data/few-shot/SST-2/16-4-100 \
  --do_train \
  --do_eval \
  --do_predict \
  --evaluate_during_training \
  --model_name_or_path roberta-base \
  --few_shot_type prompt-demo \
  --num_k 16 \
  --max_seq_length 256 \
  --per_device_train_batch_size 2 \
  --per_device_eval_batch_size 16 \
  --gradient_accumulation_steps 4 \
  --learning_rate 1e-5 \
  --max_steps 1000 \
  --logging_steps 100 \
  --eval_steps 100 \
  --num_train_epochs 0 \
  --output_dir result/SST-2-16-4-100 \
  --save_logit_dir result/SST-2-16-4-100 \
  --seed 100 \
  --template "*cls**sent_0*_It_was*mask*.*sep+*" \
  --mapping "{'0':'terrible','1':'great'}" \
  --num_sample 16 \
  --threshold 0.95 \
  --lam1 0.5 \
  --lam2 0.1

Most arguments are the same as LM-BFF, and the same manual prompts are used in our experiments. We list additional arguments used in SFLM:

threshold: The threshold used to filter out low-confidence samples for self-training loss
lam1: The weight of self-training loss
lam2: The weight of self-supervised loss

Citation

Please cite our paper if you use SFLM in your work:

@inproceedings{chen2021revisit,        
    title={Revisiting Self-Training for Few-Shot Learning of Language Model},         
    author={Chen, Yiming and Zhang, Yan and Zhang, Chen and Lee, Grandee and Cheng, Ran and Li, Haizhou},         
    booktitle={EMNLP},        
    year={2021},
}

Acknowledgements

Code is implemented based on LM-BFF. We would like to thank the authors of LM-BFF for making their code public.

Revisiting Self-Training for Few-Shot Learning of Language Model.

Related tags

Overview

SFLM

Requirements

Preprocess

Train

Citation

Acknowledgements

Owner

Unofficial TensorFlow implementation of Protein Interface Prediction using Graph Convolutional Networks.

Joint learning of images and text via maximization of mutual information

[CIKM 2021] Enhancing Aspect-Based Sentiment Analysis with Supervised Contrastive Learning

Specificity-preserving RGB-D Saliency Detection

Implementation for "Manga Filling Style Conversion with Screentone Variational Autoencoder" (SIGGRAPH ASIA 2020 issue)

Combining Reinforcement Learning and Constraint Programming for Combinatorial Optimization

Use MATLAB to simulate the signal and extract features. Use PyTorch to build and train deep network to do spectrum sensing.

Run PowerShell command without invoking powershell.exe

ZEBRA: Zero Evidence Biometric Recognition Assessment

This PyTorch package implements MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation (NAACL 2022).

Evaluating different engineering tricks that make RL work

Finetuning Pipeline

Arbitrary Distribution Modeling with Censorship in Real Time 59 2 60 3 Bidding Advertising for KDD'21

A comprehensive list of published machine learning applications to cosmology

Pytorch implementation of "Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech"

Minecraft agent to farm resources using reinforcement learning

The final project for "Applying AI to Wearable Device Data" course from "AI for Healthcare" - Udacity.

Block Sparse movement pruning

A PyTorch Implementation of Single Shot MultiBox Detector

MAVE: : A Product Dataset for Multi-source Attribute Value Extraction