Meta-learning for NLP

Related tags

Deep Learningmetanlp
Overview

Self-Supervised Meta-Learning for Few-Shot Natural Language Classification Tasks

Code for training the meta-learning models and fine-tuning on downstream tasks. If you use this code please cite the paper.

Paper: Self-Supervised Meta-Learning for Few-Shot Natural Language Classification Tasks

@inproceedings{bansal2020self,
  title={Self-Supervised Meta-Learning for Few-Shot Natural Language Classification Tasks},
  author={Bansal, Trapit and Jha, Rishikesh and Munkhdalai, Tsendsuren and McCallum, Andrew},
  booktitle={Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
  pages={522--534},
  year={2020}
}

Trained Models

Dependencies

  • Python version 3.6.6 or higher
  • Tensorflow version 1.12.0 (higher versions might not work)
  • Numpy 1.16.4 or higher
  • six 1.12.0

pip install -r requirements.txt should install required depedencies. It is recommended to use a conda environment and make sure to use the pip installed in the environment.

Fine-Tuning

A script is provided to run fine-tuning for a target task, by default it runs fine-tuning on CoNLL. The script will download all necessary data and models, note that in case downloads fail please download the files manually using the links.

Fine-tuning runs on a single GPU and typically takes a few minutes.

Run the script as: ./run_finetune.sh

Modify the following parameters in run_finetune.sh to run on a different task, or a different k-shot, or a different file split for the task:

  • TASK_NAME: should be one of: airline, conll, disaster, emotion, political_audience, political_bias, political_message, rating_books, rating_dvd, rating_electronics, rating_kitchen, restaurant, scitail, sentiment_books, sentiment_dvd, sentiment_electronics, sentiment_kitchen
  • DATA_DIR: path to data directory (eg., data/leopard-master/data/tf_record/${TASK_NAME})
  • F: file train split id, should be in [0, 9]
  • K: which k-shot experiment to run, should be in {4, 8, 16, 32}
  • N: number of classes in the task (see paper if not known)

So, the fine-tuning run command to run on a particular split for a task is: ./run_finetune.sh TASK_NAME F K N

To change the output directory or other arguments, edit the corresponding arguments in run_finetune.sh

Hyper-parameters for Hybrid-SMLMT

  • K = 4:
    --num_train_epochs=150*N
    --train_batch_size=4*N

  • K = 8:
    --num_train_epochs=175*N
    --train_batch_size=8*N

  • K = 16:
    --num_train_epochs=200*N
    --train_batch_size=4*N

  • K = 32:
    --num_train_epochs=100*N
    --train_batch_size=8*N

Data for fine-tuning

The data for the fine-tuning tasks can be downloaded from https://github.com/iesl/leopard

Fine-tuning on other tasks

To run fine-tuning on a different task than provided with the code, you will need to set up the train and test data for the task in a tf_record file, similar to the data for the provided tasks.

The features in the tf_record are:

name_to_features = {
      "input_ids": tf.FixedLenFeature([128], tf.int64),
      "input_mask": tf.FixedLenFeature([128], tf.int64),
      "segment_ids": tf.FixedLenFeature([128], tf.int64),
      "label_ids": tf.FixedLenFeature([], tf.int64),
  }

where:

  • input_ids: the input sequence tokenized using the BERT tokenizer
  • input_mask: mask of 0/1 corresponding to the input_ids
  • segment_ids: 0/1 segment ids following BERT
  • label_ids: classification label

Note that the above features are same as that used in the code of BERT fine-tuning for classification, so code in the BERT github repository can be used for creating the tf_record files.

The followiing arguments to run_classifier_pretrain.py need to be set:

  • task_eval_files: train_tf_record, eval_tf_record
    • where train_tf_record is the train file for the task and eval_tf_record is the test file
  • test_num_labels: number of classes in the task

LEOPARD Fine-tuning

Hyper-parameters for the LEOPARD model:

  • K = 4:
    --num_train_epochs=150*N
    --train_batch_size=2*N

  • K = 8:
    --num_train_epochs=200*N --train_batch_size=2*N

  • K = 16:
    --num_train_epochs=200*N --train_batch_size=4*N

  • K = 32:
    --num_train_epochs=50*N --train_batch_size=2*N

In addition, set the argument warp_layers=false for fine-tuning the LEOPARD model.

Meta-Training

This requires large training time and typically should be run on multiple GPU.

SMLMT data file name should begin with "meta_pretain" and end with the value of N for the tasks in that file (on file per N), for example "meta_pretrain_3.tf_record" for 3-way tasks. The training code will take train_batch_size many examples at a time starting from the beginning of the files (without shuffling) and treat that as one task for training.

Meta-training can be run using the following command:

python run_classifier_pretrain.py \
    --do_train=true \
    --task_train_files=${TRAIN_FILES} \
    --num_train_epochs=1 \
    --save_checkpoints_steps=5000 \
    --max_seq_length=128 \
    --task_eval_files=${TASK_EVAL_FILES} \
    --tasks_per_gpu=1 \
    --num_eval_tasks=1 \
    --num_gpus=4 \
    --learning_rate=1e-05 \
    --train_lr=1e-05 \
    --keep_prob=0.9 \
    --attention_probs_dropout_prob=0.1 \
    --hidden_dropout_prob=0.1 \
    --SGD_K=1 \
    --meta_batchsz=80 \
    --num_batches=8 \
    --train_batch_size=90 \
    --min_layer_with_grad=0 \
    --train_word_embeddings=true \
    --use_pooled_output=true \
    --output_layers=2 \
    --update_only_label_embedding=true \
    --use_euclidean_norm=false \
    --label_emb_size=256 \
    --stop_grad=true \
    --eval_batch_size=90 \
    --eval_examples_per_task=2000 \
    --is_meta_sgd=true \
    --data_sqrt_sampling=true \
    --deep_set_layers=0 \
    --activation_fn=tanh \
    --clip_lr=true \
    --inner_epochs=1 \
    --warp_layers=true \
    --min_inner_steps=5 \
    --average_query_every=3 \
    --weight_query_loss=true \
    --output_dir=${output_dir} \
    --pretrain_task_weight=0.5

References:

Code is based on the public repository: https://github.com/google-research/bert

Devlin, Jacob and Chang, Ming-Wei and Lee, Kenton and Toutanova, Kristina. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv preprint arXiv:1810.04805, 2018.

Owner
IESL
IESL
Bringing Computer Vision and Flutter together , to build an awesome app !!

Bringing Computer Vision and Flutter together , to build an awesome app !! Explore the Directories Flutter · Machine Learning Table of Contents About

Padmanabha Banerjee 14 Apr 07, 2022
Official repo for BMVC2021 paper ASFormer: Transformer for Action Segmentation

ASFormer: Transformer for Action Segmentation This repo provides training & inference code for BMVC 2021 paper: ASFormer: Transformer for Action Segme

42 Dec 23, 2022
StyleGAN2-ADA - Official PyTorch implementation

Abstract: Training generative adversarial networks (GAN) using too little data typically leads to discriminator overfitting, causing training to diverge. We propose an adaptive discriminator augmenta

NVIDIA Research Projects 3.2k Dec 30, 2022
BC3407-Group-5-Project - BC3407 Group Project With Python

BC3407-Group-5-Project As the world struggles to contain the ever-changing varia

1 Jan 26, 2022
Group Activity Recognition with Clustered Spatial Temporal Transformer

GroupFormer Group Activity Recognition with Clustered Spatial-TemporalTransformer Backbone Style Action Acc Activity Acc Config Download Inv3+flow+pos

28 Dec 12, 2022
Learning cell communication from spatial graphs of cells

ncem Features Repository for the manuscript Fischer, D. S., Schaar, A. C. and Theis, F. Learning cell communication from spatial graphs of cells. 2021

Theis Lab 77 Dec 30, 2022
A High-Level Fusion Scheme for Circular Quantities published at the 20th International Conference on Advanced Robotics

Monte Carlo Simulation to the Paper A High-Level Fusion Scheme for Circular Quantities published at the 20th International Conference on Advanced Robotics

Sören Kohnert 0 Dec 06, 2021
Rethinking Nearest Neighbors for Visual Classification

Rethinking Nearest Neighbors for Visual Classification arXiv Environment settings Check out scripts/env_setup.sh Setup data Download the following fin

Menglin Jia 29 Oct 11, 2022
SAGE: Sensitivity-guided Adaptive Learning Rate for Transformers

SAGE: Sensitivity-guided Adaptive Learning Rate for Transformers This repo contains our codes for the paper "No Parameters Left Behind: Sensitivity Gu

Chen Liang 23 Nov 07, 2022
Part-Aware Data Augmentation for 3D Object Detection in Point Cloud

Part-Aware Data Augmentation for 3D Object Detection in Point Cloud This repository contains a reference implementation of our Part-Aware Data Augment

Jaeseok Choi 62 Jan 03, 2023
you can add any codes in any language by creating its respective folder (if already not available).

HACKTOBERFEST-2021-WEB-DEV Beginner-Hacktoberfest Need Your first pr for hacktoberfest 2k21 ? come on in About This is repository of Responsive Portfo

Suman Sharma 8 Oct 17, 2022
This repository stores the code to reproduce the results published in "TiWS-iForest: Isolation Forest in Weakly Supervised and Tiny ML scenarios"

TinyWeaklyIsolationForest This repository stores the code to reproduce the results published in "TiWS-iForest: Isolation Forest in Weakly Supervised a

2 Mar 21, 2022
Learning Time-Critical Responses for Interactive Character Control

Learning Time-Critical Responses for Interactive Character Control Abstract This code implements the paper Learning Time-Critical Responses for Intera

Movement Research Lab 227 Dec 31, 2022
Code for all the Advent of Code'21 challenges mostly written in python

Advent of Code 21 Code for all the Advent of Code'21 challenges mostly written in python. They are not necessarily the best or fastest solutions but j

4 May 26, 2022
Pretrained Pytorch face detection (MTCNN) and recognition (InceptionResnet) models

Face Recognition Using Pytorch Python 3.7 3.6 3.5 Status This is a repository for Inception Resnet (V1) models in pytorch, pretrained on VGGFace2 and

Tim Esler 3.3k Jan 04, 2023
Kaggle Ultrasound Nerve Segmentation competition [Keras]

Ultrasound nerve segmentation using Keras (1.0.7) Kaggle Ultrasound Nerve Segmentation competition [Keras] #Install (Ubuntu {14,16}, GPU) cuDNN requir

179 Dec 28, 2022
Technical Analysis library in pandas for backtesting algotrading and quantitative analysis

bta-lib - A pandas based Technical Analysis Library bta-lib is pandas based technical analysis library and part of the backtrader family. Links Main P

DRo 393 Dec 20, 2022
adversarial_multi_armed_bandit_variable_plays

Adversarial Multi-Armed Bandit with Variable Plays This code is for paper: Adversarial Online Learning with Variable Plays in the Evasion-and-Pursuit

Yiyang Wang 1 Oct 28, 2021
The personal repository of the work: *DanceNet3D: Music Based Dance Generation with Parametric Motion Transformer*.

DanceNet3D The personal repository of the work: DanceNet3D: Music Based Dance Generation with Parametric Motion Transformer. Dataset and Results Pleas

南嘉Nanga 36 Dec 21, 2022
A Python package to process & model ChEMBL data.

insilico: A Python package to process & model ChEMBL data. ChEMBL is a manually curated chemical database of bioactive molecules with drug-like proper

Steven Newton 0 Dec 09, 2021