Meta-learning for NLP

Related tags

Deep Learningmetanlp
Overview

Self-Supervised Meta-Learning for Few-Shot Natural Language Classification Tasks

Code for training the meta-learning models and fine-tuning on downstream tasks. If you use this code please cite the paper.

Paper: Self-Supervised Meta-Learning for Few-Shot Natural Language Classification Tasks

@inproceedings{bansal2020self,
  title={Self-Supervised Meta-Learning for Few-Shot Natural Language Classification Tasks},
  author={Bansal, Trapit and Jha, Rishikesh and Munkhdalai, Tsendsuren and McCallum, Andrew},
  booktitle={Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
  pages={522--534},
  year={2020}
}

Trained Models

Dependencies

  • Python version 3.6.6 or higher
  • Tensorflow version 1.12.0 (higher versions might not work)
  • Numpy 1.16.4 or higher
  • six 1.12.0

pip install -r requirements.txt should install required depedencies. It is recommended to use a conda environment and make sure to use the pip installed in the environment.

Fine-Tuning

A script is provided to run fine-tuning for a target task, by default it runs fine-tuning on CoNLL. The script will download all necessary data and models, note that in case downloads fail please download the files manually using the links.

Fine-tuning runs on a single GPU and typically takes a few minutes.

Run the script as: ./run_finetune.sh

Modify the following parameters in run_finetune.sh to run on a different task, or a different k-shot, or a different file split for the task:

  • TASK_NAME: should be one of: airline, conll, disaster, emotion, political_audience, political_bias, political_message, rating_books, rating_dvd, rating_electronics, rating_kitchen, restaurant, scitail, sentiment_books, sentiment_dvd, sentiment_electronics, sentiment_kitchen
  • DATA_DIR: path to data directory (eg., data/leopard-master/data/tf_record/${TASK_NAME})
  • F: file train split id, should be in [0, 9]
  • K: which k-shot experiment to run, should be in {4, 8, 16, 32}
  • N: number of classes in the task (see paper if not known)

So, the fine-tuning run command to run on a particular split for a task is: ./run_finetune.sh TASK_NAME F K N

To change the output directory or other arguments, edit the corresponding arguments in run_finetune.sh

Hyper-parameters for Hybrid-SMLMT

  • K = 4:
    --num_train_epochs=150*N
    --train_batch_size=4*N

  • K = 8:
    --num_train_epochs=175*N
    --train_batch_size=8*N

  • K = 16:
    --num_train_epochs=200*N
    --train_batch_size=4*N

  • K = 32:
    --num_train_epochs=100*N
    --train_batch_size=8*N

Data for fine-tuning

The data for the fine-tuning tasks can be downloaded from https://github.com/iesl/leopard

Fine-tuning on other tasks

To run fine-tuning on a different task than provided with the code, you will need to set up the train and test data for the task in a tf_record file, similar to the data for the provided tasks.

The features in the tf_record are:

name_to_features = {
      "input_ids": tf.FixedLenFeature([128], tf.int64),
      "input_mask": tf.FixedLenFeature([128], tf.int64),
      "segment_ids": tf.FixedLenFeature([128], tf.int64),
      "label_ids": tf.FixedLenFeature([], tf.int64),
  }

where:

  • input_ids: the input sequence tokenized using the BERT tokenizer
  • input_mask: mask of 0/1 corresponding to the input_ids
  • segment_ids: 0/1 segment ids following BERT
  • label_ids: classification label

Note that the above features are same as that used in the code of BERT fine-tuning for classification, so code in the BERT github repository can be used for creating the tf_record files.

The followiing arguments to run_classifier_pretrain.py need to be set:

  • task_eval_files: train_tf_record, eval_tf_record
    • where train_tf_record is the train file for the task and eval_tf_record is the test file
  • test_num_labels: number of classes in the task

LEOPARD Fine-tuning

Hyper-parameters for the LEOPARD model:

  • K = 4:
    --num_train_epochs=150*N
    --train_batch_size=2*N

  • K = 8:
    --num_train_epochs=200*N --train_batch_size=2*N

  • K = 16:
    --num_train_epochs=200*N --train_batch_size=4*N

  • K = 32:
    --num_train_epochs=50*N --train_batch_size=2*N

In addition, set the argument warp_layers=false for fine-tuning the LEOPARD model.

Meta-Training

This requires large training time and typically should be run on multiple GPU.

SMLMT data file name should begin with "meta_pretain" and end with the value of N for the tasks in that file (on file per N), for example "meta_pretrain_3.tf_record" for 3-way tasks. The training code will take train_batch_size many examples at a time starting from the beginning of the files (without shuffling) and treat that as one task for training.

Meta-training can be run using the following command:

python run_classifier_pretrain.py \
    --do_train=true \
    --task_train_files=${TRAIN_FILES} \
    --num_train_epochs=1 \
    --save_checkpoints_steps=5000 \
    --max_seq_length=128 \
    --task_eval_files=${TASK_EVAL_FILES} \
    --tasks_per_gpu=1 \
    --num_eval_tasks=1 \
    --num_gpus=4 \
    --learning_rate=1e-05 \
    --train_lr=1e-05 \
    --keep_prob=0.9 \
    --attention_probs_dropout_prob=0.1 \
    --hidden_dropout_prob=0.1 \
    --SGD_K=1 \
    --meta_batchsz=80 \
    --num_batches=8 \
    --train_batch_size=90 \
    --min_layer_with_grad=0 \
    --train_word_embeddings=true \
    --use_pooled_output=true \
    --output_layers=2 \
    --update_only_label_embedding=true \
    --use_euclidean_norm=false \
    --label_emb_size=256 \
    --stop_grad=true \
    --eval_batch_size=90 \
    --eval_examples_per_task=2000 \
    --is_meta_sgd=true \
    --data_sqrt_sampling=true \
    --deep_set_layers=0 \
    --activation_fn=tanh \
    --clip_lr=true \
    --inner_epochs=1 \
    --warp_layers=true \
    --min_inner_steps=5 \
    --average_query_every=3 \
    --weight_query_loss=true \
    --output_dir=${output_dir} \
    --pretrain_task_weight=0.5

References:

Code is based on the public repository: https://github.com/google-research/bert

Devlin, Jacob and Chang, Ming-Wei and Lee, Kenton and Toutanova, Kristina. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv preprint arXiv:1810.04805, 2018.

Owner
IESL
IESL
Multi Task Vision and Language

12-in-1: Multi-Task Vision and Language Representation Learning Please cite the following if you use this code. Code and pre-trained models for 12-in-

Facebook Research 712 Dec 19, 2022
Udacity Suse Cloud Native Foundations Scholarship Course Walkthrough

SUSE Cloud Native Foundations Scholarship Udacity is collaborating with SUSE, a global leader in true open source solutions, to empower developers and

Shivansh Srivastava 34 Oct 18, 2022
Worktory is a python library created with the single purpose of simplifying the inventory management of network automation scripts.

Worktory is a python library created with the single purpose of simplifying the inventory management of network automation scripts.

Renato Almeida de Oliveira 18 Aug 31, 2022
Official code repository for ICCV 2021 paper: Gravity-Aware Monocular 3D Human Object Reconstruction

GraviCap Official code repository for ICCV 2021 paper: Gravity-Aware Monocular 3D Human Object Reconstruction. Gravity-Aware Monocular 3D Human-Object

Rishabh Dabral 15 Dec 09, 2022
Background-Click Supervision for Temporal Action Localization

Background-Click Supervision for Temporal Action Localization This repository is the official implementation of BackTAL. In this work, we study the te

LeYang 221 Oct 09, 2022
Deep Learning Theory

Deep Learning Theory 整理了一些深度学习的理论相关内容,持续更新。 Overview Recent advances in deep learning theory 总结了目前深度学习理论研究的六个方向的一些结果,概述型,没做深入探讨(2021)。 1.1 complexity

fq 103 Jan 04, 2023
PyTorch implementation of "Simple and Deep Graph Convolutional Networks"

Simple and Deep Graph Convolutional Networks This repository contains a PyTorch implementation of "Simple and Deep Graph Convolutional Networks".(http

chenm 253 Dec 08, 2022
Generating Images with Recurrent Adversarial Networks

Generating Images with Recurrent Adversarial Networks Python (Theano) implementation of Generating Images with Recurrent Adversarial Networks code pro

Daniel Jiwoong Im 121 Sep 08, 2022
Pytorch implementation of PTNet for high-resolution and longitudinal infant MRI synthesis

Pyramid Transformer Net (PTNet) Project | Paper Pytorch implementation of PTNet for high-resolution and longitudinal infant MRI synthesis. PTNet: A Hi

Xuzhe Johnny Zhang 6 Jun 08, 2022
Open-source implementation of Google Vizier for hyper parameters tuning

Advisor Introduction Advisor is the hyper parameters tuning system for black box optimization. It is the open-source implementation of Google Vizier w

tobe 1.5k Jan 04, 2023
Code for KDD'20 "Generative Pre-Training of Graph Neural Networks"

GPT-GNN: Generative Pre-Training of Graph Neural Networks GPT-GNN is a pre-training framework to initialize GNNs by generative pre-training. It can be

Ziniu Hu 346 Dec 19, 2022
TFOD-MASKRCNN - Tensorflow MaskRCNN With Python

Tensorflow- MaskRCNN Steps git clone https://github.com/amalaj7/TFOD-MASKRCNN.gi

Amal Ajay 2 Jan 18, 2022
Codes for “A Deeply Supervised Attention Metric-Based Network and an Open Aerial Image Dataset for Remote Sensing Change Detection”

DSAMNet The pytorch implementation for "A Deeply-supervised Attention Metric-based Network and an Open Aerial Image Dataset for Remote Sensing Change

Mengxi Liu 41 Dec 14, 2022
Predict Breast Cancer Wisconsin (Diagnostic) using Naive Bayes

Naive-Bayes Predict Breast Cancer Wisconsin (Diagnostic) using Naive Bayes Downloading Data Set Use our Breast Cancer Wisconsin Data Set Also you can

Faeze Habibi 0 Apr 06, 2022
iris - Open Source Photos Platform Powered by PyTorch

Open Source Photos Platform Powered by PyTorch. Submission for PyTorch Annual Hackathon 2021.

Omkar Prabhu 137 Sep 10, 2022
An SE(3)-invariant autoencoder for generating the periodic structure of materials

Crystal Diffusion Variational AutoEncoder This software implementes Crystal Diffusion Variational AutoEncoder (CDVAE), which generates the periodic st

Tian Xie 94 Dec 10, 2022
Meaningful titles for tabs and PDF downloads! Also supports tab search.

arxiv-utils If you are a researcher that reads a lot on ArXiv, you'll benefit a lot from this web extension. Renames the title of PDF page to the pape

Johnson 174 Dec 20, 2022
Spatio-Temporal Entropy Model (STEM) for end-to-end leaned video compression.

Spatio-Temporal Entropy Model A Pytorch Reproduction of Spatio-Temporal Entropy Model (STEM) for end-to-end leaned video compression. More details can

16 Nov 28, 2022
Official implementation of the paper Label-Efficient Semantic Segmentation with Diffusion Models

Label-Efficient Semantic Segmentation with Diffusion Models Official implementation of the paper Label-Efficient Semantic Segmentation with Diffusion

Yandex Research 355 Jan 06, 2023
PyTorch Personal Trainer: My framework for deep learning experiments

Alex's PyTorch Personal Trainer (ptpt) (name subject to change) This repository contains my personal lightweight framework for deep learning projects

Alex McKinney 8 Jul 14, 2022