MixText: Linguistically-Informed Interpolation of Hidden Space for Semi-Supervised Text Classification

Overview

MixText

This repo contains the code for the following paper:

Jiaao Chen, Zichao Yang, Diyi Yang: MixText: Linguistically-Informed Interpolation of Hidden Space for Semi-Supervised Text Classification. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL 2020).

If you would like to refer to it, please cite the paper mentioned above.

Getting Started

These instructions will get you running the MixText code.

Requirements

  • Python 3.6 or higher
  • PyTorch >= 1.3.0
  • Pytorch_transformers (also known as transformers)
  • Pandas, NumPy, Pickle
  • Fairseq

Code Structure

|__ data/
        |__ yahoo_answers_csv/ --> Datasets for Yahoo Answers
            |__ back_translate.ipynb --> Jupyter Notebook for back translating the dataset
            |__ classes.txt --> Classes for Yahoo Answers dataset
            |__ train.csv --> Original training dataset
            |__ test.csv --> Original testing dataset
            |__ de_1.pkl --> Back translated training dataset with German as middle language
            |__ ru_1.pkl --> Back translated training dataset with Russian as middle language

|__ code/
        |__ transformers/ --> Code copied from huggingface/transformers
        |__ read_data.py --> Code for reading the dataset; forming the labeled training set, unlabeled training set, development set and testing set; building dataloaders
        |__ normal_bert.py --> Code for the BERT baseline model
        |__ normal_train.py --> Code for training the BERT baseline model
        |__ mixtext.py --> Code for our proposed TMix/MixText model
        |__ train.py --> Code for training/testing TMix/MixText

Downloading the data

Please download the datasets and put them in the data folder. Yahoo Answers, AG News and DBpedia can be found here, and IMDB here.

Pre-processing the data

For Yahoo Answers, we concatenate the question title, question content and best answer to form the text to be classified. The pre-processed Yahoo Answers dataset can be downloaded here.

Note that for AG News and DBpedia, we use only the content (without titles) for classification, and for IMDB we do not perform any pre-processing.
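As an illustration of the concatenation step for Yahoo Answers, here is a minimal sketch. It assumes each CSV row holds the class index, question title, question content and best answer in that order; the helper name build_text and the sample row are made up for this example and are not part of the repo.

```python
import csv
import io


def build_text(title, content, answer):
    """Join the three fields with single spaces, skipping empty ones."""
    parts = [p.strip() for p in (title, content, answer) if p and p.strip()]
    return " ".join(parts)


# Assumed row layout: class index, question title, question content, best answer.
sample = io.StringIO('"5","why is the sky blue?","","Rayleigh scattering."\n')

rows = [(int(r[0]), build_text(r[1], r[2], r[3])) for r in csv.reader(sample)]
print(rows)
```

Skipping empty fields avoids double spaces when a question has no content, which is a choice made for this sketch rather than something the repo documents.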

We utilize Fairseq to perform back translation on the training dataset. Please refer to ./data/yahoo_answers_csv/back_translate.ipynb for details.

We have also put two examples of back translated data, de_1.pkl and ru_1.pkl, in ./data/yahoo_answers_csv/. You can use them directly for Yahoo Answers or generate your own back translated data by following ./data/yahoo_answers_csv/back_translate.ipynb.
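The back translated files are pickles, so they can be inspected with the standard pickle module. The sketch below assumes the file holds a mapping from training-row index to the back translated text; that layout is an assumption, so check it against the notebook before relying on it. The demo writes a small stand-in file instead of reading de_1.pkl.

```python
import os
import pickle
import tempfile


def load_back_translations(path):
    """Load a pickled object; assumed here to map row index -> back translated text."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Stand-in data for the demo (hypothetical content, not taken from de_1.pkl).
demo = {0: "Why is the sky blue?", 1: "How do I learn Python?"}

with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "de_demo.pkl")
    with open(demo_path, "wb") as f:
        pickle.dump(demo, f)
    loaded = load_back_translations(demo_path)

print(type(loaded).__name__, len(loaded))
```

Only unpickle files you trust, since pickle can execute arbitrary code on load.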

Training models

This section contains instructions for training models on Yahoo Answers using 10 labeled examples per class.
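To make "10 labeled data per class" concrete, the following is a rough sketch of splitting training indices into a small labeled set and an unlabeled remainder. The function name, the seed handling and the toy labels are illustrative assumptions; the actual split is done in read_data.py and may differ.

```python
import random
from collections import defaultdict


def split_labeled_unlabeled(labels, n_labeled_per_class, seed=0):
    """Pick n_labeled_per_class indices per class; the rest become unlabeled."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)

    labeled, unlabeled = [], []
    for y in sorted(by_class):
        idxs = by_class[y][:]
        rng.shuffle(idxs)
        labeled.extend(idxs[:n_labeled_per_class])
        unlabeled.extend(idxs[n_labeled_per_class:])
    return labeled, unlabeled


# Toy labels: 10 classes with 50 examples each.
labels = [i % 10 for i in range(500)]
labeled_idx, unlabeled_idx = split_labeled_unlabeled(labels, 10)
print(len(labeled_idx), len(unlabeled_idx))
```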

Training BERT baseline model

Please run ./code/normal_train.py to train the BERT baseline model (uses only labeled training data):

python ./code/normal_train.py --gpu 0,1 --n-labeled 10 --data-path ./data/yahoo_answers_csv/ \
--batch-size 8 --epochs 20 

Training TMix model

Please run ./code/train.py to train the TMix model (uses only labeled training data):

python ./code/train.py --gpu 0,1 --n-labeled 10 --data-path ./data/yahoo_answers_csv/ \
--batch-size 8 --batch-size-u 1 --epochs 50 --val-iteration 20 \
--lambda-u 0 --T 0.5 --alpha 16 --mix-layers-set 7 9 12 --separate-mix True 
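For intuition about --alpha and --mix-layers-set, here is a minimal sketch of the interpolation idea described in the paper: draw a mixing weight from Beta(alpha, alpha), pick one layer from the mix set, and linearly combine two hidden vectors (and their labels) with that weight. Plain Python lists stand in for tensors, and the function names are made up for this example; train.py may handle details such as --separate-mix differently.

```python
import random


def sample_mix(alpha, mix_layers_set, rng):
    """Draw a mixing weight and a layer at which to interpolate."""
    lam = rng.betavariate(alpha, alpha)
    lam = max(lam, 1.0 - lam)  # keep the mix closer to the first example
    layer = rng.choice(mix_layers_set)
    return lam, layer


def interpolate(a, b, lam):
    """Element-wise lam * a + (1 - lam) * b."""
    return [lam * x + (1.0 - lam) * y for x, y in zip(a, b)]


rng = random.Random(0)
lam, layer = sample_mix(16, [7, 9, 12], rng)

# Toy hidden vectors and one-hot labels for two examples.
mixed_hidden = interpolate([1.0, 2.0], [3.0, 4.0], lam)
mixed_label = interpolate([1.0, 0.0], [0.0, 1.0], lam)
print(layer, mixed_hidden, mixed_label)
```

With a large alpha such as 16, the Beta distribution concentrates near 0.5, so the two examples tend to be mixed in roughly equal proportions.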

Training MixText model

Please run ./code/train.py to train the MixText model (uses both labeled and unlabeled training data):

python ./code/train.py --gpu 0,1,2,3 --n-labeled 10 \
--data-path ./data/yahoo_answers_csv/ --batch-size 4 --batch-size-u 8 --epochs 20 --val-iteration 1000 \
--lambda-u 1 --T 0.5 --alpha 16 --mix-layers-set 7 9 12 \
--lrmain 0.000005 --lrlast 0.0005
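
The --T flag is a temperature. Assuming it controls the sharpening of guessed labels for unlabeled data, as in the paper, the effect can be sketched as follows: each probability is raised to the power 1/T and the result is renormalized, so T < 1 makes the distribution more peaked. The function below is an illustration, not the code used in train.py.

```python
def sharpen(probs, temperature):
    """Raise each probability to 1/temperature and renormalize."""
    powered = [p ** (1.0 / temperature) for p in probs]
    total = sum(powered)
    return [p / total for p in powered]


guess = [0.6, 0.3, 0.1]
sharpened = sharpen(guess, 0.5)
print(sharpened)
```

With T = 0.5 the probabilities are squared before renormalizing, so the most likely class gains mass at the expense of the others.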