MixText: Linguistically-Informed Interpolation of Hidden Space for Semi-Supervised Text Classification

Overview

MixText

This repo contains codes for the following paper:

Jiaao Chen, Zichao Yang, Diyi Yang: MixText: Linguistically-Informed Interpolation of Hidden Space for Semi-Supervised Text Classification. In Proceedings of the 58th Annual Meeting of the Association of Computational Linguistics (ACL'2020)

If you would like to refer to it, please cite the paper mentioned above.

Getting Started

These instructions will get you running the codes of MixText.

Requirements

  • Python 3.6 or higher
  • Pytorch >= 1.3.0
  • Pytorch_transformers (also known as transformers)
  • Pandas, Numpy, Pickle
  • Fairseq

Code Structure

|__ data/
        |__ yahoo_answers_csv/ --> Datasets for Yahoo Answers
            |__ back_translate.ipynb --> Jupyter Notebook for back translating the dataset
            |__ classes.txt --> Classes for Yahoo Answers dataset
            |__ train.csv --> Original training dataset
            |__ test.csv --> Original testing dataset
            |__ de_1.pkl --> Back translated training dataset with German as middle language
            |__ ru_1.pkl --> Back translated training dataset with Russian as middle language

|__code/
        |__ transformers/ --> Codes copied from huggingface/transformers
        |__ read_data.py --> Codes for reading the dataset; forming labeled training set, unlabeled training set, development set and testing set; building dataloaders
        |__ normal_bert.py --> Codes for BERT baseline model
        |__ normal_train.py --> Codes for training BERT baseline model
        |__ mixtext.py --> Codes for our proposed TMix/MixText model
        |__ train.py --> Codes for training/testing TMix/MixText 

Downloading the data

Please download the dataset and put them in the data folder. You can find Yahoo Answers, AG News, DB Pedia here, IMDB here.

Pre-processing the data

For Yahoo Answer, We concatenate the question title, question content and best answer together to form the text to be classified. The pre-processed Yahoo Answer dataset can be downloaded here.

Note that for AG News and DB Pedia, we only utilize the content (without titles) to do the classifications, and for IMDB we do not perform any pre-processing.

We utilize Fairseq to perform back translation on the training dataset. Please refer to ./data/yahoo_answers_csv/back_translate.ipynb for details.

Here, we have put two examples of back translated data, de_1.pkl and ru_1.pkl, in ./data/yahoo_answers_csv/ as well. You can directly use them for Yahoo Answers or generate your own back translated data followed the ./data/yahoo_answers_csv/back_translate.ipynb.

Training models

These section contains instructions for training models on Yahoo Answers using 10 labeled data per class for training.

Training BERT baseline model

Please run ./code/normal_train.py to train the BERT baseline model (only use labeled training data):

python ./code/normal_train.py --gpu 0,1 --n-labeled 10 --data-path ./data/yahoo_answers_csv/ \
--batch-size 8 --epochs 20 

Training TMix model

Please run ./code/train.py to train the TMix model (only use labeled training data):

python ./code/train.py --gpu 0,1 --n-labeled 10 --data-path ./data/yahoo_answers_csv/ \
--batch-size 8 --batch-size-u 1 --epochs 50 --val-iteration 20 \
--lambda-u 0 --T 0.5 --alpha 16 --mix-layers-set 7 9 12 --separate-mix True 

Training MixText model

Please run ./code/train.py to train the MixText model (use both labeled and unlabeled training data):

python ./code/train.py --gpu 0,1,2,3 --n-labeled 10 \
--data-path ./data/yahoo_answers_csv/ --batch-size 4 --batch-size-u 8 --epochs 20 --val-iteration 1000 \
--lambda-u 1 --T 0.5 --alpha 16 --mix-layers-set 7 9 12 \
--lrmain 0.000005 --lrlast 0.0005
Owner
GT-SALT
Social and Language Technologies Lab
GT-SALT
Repository of Vision Transformer with Deformable Attention

Vision Transformer with Deformable Attention This repository contains the code for the paper Vision Transformer with Deformable Attention [arXiv]. Int

410 Jan 03, 2023
CodeContests is a competitive programming dataset for machine-learning

CodeContests CodeContests is a competitive programming dataset for machine-learning. This dataset was used when training AlphaCode. It consists of pro

DeepMind 1.6k Jan 08, 2023
CUda Matrix Multiply library.

cumm CUda Matrix Multiply library. cumm is developed during learning of CUTLASS, which use too much c++ template and make code unmaintainable. So I de

49 Dec 27, 2022
[AAAI 2022] Separate Contrastive Learning for Organs-at-Risk and Gross-Tumor-Volume Segmentation with Limited Annotation

A paper Introduction This is an official release of the paper Separate Contrastive Learning for Organs-at-Risk and Gross-Tumor-Volume Segmentation wit

Jiacheng Wang 14 Dec 08, 2022
Group Fisher Pruning for Practical Network Compression(ICML2021)

Group Fisher Pruning for Practical Network Compression (ICML2021) By Liyang Liu*, Shilong Zhang*, Zhanghui Kuang, Jing-Hao Xue, Aojun Zhou, Xinjiang W

Shilong Zhang 129 Dec 13, 2022
Implementation of UNet on the Joey ML framework

Independent Research Project - Code Joey can be cloned from here https://github.com/devitocodes/joey/. Devito and other dependencies such as PyTorch a

Navjot Kukreja 1 Oct 21, 2021
Employs neural networks to classify images into four categories: ship, automobile, dog or frog

Neural Net Image Classifier Employs neural networks to classify images into four categories: ship, automobile, dog or frog Viterbi_1.py uses a classic

Riley Baker 1 Jan 18, 2022
PyTorch implementation of "MLP-Mixer: An all-MLP Architecture for Vision" Tolstikhin et al. (2021)

mlp-mixer-pytorch PyTorch implementation of "MLP-Mixer: An all-MLP Architecture for Vision" Tolstikhin et al. (2021) Usage import torch from mlp_mixer

isaac 27 Jul 09, 2022
Little Ball of Fur - A graph sampling extension library for NetworKit and NetworkX (CIKM 2020)

Little Ball of Fur is a graph sampling extension library for Python. Please look at the Documentation, relevant Paper, Promo video and External Resour

Benedek Rozemberczki 619 Dec 14, 2022
Gesture-Volume-Control - This Python program can adjust the system's volume by using hand gestures

Gesture-Volume-Control This Python program can adjust the system's volume by usi

VatsalAryanBhatanagar 1 Dec 30, 2021
SSD: Single Shot MultiBox Detector pytorch implementation focusing on simplicity

SSD: Single Shot MultiBox Detector Introduction Here is my pytorch implementation of 2 models: SSD-Resnet50 and SSDLite-MobilenetV2.

Viet Nguyen 149 Jan 07, 2023
Code for the paper Task Agnostic Morphology Evolution.

Task-Agnostic Morphology Optimization This repository contains code for the paper Task-Agnostic Morphology Evolution by Donald (Joey) Hejna, Pieter Ab

Joey Hejna 18 Aug 04, 2022
Real-time Joint Semantic Reasoning for Autonomous Driving

MultiNet MultiNet is able to jointly perform road segmentation, car detection and street classification. The model achieves real-time speed and state-

Marvin Teichmann 518 Dec 12, 2022
Rapid experimentation and scaling of deep learning models on molecular and crystal graphs.

LitMatter A template for rapid experimentation and scaling deep learning models on molecular and crystal graphs. How to use Clone this repository and

Nathan Frey 32 Dec 06, 2022
Official code for MPG2: Multi-attribute Pizza Generator: Cross-domain Attribute Control with Conditional StyleGAN

This is the official code for Multi-attribute Pizza Generator (MPG2): Cross-domain Attribute Control with Conditional StyleGAN. Paper Demo Setup Envir

Fangda Han 5 Sep 01, 2022
This repository is an official implementation of the paper MOTR: End-to-End Multiple-Object Tracking with TRansformer.

MOTR: End-to-End Multiple-Object Tracking with TRansformer This repository is an official implementation of the paper MOTR: End-to-End Multiple-Object

348 Jan 07, 2023
Nvidia Semantic Segmentation monorepo

Paper | YouTube | Cityscapes Score Pytorch implementation of our paper Hierarchical Multi-Scale Attention for Semantic Segmentation. Please refer to t

NVIDIA Corporation 1.6k Jan 04, 2023
Heterogeneous Deep Graph Infomax

Heterogeneous-Deep-Graph-Infomax Parameter Setting: HDGI-A: Node-level dimension: 16 Attention head: 4 Semantic-level attention vector: 8 learning rat

52 Oct 31, 2022
Independent and minimal implementations of some reinforcement learning algorithms using PyTorch (including PPO, A3C, A2C, ...).

PyTorch RL Minimal Implementations There are implementations of some reinforcement learning algorithms, whose characteristics are as follow: Less pack

Gemini Light 4 Dec 31, 2022
MixRNet(Using mixup as regularization and tuning hyper-parameters for ResNets)

MixRNet(Using mixup as regularization and tuning hyper-parameters for ResNets) Using mixup data augmentation as reguliraztion and tuning the hyper par

Bhanu 2 Jan 16, 2022