Source code for the paper "Local Additivity Based Data Augmentation for Semi-supervised NER"

Overview

LADA

This repo contains the code for the following paper:

Jiaao Chen*, Zhenghui Wang*, Ran Tian, Zichao Yang, Diyi Yang: Local Additivity Based Data Augmentation for Semi-supervised NER. In Proceedings of The 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP'2020)

If you would like to refer to it, please cite the paper mentioned above.

Getting Started

These instructions will get you running the LADA code.

Requirements

  • Python 3.6 or higher
  • PyTorch >= 1.4.0
  • pytorch_transformers (also known as transformers)
  • pandas, NumPy, pickle, faiss, sentence-transformers
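
Before running anything, it can help to confirm that these packages are importable. The following is a minimal sketch and not part of the repo: the import names are the usual ones for the packages listed above, and the helper name missing_packages is hypothetical.

```python
import importlib.util

# Usual import names for the packages listed above (assumed).
# pickle ships with Python, so it is not checked here.
REQUIRED = ["torch", "transformers", "pandas", "numpy", "faiss", "sentence_transformers"]


def missing_packages(names=REQUIRED):
    """Return the subset of `names` that the import system cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    print("Missing packages:", ", ".join(missing) if missing else "none")
```

The check only tests whether a package can be found; it does not verify version constraints such as PyTorch >= 1.4.0.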

Code Structure

├── code/
│   ├── BERT/
│   │   ├── back_translate.ipynb --> Jupyter Notebook for back translating the dataset
│   │   ├── bert_models.py --> Code for LADA-based BERT models
│   │   ├── eval_utils.py --> Code for evaluation
│   │   ├── knn.ipynb --> Jupyter Notebook for building the knn index file
│   │   ├── read_data.py --> Code for data pre-processing
│   │   ├── train.py --> Code for training the BERT model
│   │   └── ...
│   ├── flair/
│   │   ├── train.py --> Code for training the flair model
│   │   ├── knn.ipynb --> Jupyter Notebook for building the knn index file
│   │   ├── flair/ --> the flair library
│   │   │   └── ...
│   │   ├── resources/
│   │   │   ├── docs/ --> flair library docs
│   │   │   ├── taggers/ --> save evaluation results for flair model
│   │   │   └── tasks/
│   │   │       └── conll_03/
│   │   │           ├── sent_id_knn_749.pkl --> knn index file
│   │   │           └── ... --> CoNLL-2003 dataset
│   │   └── ...
├── data/
│   └── conll2003/
│       ├── de.pkl --> Back-translated training dataset with German as the intermediate language
│       ├── labels.txt --> label index file
│       ├── sent_id_knn_700.pkl --> knn index file
│       └── ... --> CoNLL-2003 dataset
├── eval/
│   └── conll2003/ --> save evaluation results for BERT model
└── README.md

BERT models

Downloading the data

Please download the CoNLL-2003 dataset and save under ./data/conll2003/ as train.txt, dev.txt, and test.txt.
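
A quick way to confirm the files are in place before training is sketched below. This is an illustrative helper, not part of the repo; the function name missing_files is hypothetical, and only the directory and file names come from the instructions above.

```python
from pathlib import Path

# File names expected by the instructions above.
EXPECTED = ("train.txt", "dev.txt", "test.txt")


def missing_files(data_dir, expected=EXPECTED):
    """Return the expected file names that are not present in `data_dir`."""
    root = Path(data_dir)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_files("./data/conll2003")
    print("Missing files:", missing or "none")
```

The same helper can be pointed at another directory by passing a different tuple of expected names.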

Pre-processing the data

We utilize Fairseq to perform back translation on the training dataset. Please refer to ./code/BERT/back_translate.ipynb for details.

As an example, we have placed one back-translated dataset, de.pkl, in ./data/conll2003/. You can use it directly for CoNLL-2003 or generate your own back-translated data by following ./code/BERT/back_translate.ipynb.

We also provide the kNN index file for the first 700 training sentences (5%): ./data/conll2003/sent_id_knn_700.pkl. You can use it directly for CoNLL-2003 or generate your own kNN index file by following ./code/BERT/knn.ipynb.
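
The internal layout of these pickle files is not described here, so the sketch below only loads a file and reports the type and length of the top-level object. It makes no assumption about the contents; describe_pickle is a hypothetical helper name, not a function from the repo.

```python
import pickle


def describe_pickle(path):
    """Load a pickle file and return (type name, length or None) of the top-level object."""
    with open(path, "rb") as f:
        obj = pickle.load(f)  # only unpickle files you trust
    size = len(obj) if hasattr(obj, "__len__") else None
    return type(obj).__name__, size


if __name__ == "__main__":
    for name in ("de.pkl", "sent_id_knn_700.pkl"):
        try:
            print(name, describe_pickle(f"./data/conll2003/{name}"))
        except FileNotFoundError:
            print(name, "not found")
```

Inspecting the provided files this way is a reasonable first step before generating your own with the notebooks, since your output should have the same shape.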

Training models

This section contains instructions for training models on CoNLL-2003 using 5% of the training data.

Training BERT+Intra-LADA model

python ./code/BERT/train.py --data-dir 'data/conll2003' --model-type 'bert' \
--model-name 'bert-base-multilingual-cased' --output-dir 'eval/conll2003' --gpu '0,1' \
--labels 'data/conll2003/labels.txt' --max-seq-length 164 --overwrite-output-dir \
--do-train --do-eval --do-predict --evaluate-during-training --batch-size 16 \
--num-train-epochs 20 --save-steps 750 --seed 1 --train-examples 700  --eval-batch-size 128 \
--pad-subtoken-with-real-label --eval-pad-subtoken-with-first-subtoken-only --label-sep-cls \
--mix-layers-set 8 9 10  --beta 1.5 --alpha 60  --mix-option --use-knn-train-data \
--num-knn-k 5 --knn-mix-ratio 0.5 --intra-mix-ratio 1 

Training BERT+Inter-LADA model

python ./code/BERT/train.py --data-dir 'data/conll2003' --model-type 'bert' \
--model-name 'bert-base-multilingual-cased' --output-dir 'eval/conll2003' --gpu '0,1' \
--labels 'data/conll2003/labels.txt' --max-seq-length 164 --overwrite-output-dir \
--do-train --do-eval --do-predict --evaluate-during-training --batch-size 16 \
--num-train-epochs 20 --save-steps 750 --seed 1 --train-examples 700  --eval-batch-size 128 \
--pad-subtoken-with-real-label --eval-pad-subtoken-with-first-subtoken-only --label-sep-cls \
--mix-layers-set 8 9 10  --beta 1.5 --alpha 60  --mix-option --use-knn-train-data \
--num-knn-k 5 --knn-mix-ratio 0.5 --intra-mix-ratio -1  
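
The two commands above differ only in --intra-mix-ratio (1 versus -1). As described in the paper, both variants interpolate token hidden representations: Intra-LADA mixes tokens within one sentence, while Inter-LADA mixes tokens across sentences. The sketch below illustrates only that interpolation step in plain Python. It assumes that --alpha and --beta parameterize a Beta distribution from which the mixing weight is drawn; this is a guess from the flag names, not confirmed by the README. All function names are hypothetical and do not come from train.py.

```python
import random


def mix_weight(alpha=60.0, beta=1.5, rng=random):
    """Draw a mixing weight in [0, 1]; Beta(alpha, beta) is an assumed choice."""
    return rng.betavariate(alpha, beta)


def interpolate(h_a, h_b, lam):
    """Mixup-style interpolation of two equal-length vectors."""
    if len(h_a) != len(h_b):
        raise ValueError("vectors must have the same length")
    return [lam * a + (1.0 - lam) * b for a, b in zip(h_a, h_b)]


def intra_mix(sentence, lam, rng=random):
    """Mix each token vector with another token of the same sentence (random permutation)."""
    perm = list(range(len(sentence)))
    rng.shuffle(perm)
    return [interpolate(sentence[i], sentence[j], lam) for i, j in enumerate(perm)]


def inter_mix(sentence, other, lam):
    """Mix token vectors position by position with a second, equally long sentence."""
    return [interpolate(a, b, lam) for a, b in zip(sentence, other)]
```

In a mixup-style setup the label distributions would be interpolated with the same weight as the hidden states. How the second sentence is chosen (for example via the kNN index) is outside the scope of this sketch.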

Training BERT+Semi-Intra-LADA model

python ./code/BERT/train.py --data-dir 'data/conll2003' --model-type 'bert' \
--model-name 'bert-base-multilingual-cased' --output-dir 'eval/conll2003' --gpu '0,1' \
--labels 'data/conll2003/labels.txt' --max-seq-length 164 --overwrite-output-dir \
--do-train --do-eval --do-predict --evaluate-during-training --batch-size 16 \
--num-train-epochs 20 --save-steps 750 --seed 1 --train-examples 700  --eval-batch-size 128 \
--pad-subtoken-with-real-label --eval-pad-subtoken-with-first-subtoken-only --label-sep-cls \
--mix-layers-set 8 9 10  --beta 1.5 --alpha 60  --mix-option --use-knn-train-data \
--num-knn-k 5 --knn-mix-ratio 0.5 --intra-mix-ratio 1 \
--u-batch-size 32 --semi --T 0.6 --sharp --weight 0.05 --semi-pkl-file 'de.pkl' \
--semi-num 10000 --semi-loss 'mse' --ignore-last-n-label 4  --warmup-semi --num-semi-iter 1 \
--semi-loss-method 'origin' 

Training BERT+Semi-Inter-LADA model

python ./code/BERT/train.py --data-dir 'data/conll2003' --model-type 'bert' \
--model-name 'bert-base-multilingual-cased' --output-dir 'eval/conll2003' --gpu '0,1' \
--labels 'data/conll2003/labels.txt' --max-seq-length 164 --overwrite-output-dir \
--do-train --do-eval --do-predict --evaluate-during-training --batch-size 16 \
--num-train-epochs 20 --save-steps 750 --seed 1 --train-examples 700  --eval-batch-size 128 \
--pad-subtoken-with-real-label --eval-pad-subtoken-with-first-subtoken-only --label-sep-cls \
--mix-layers-set 8 9 10  --beta 1.5 --alpha 60  --mix-option --use-knn-train-data \
--num-knn-k 5 --knn-mix-ratio 0.5 --intra-mix-ratio -1 \
--u-batch-size 32 --semi --T 0.6 --sharp --weight 0.05 --semi-pkl-file 'de.pkl' \
--semi-num 10000 --semi-loss 'mse' --ignore-last-n-label 4  --warmup-semi --num-semi-iter 1 \
--semi-loss-method 'origin' 
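
The semi-supervised commands add --T 0.6 and --sharp. The README does not define these flags; the sketch below assumes they refer to temperature sharpening of predicted label distributions, as commonly used in semi-supervised learning. The function name sharpen is hypothetical and this is an illustration of the general technique, not the repo's implementation.

```python
def sharpen(probs, temperature=0.6):
    """Raise probabilities to 1/T and renormalize; T < 1 makes the distribution peakier."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    powered = [p ** (1.0 / temperature) for p in probs]
    total = sum(powered)
    if total == 0:
        raise ValueError("probabilities must not all be zero")
    return [p / total for p in powered]


if __name__ == "__main__":
    guess = [0.7, 0.2, 0.1]  # illustrative model prediction for one token
    print(sharpen(guess))
```

With a temperature below 1 the largest probability grows and the others shrink, while a uniform distribution stays uniform.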

flair models

flair is a BiLSTM-CRF sequence labeling model. We provide code for flair+Inter-LADA.

Downloading the data

Please download the CoNLL-2003 dataset and save under ./code/flair/resources/tasks/conll_03/ as eng.train, eng.testa (dev), and eng.testb (test).

Pre-processing the data

We provide the kNN index file for the first 749 training sentences (5%, including the -DOCSTART- separator): ./code/flair/resources/tasks/conll_03/sent_id_knn_749.pkl. You can use it directly for CoNLL-2003 or generate your own kNN index file by following ./code/flair/knn.ipynb.
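
The flair count (749) is higher than the BERT count (700) because -DOCSTART- separator lines are counted as sentences here. The sketch below counts blank-line-separated blocks in a CoNLL-style file and reports how many of them are -DOCSTART- markers, so you can check what 5% corresponds to on your copy of the data. It assumes the usual CoNLL layout of one token per line with blank lines between sentences; count_blocks is a hypothetical helper, not part of the repo.

```python
def count_blocks(lines):
    """Return (number of blank-line-separated blocks, number of -DOCSTART- blocks)."""
    total = docstart = 0
    in_block = False
    for line in lines:
        stripped = line.strip()
        if not stripped:
            in_block = False  # a blank line ends the current block
            continue
        if not in_block:  # first line of a new block
            in_block = True
            total += 1
            if stripped.split()[0] == "-DOCSTART-":
                docstart += 1
    return total, docstart


if __name__ == "__main__":
    path = "./code/flair/resources/tasks/conll_03/eng.train"
    try:
        with open(path, encoding="utf-8") as f:
            total, docstart = count_blocks(f)
        print("with separators:", total, "without:", total - docstart)
    except FileNotFoundError:
        print(path, "not found")
```

If the counts are as expected, 5% of the total with separators should land near 749 and 5% of the total without them near 700.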

Training models

This section contains instructions for training models on CoNLL-2003 using 5% of the training data.

Training flair+Inter-LADA model

CUDA_VISIBLE_DEVICES=1 python ./code/flair/train.py --use-knn-train-data --num-knn-k 5 \
--knn-mix-ratio 0.6 --train-examples 749 --mix-layer 2  --mix-option --alpha 60 --beta 1.5 \
--exp-save-name 'mix'  --mini-batch-size 64  --patience 10 --use-crf 