Improving Non-autoregressive Generation with Mixup Training

Related tags

Deep LearningMIST
Overview

MIST

Training MIST

TRAIN_FILE=/your/path/to/train.json
VALID_FILE=/your/path/to/valid.json
OUTPUT_DIR=/your/path/to/save_checkpoints
CACHE_DIR=/your/path/to/transformer_package_cache

MODEL_PATH=bert-base-uncased or models/unilm1.2-base-uncased

# squadqg 30005 steps
# squadqg 50005 steps
# xsum 600005 steps
STEPS=30005

python -m torch.distributed.launch --nproc_per_node=4 train.py\
  --train_file $TRAIN_FILE\
  --valid_file $VALID_FILE\
  --output_dir $OUTPUT_PATH\
  --model_type nat --model_name_or_path $MODEL_PATH\
  --do_lower_case --max_source_seq_length 464 --max_target_seq_length 48\
  --per_gpu_train_batch_size 16 --gradient_accumulation_steps 1\
  --learning_rate 3e-5 --num_warmup_steps 500 --num_training_steps $STEPS\
  --cache_dir $CACHE_DIR\
  --log_dir ${OUTPUT_PATH}/log\
  --keep_prob 0.0\
  --random_prob 0.0\
  --use_glat\
  --tqdm_miniters 100\
  --cotrain_put_target_in_source\ 
  --cotrain_put_target_in_source_same_bert\ 
  --wandb\ # logging with wandb
  --fp16\
  --fp16_opt_level O2

Removing the cotrain_put_target_in_source and cotrain_put_target_in_source_same_bert flags to reproduce the results without MIST.

Download Unilm

mkdir -p models/unilm1.2-base-uncased
cd models/unilm1.2-base-uncased
wget https://unilm.blob.core.windows.net/ckpt/unilm1.2-base-uncased.bin -O pytorch_model.bin
wget https://unilm.blob.core.windows.net/ckpt/unilm1.2-base-uncased-vocab.txt -O vocab.txt
wget https://unilm.blob.core.windows.net/ckpt/unilm1.2-base-uncased-config.json -O config.json

Download datasets

Json dataset links: squadqg, xsum and quora

Training NAT MASS

To reproduce the results of NAT MASS, please refer to the ./MASS-NAT/mass-nat.sh

Collective Multi-type Entity Alignment Between Knowledge Graphs (WWW'20)

CG-MuAlign A reference implementation for "Collective Multi-type Entity Alignment Between Knowledge Graphs", published in WWW 2020. If you find our pa

Bran Zhu 28 Dec 11, 2022
PyStan, a Python interface to Stan, a platform for statistical modeling. Documentation: https://pystan.readthedocs.io

PyStan NOTE: This documentation describes a BETA release of PyStan 3. PyStan is a Python interface to Stan, a package for Bayesian inference. Stan® is

Stan 229 Dec 29, 2022
A simple baseline for the 2022 IEEE GRSS Data Fusion Contest (DFC2022)

DFC2022 Baseline A simple baseline for the 2022 IEEE GRSS Data Fusion Contest (DFC2022) This repository uses TorchGeo, PyTorch Lightning, and Segmenta

isaac 24 Nov 28, 2022
Simple STAC Catalogs discovery tool.

STAC Catalog Discovery Simple STAC discovery tool. Just paste the STAC Catalog link and press Enter. Details STAC Discovery tool enables discovering d

Mykola Kozyr 21 Oct 19, 2022
Dogs classification with Deep Metric Learning using some popular losses

Tsinghua Dogs classification with Deep Metric Learning 1. Introduction Tsinghua Dogs dataset Tsinghua Dogs is a fine-grained classification dataset fo

QuocThangNguyen 45 Nov 09, 2022
Image Segmentation Evaluation

Image Segmentation Evaluation Martin Keršner, [email protected] Evaluation

Martin Kersner 273 Oct 28, 2022
Speed-Test - You can check your intenet speed using this tool

Speed-Test Tool By Hez_X AVAILABLE ON : Termux & Kali linux & Ubuntu (Linux E

Hez-X 3 Feb 17, 2022
The Pytorch implementation for "Video-Text Pre-training with Learned Regions"

Region_Learner The Pytorch implementation for "Video-Text Pre-training with Learned Regions" (arxiv) We are still cleaning up the code further and pre

Rui Yan 0 Mar 20, 2022
BuildingNet: Learning to Label 3D Buildings

BuildingNet This is the implementation of the BuildingNet architecture described in this paper: Paper: BuildingNet: Learning to Label 3D Buildings Arx

16 Nov 07, 2022
Code implementation for the paper 'Conditional Gaussian PAC-Bayes'.

CondGauss This repository contains PyTorch code for the paper Stochastic Gaussian PAC-Bayes. A novel PAC-Bayesian training method is implemented. Ther

0 Nov 01, 2021
Pytorch Implementation for Dilated Continuous Random Field

DilatedCRF Pytorch implementation for fully-learnable DilatedCRF. If you find my work helpful, please consider our paper: @article{Mo2022dilatedcrf,

DunnoCoding_Plus 3 Nov 13, 2022
PyTorch inference for "Progressive Growing of GANs" with CelebA snapshot

Progressive Growing of GANs inference in PyTorch with CelebA training snapshot Description This is an inference sample written in PyTorch of the origi

320 Nov 21, 2022
Deep Residual Learning for Image Recognition

Deep Residual Learning for Image Recognition This is a Torch implementation of "Deep Residual Learning for Image Recognition",Kaiming He, Xiangyu Zhan

Kimmy 561 Dec 01, 2022
An efficient framework for reinforcement learning.

rl: An efficient framework for reinforcement learning Requirements Introduction PPO Test Requirements name version Python =3.7 numpy =1.19 torch =1

16 Nov 30, 2022
SE3 Pose Interp - Interpolate camera pose or trajectory in SE3, pose interpolation, trajectory interpolation

SE3 Pose Interpolation Pose estimated from SLAM system are always discrete, and

Ran Cheng 4 Dec 15, 2022
Fast, accurate and reliable software for algebraic CT reconstruction

KCT CBCT Fast, accurate and reliable software for algebraic CT reconstruction. This set of software tools includes OpenCL implementation of modern CT

Vojtěch Kulvait 4 Dec 14, 2022
Accelerated SMPL operation, commonly used in generate 3D human mesh, STAR included.

SMPL2 An enchanced and accelerated SMPL operation which commonly used in 3D human mesh generation. It takes a poses, shapes, cam_trans as inputs, outp

JinTian 20 Oct 17, 2022
Technical Analysis library in pandas for backtesting algotrading and quantitative analysis

bta-lib - A pandas based Technical Analysis Library bta-lib is pandas based technical analysis library and part of the backtrader family. Links Main P

DRo 393 Dec 20, 2022
Automatic Number Plate Recognition using Contours and Convolution Neural Networks (CNN)

Cite our paper if you find this project useful https://www.ijariit.com/manuscripts/v7i4/V7I4-1139.pdf Abstract Image processing technology is used in

Adithya M 2 Jun 28, 2022
MODNet: Trimap-Free Portrait Matting in Real Time

MODNet is a model for real-time portrait matting with only RGB image input.

Zhanghan Ke 2.8k Dec 30, 2022