PyTorch Implementation for AAAI'21 "Do Response Selection Models Really Know What's Next? Utterance Manipulation Strategies for Multi-turn Response Selection"

Overview

UMS for Multi-turn Response Selection

PWC

Implements the model described in the following paper Do Response Selection Models Really Know What's Next? Utterance Manipulation Strategies for Multi-turn Response Selection.

@inproceedings{whang2021ums,
  title={Do Response Selection Models Really Know What's Next? Utterance Manipulation Strategies for Multi-turn Response Selection},
  author={Whang, Taesun and Lee, Dongyub and Oh, Dongsuk and Lee, Chanhee and Han, Kijong and Lee, Dong-hun and Lee, Saebyeok},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  year={2021}
}

This code is reimplemented as a fork of huggingface/transformers and taesunwhang/BERT-ResSel.

alt text

Setup and Dependencies

This code is implemented using PyTorch v1.6.0, and provides out of the box support with CUDA 10.1 and CuDNN 7.6.5.

Anaconda / Miniconda is the recommended to set up this codebase.

Anaconda or Miniconda

Clone this repository and create an environment:

git clone https://www.github.com/taesunwhang/UMS-ResSel
conda create -n ums_ressel python=3.7

# activate the environment and install all dependencies
conda activate ums_ressel
cd UMS-ResSel

# https://pytorch.org
pip install torch==1.6.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html
pip install -r requirements.txt

Preparing Data and Checkpoints

Pre- and Post-trained Checkpoints

We provide following pre- and post-trained checkpoints.

sh scripts/download_pretrained_checkpoints.sh

Data pkls for Fine-tuning (Response Selection)

Original version for each dataset is availble in Ubuntu Corpus V1, Douban Corpus, and E-Commerce Corpus, respectively.

sh scripts/download_datasets.sh

Domain-specific Post-Training

Post-training Creation

Data for post-training BERT
#Ubuntu Corpus V1
sh scripts/create_bert_post_data_creation_ubuntu.sh
#Douban Corpus
sh scripts/create_bert_post_data_creation_douban.sh
#E-commerce Corpus
sh scripts/create_bert_post_data_creation_e-commerce.sh
Data for post-training ELECTRA
sh scripts/download_electra_post_training_pkl.sh

Post-training Examples

BERT+ (e.g., Ubuntu Corpus V1)
python3 main.py --model bert_post_training --task_name ubuntu --data_dir data/ubuntu_corpus_v1 --bert_pretrained bert-base-uncased --bert_checkpoint_path bert-base-uncased-pytorch_model.bin --task_type response_selection --gpu_ids "0" --root_dir /path/to/root_dir --training_type post_training
ELECTRA+ (e.g., Douban Corpus)
python3 main.py --model electra_post_training --task_name douban --data_dir data/electra_post_training --bert_pretrained electra-base-chinese --bert_checkpoint_path electra-base-chinese-pytorch_model.bin --task_type response_selection --gpu_ids "0" --root_dir /path/to/root_dir --training_type post_training

Training Response Selection Models

Model Arguments

BERT-Base
task_name data_dir bert_pretrained bert_checkpoint_path
ubuntu data/ubuntu_corpus_v1 bert-base-uncased bert-base-uncased-pytorch_model.bin
douban
e-commerce
data/douban
data/e-commerce
bert-base-wwm-chinese bert-base-wwm-chinese_model.bin
BERT-Post
task_name data_dir bert_pretrained bert_checkpoint_path
ubuntu data/ubuntu_corpus_v1 bert-post-uncased bert-post-uncased-pytorch_model.pth
douban data/douban bert-post-douban bert-post-douban-pytorch_model.pth
e-commerce data/e-commerce bert-post-ecommerce bert-post-ecommerce-pytorch_model.pth
ELECTRA-Base
task_name data_dir bert_pretrained bert_checkpoint_path
ubuntu data/ubuntu_corpus_v1 electra-base electra-base-pytorch_model.bin
douban
e-commerce
data/douban
data/e-commerce
electra-base-chinese electra-base-chinese-pytorch_model.bin
ELECTRA-Post
task_name data_dir bert_pretrained bert_checkpoint_path
ubuntu data/ubuntu_corpus_v1 electra-post electra-post-pytorch_model.pth
douban data/douban electra-post-douban electra-post-douban-pytorch_model.pth
e-commerce data/e-commerce electra-post-ecommerce electra-post-ecommerce-pytorch_model.pth

Fine-tuning Examples

BERT+ (e.g., Ubuntu Corpus V1)
python3 main.py --model bert_post --task_name ubuntu --data_dir data/ubuntu_corpus_v1 --bert_pretrained bert-post-uncased --bert_checkpoint_path bert-post-uncased-pytorch_model.pth --task_type response_selection --gpu_ids "0" --root_dir /path/to/root_dir
UMS BERT+ (e.g., Douban Corpus)
python3 main.py --model bert_post --task_name douban --data_dir data/douban --bert_pretrained bert-post-douban --bert_checkpoint_path bert-post-douban-pytorch_model.pth --task_type response_selection --gpu_ids "0" --root_dir /path/to/root_dir --multi_task_type "ins,del,srch"
UMS ELECTRA (e.g., E-Commerce)
python3 main.py --model electra_base --task_name e-commerce --data_dir data/e-commerce --bert_pretrained electra-base-chinese --bert_checkpoint_path electra-base-chinese-pytorch_model.bin --task_type response_selection --gpu_ids "0" --root_dir /path/to/root_dir --multi_task_type "ins,del,srch"

Evaluation

To evaluate the model, set --evaluate to /path/to/checkpoints

UMS BERT+ (e.g., Ubuntu Corpus V1)
python3 main.py --model bert_post --task_name ubuntu --data_dir data/ubuntu_corpus_v1 --bert_pretrained bert-post-uncased --bert_checkpoint_path bert-post-uncased-pytorch_model.pth --task_type response_selection --gpu_ids "0" --root_dir /path/to/root_dir --evaluate /path/to/checkpoints --multi_task_type "ins,del,srch"

Performance

We provide model checkpoints of UMS-BERT+, which obtained new state-of-the-art, for each dataset.

Ubuntu [email protected] [email protected] [email protected]
UMS-BERT+ 0.875 0.942 0.988
Douban MAP MRR [email protected] [email protected] [email protected] [email protected]
UMS-BERT+ 0.625 0.664 0.499 0.318 0.482 0.858
E-Commerce [email protected] [email protected] [email protected]
UMS-BERT+ 0.762 0.905 0.986
Owner
Taesun Whang
Interested in NLP, Dialogue System, Multimodal Learning. Currently attending Master's course in Dept. of Computer Science and Engineering, Korea University.
Taesun Whang
Generating Images with Recurrent Adversarial Networks

Generating Images with Recurrent Adversarial Networks Python (Theano) implementation of Generating Images with Recurrent Adversarial Networks code pro

Daniel Jiwoong Im 121 Sep 08, 2022
Lucid Sonic Dreams syncs GAN-generated visuals to music.

Lucid Sonic Dreams Lucid Sonic Dreams syncs GAN-generated visuals to music. By default, it uses NVLabs StyleGAN2, with pre-trained models lifted from

731 Jan 02, 2023
Quickly and easily create / train a custom DeepDream model

Dream-Creator This project aims to simplify the process of creating a custom DeepDream model by using pretrained GoogleNet models and custom image dat

55 Dec 27, 2022
Source code for CAST - Crisis Domain Adaptation Using Sequence-to-sequence Transformers (Accepted to ISCRAM 2021, CorePaper).

Source code for CAST: Crisis Domain Adaptation UsingSequence-to-sequenceTransformers (Paper, BibTeX, Accepted to ISCRAM 2021, CorePaper) Quick start D

Congcong Wang 0 Jul 14, 2021
Official implementation of the paper "Topographic VAEs learn Equivariant Capsules"

Topographic Variational Autoencoder Paper: https://arxiv.org/abs/2109.01394 Getting Started Install requirements with Anaconda: conda env create -f en

T. Andy Keller 69 Dec 12, 2022
Implementation of "DeepOrder: Deep Learning for Test Case Prioritization in Continuous Integration Testing".

DeepOrder Implementation of DeepOrder for the paper "DeepOrder: Deep Learning for Test Case Prioritization in Continuous Integration Testing". Project

6 Nov 07, 2022
Implementation of trRosetta and trDesign for Pytorch, made into a convenient package

trRosetta - Pytorch (wip) Implementation of trRosetta and trDesign for Pytorch, made into a convenient package

Phil Wang 67 Dec 17, 2022
This repository contains a PyTorch implementation of "AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis".

AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis | Project Page | Paper | PyTorch implementation for the paper "AD-NeRF: Audio

551 Dec 29, 2022
Styleformer - Official Pytorch Implementation

Styleformer -- Official PyTorch implementation Styleformer: Transformer based Generative Adversarial Networks with Style Vector(https://arxiv.org/abs/

Jeeseung Park 159 Dec 12, 2022
maximal update parametrization (µP)

Maximal Update Parametrization (μP) and Hyperparameter Transfer (μTransfer) Paper link | Blog link In Tensor Programs V: Tuning Large Neural Networks

Microsoft 694 Jan 03, 2023
This project provides the proof of the uniqueness of the equilibrium and the global asymptotic stability.

Delayed-cellular-neural-network This project provides the proof of the uniqueness of the equilibrium and the global asymptotic stability. There is als

4 Apr 28, 2022
Lama-cleaner: Image inpainting tool powered by LaMa

Lama-cleaner: Image inpainting tool powered by LaMa

Qing 5.8k Jan 05, 2023
A no-BS, dead-simple training visualizer for tf-keras

A no-BS, dead-simple training visualizer for tf-keras TrainingDashboard Plot inter-epoch and intra-epoch loss and metrics within a jupyter notebook wi

Vibhu Agrawal 3 May 28, 2021
Repository for the paper "Online Domain Adaptation for Occupancy Mapping", RSS 2020

RSS 2020 - Online Domain Adaptation for Occupancy Mapping Repository for the paper "Online Domain Adaptation for Occupancy Mapping", Robotics: Science

Anthony 26 Sep 22, 2022
[CVPR 2022] Unsupervised Image-to-Image Translation with Generative Prior

GP-UNIT - Official PyTorch Implementation This repository provides the official PyTorch implementation for the following paper: Unsupervised Image-to-

Shuai Yang 125 Jan 03, 2023
RCT-ART is an NLP pipeline built with spaCy for converting clinical trial result sentences into tables through jointly extracting intervention, outcome and outcome measure entities and their relations.

Randomised controlled trial abstract result tabulator RCT-ART is an NLP pipeline built with spaCy for converting clinical trial result sentences into

2 Sep 16, 2022
Key information extraction from invoice document with Graph Convolution Network

Key Information Extraction from Scanned Invoices Key information extraction from invoice document with Graph Convolution Network Related blog post fro

Phan Hoang 39 Dec 16, 2022
Object detection evaluation metrics using Python.

Object detection evaluation metrics using Python.

Louis Facun 2 Sep 06, 2022
SIR model parameter estimation using a novel algorithm for differentiated uniformization.

TenSIR Parameter estimation on epidemic data under the SIR model using a novel algorithm for differentiated uniformization of Markov transition rate m

The Spang Lab 4 Nov 30, 2022