Fine-grained Post-training for Improving Retrieval-based Dialogue Systems - NAACL 2021

Related tags

Deep LearningBERT_FP
Overview

Fine-grained Post-training for Multi-turn Response Selection

PWC

Implements the model described in the following paper Fine-grained Post-training for Improving Retrieval-based Dialogue Systems in NAACL-2021.

@inproceedings{han-etal-2021-fine,
title = "Fine-grained Post-training for Improving Retrieval-based Dialogue Systems",
author = "Han, Janghoon  and Hong, Taesuk  and Kim, Byoungjae  and Ko, Youngjoong  and Seo, Jungyun",
booktitle = "Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies",
month = jun, year = "2021", address = "Online", publisher = "Association for Computational Linguistics", url = "https://www.aclweb.org/anthology/2021.naacl-main.122", pages = "1549--1558",
}

This code is reimplemented as a fork of huggingface/transformers.

alt text

Setup and Dependencies

This code is implemented using PyTorch v1.8.0, and provides out of the box support with CUDA 11.2 Anaconda is the recommended to set up this codebase.

# https://pytorch.org
conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 cudatoolkit=11.1 -c pytorch -c conda-forge
pip install -r requirements.txt

Preparing Data and Checkpoints

Post-trained and fine-tuned Checkpoints

We provide following post-trained and fine-tuned checkpoints.

Data pkl for Fine-tuning (Response Selection)

We used the following data for post-training and fine-tuning

Original version for each dataset is availble in Ubuntu Corpus V1, Douban Corpus, and E-Commerce Corpus, respectively.

Fine-grained Post-Training

Making Data for post-training and fine-tuning
Data_processing.py

Post-training Examples

(Ubuntu Corpus V1, Douban Corpus, E-commerce Corpus)
python -u FPT/ubuntu_final.py --num_train_epochs 25
python -u FPT/douban_final.py --num_train_epochs 27
python -u FPT/e_commmerce_final.py --num_train_epochs 34

Fine-tuning Examples

(Ubuntu Corpus V1, Douban Corpus, E-commerce Corpus)
Taining
To train the model, set `--is_training`
python -u Fine-Tuning/Response_selection.py --task ubuntu --is_training
python -u Fine-Tuning/Response_selection.py --task douban --is_training
python -u Fine-Tuning/Response_selection.py --task e_commerce --is_training
Testing
python -u Fine-Tuning/Response_selection.py --task ubuntu
python -u Fine-Tuning/Response_selection.py --task douban 
python -u Fine-Tuning/Response_selection.py --task e_commerce

Training Response Selection Models

Model Arguments

Fine-grained post-training
task_name data_dir checkpoint_path
ubuntu ubuntu_data/ubuntu_post_train.pkl FPT/PT_checkpoint/ubuntu/bert.pt
douban douban_data/douban_post_train.pkl FPT/PT_checkpoint/douban/bert.pt
e-commerce e_commerce_data/e_commerce_post_train.pkl FPT/PT_checkpoint/e_commerce/bert.pt
Fine-tuning
task_name data_dir checkpoint_path
ubuntu ubuntu_data/ubuntu_dataset_1M.pkl Fine-Tuning/FT_checkpoint/ubuntu.0.pt
douban douban_data/douban_dataset_1M.pkl Fine-Tuning/FT_checkpoint/douban.0.pt
e-commerce e_commerce_data/e_commerce_dataset_1M.pkl Fine-Tuning/FT_checkpoint/e_commerce.0.pt

Performance

We provide model checkpoints of BERT_FP, which obtained new state-of-the-art, for each dataset.

Ubuntu [email protected] [email protected] [email protected]
[BERT_FP] 0.911 0.962 0.994
Douban MAP MRR [email protected] [email protected] [email protected] [email protected]
[BERT_FP] 0.644 0.680 0.512 0.324 0.542 0.870
E-Commerce [email protected] [email protected] [email protected]
[BERT_FP] 0.870 0.956 0.993
Owner
Janghoon Han
NLP Researcher
Janghoon Han
The official PyTorch implementation of Curriculum by Smoothing (NeurIPS 2020, Spotlight).

Curriculum by Smoothing (NeurIPS 2020) The official PyTorch implementation of Curriculum by Smoothing (NeurIPS 2020, Spotlight). For any questions reg

PAIR Lab 36 Nov 23, 2022
Official Pytorch implementation of Scene Representation Networks: Continuous 3D-Structure-Aware Neural Scene Representations

Scene Representation Networks This is the official implementation of the NeurIPS submission "Scene Representation Networks: Continuous 3D-Structure-Aw

Vincent Sitzmann 365 Jan 06, 2023
The lightweight PyTorch wrapper for high-performance AI research. Scale your models, not the boilerplate.

The lightweight PyTorch wrapper for high-performance AI research. Scale your models, not the boilerplate. Website • Key Features • How To Use • Docs •

Pytorch Lightning 21.1k Dec 29, 2022
PyTorch implementation of Tacotron speech synthesis model.

tacotron_pytorch PyTorch implementation of Tacotron speech synthesis model. Inspired from keithito/tacotron. Currently not as much good speech quality

Ryuichi Yamamoto 279 Dec 09, 2022
Contextualized Perturbation for Textual Adversarial Attack, NAACL 2021

Contextualized Perturbation for Textual Adversarial Attack Introduction This is a PyTorch implementation of Contextualized Perturbation for Textual Ad

cookielee77 30 Jan 01, 2023
Manim is an engine for precise programmatic animations, designed for creating explanatory math videos

Manim is an engine for precise programmatic animations, designed for creating explanatory math videos. Note, there are two versions of manim. This rep

Grant Sanderson 49k Jan 09, 2023
PyTorch implementation of MuseMorphose, a Transformer-based model for music style transfer.

MuseMorphose This repository contains the official implementation of the following paper: Shih-Lun Wu, Yi-Hsuan Yang MuseMorphose: Full-Song and Fine-

Yating Music, Taiwan AI Labs 142 Jan 08, 2023
Tzer: TVM Implementation of "Coverage-Guided Tensor Compiler Fuzzing with Joint IR-Pass Mutation (OOPSLA'22)“.

Artifact • Reproduce Bugs • Quick Start • Installation • Extend Tzer Coverage-Guided Tensor Compiler Fuzzing with Joint IR-Pass Mutation This is the s

12 Dec 29, 2022
CausaLM: Causal Model Explanation Through Counterfactual Language Models

CausaLM: Causal Model Explanation Through Counterfactual Language Models Authors: Amir Feder, Nadav Oved, Uri Shalit, Roi Reichart Abstract: Understan

Amir Feder 39 Jul 10, 2022
(3DV 2021 Oral) Filtering by Cluster Consistency for Large-Scale Multi-Image Matching

Scalable Cluster-Consistency Statistics for Robust Multi-Object Matching (3DV 2021 Oral Presentation) Filtering by Cluster Consistency (FCC) is a very

Yunpeng Shi 11 Sep 28, 2022
Research code for Arxiv paper "Camera Motion Agnostic 3D Human Pose Estimation"

GMR(Camera Motion Agnostic 3D Human Pose Estimation) This repo provides the source code of our arXiv paper: Seong Hyun Kim, Sunwon Jeong, Sungbum Park

Seong Hyun Kim 1 Feb 07, 2022
Code for the CVPR 2021 paper "Triple-cooperative Video Shadow Detection"

Triple-cooperative Video Shadow Detection Code and dataset for the CVPR 2021 paper "Triple-cooperative Video Shadow Detection"[arXiv link] [official l

Zhihao Chen 24 Oct 04, 2022
A sequence of Jupyter notebooks featuring the 12 Steps to Navier-Stokes

CFD Python Please cite as: Barba, Lorena A., and Forsyth, Gilbert F. (2018). CFD Python: the 12 steps to Navier-Stokes equations. Journal of Open Sour

Barba group 2.6k Dec 30, 2022
Collections for the lasted paper about multi-view clustering methods (papers, codes)

Multi-View Clustering Papers Collections for the lasted paper about multi-view clustering methods (papers, codes). There also exists some repositories

Andrew Guan 10 Sep 20, 2022
RoMa: A lightweight library to deal with 3D rotations in PyTorch.

RoMa: A lightweight library to deal with 3D rotations in PyTorch. RoMa (which stands for Rotation Manipulation) provides differentiable mappings betwe

NAVER 90 Dec 27, 2022
ML course - EPFL Machine Learning Course, Fall 2021

EPFL Machine Learning Course CS-433 Machine Learning Course, Fall 2021 Repository for all lecture notes, labs and projects - resources, code templates

EPFL Machine Learning and Optimization Laboratory 1k Jan 04, 2023
PolyGlot, a fuzzing framework for language processors

PolyGlot, a fuzzing framework for language processors Build We tested PolyGlot on Ubuntu 18.04. Get the source code: git clone https://github.com/s3te

Software Systems Security Team at Penn State University 79 Dec 27, 2022
Pytorch implementation of Learning Rate Dropout.

Learning-Rate-Dropout Pytorch implementation of Learning Rate Dropout. Paper Link: https://arxiv.org/pdf/1912.00144.pdf Train ResNet-34 for Cifar10: r

42 Nov 25, 2022
Robocop is your personal mini voice assistant made using Python.

Robocop-VoiceAssistant To use this project, you should have python installed in your system. If you don't have python installed, install it beforehand

Sohil Khanduja 3 Feb 26, 2022
Deepparse is a state-of-the-art library for parsing multinational street addresses using deep learning

Here is deepparse. Deepparse is a state-of-the-art library for parsing multinational street addresses using deep learning. Use deepparse to Use the pr

GRAAL/GRAIL 192 Dec 20, 2022