
Overview

Generating Persona Consistent Dialogues by Exploiting Natural Language Inference

Source code for RCDG model in AAAI20 Generating Persona Consistent Dialogues by Exploiting Natural Language Inference, a natural language inference (NLI) enhanced reinforcement learning dialogue model.

Requirements:

The code is tested under the following env:

  • Python 3.6
  • Pytorch 0.3.1

Install with conda: conda install pytorch==0.3.1 torchvision cudatoolkit=7.5 -c pytorch

The released code has been tested on a Titan Xp GPU (12 GB).

Data

We have provided some data samples in ./data to show the format. For the full datasets, please refer to the original dataset papers.
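As a rough, hypothetical sanity check (not part of the released code), you can verify that the dialogue files are line-aligned, one example per line, which is how the preprocess flags in Step 0 below pair them; the NLI files form a separate training set:

# Hypothetical sanity check (illustration only): the src / tgt / per files are assumed
# to be line-aligned, while the nli file is a separate NLI training set.
from pathlib import Path

aligned = ["data/src-train.txt", "data/tgt-train.txt", "data/per-train.txt"]
counts = {f: sum(1 for _ in Path(f).open(encoding="utf-8")) for f in aligned}
print(counts)
assert len(set(counts.values())) == 1, "dialogue files should have the same number of lines"
print("nli examples:", sum(1 for _ in Path("data/nli-train.txt").open(encoding="utf-8")))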

How to Run:

For an easier way to run the code, the NLI model here is GRU+MLP, i.e., RCDG_base, and the time-consuming Monte Carlo (MC) search is removed.
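For reference, the snippet below is a minimal, hypothetical sketch of such a GRU+MLP NLI classifier (written against a recent PyTorch for readability; the released code targets PyTorch 0.3.1 and its internals may differ):

# Hypothetical sketch of a GRU+MLP NLI classifier (illustration only, not the repo's code).
import torch
import torch.nn as nn

class GRUMLPNLI(nn.Module):
    """Encode premise (persona) and hypothesis (response) with a shared GRU,
    then classify the pair into entailment / neutral / contradiction with an MLP."""

    def __init__(self, vocab_size, emb_size=300, rnn_size=500, num_classes=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_size, padding_idx=0)
        self.encoder = nn.GRU(emb_size, rnn_size, batch_first=True)
        self.mlp = nn.Sequential(
            nn.Linear(4 * rnn_size, rnn_size),  # features: [p; h; |p-h|; p*h]
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(rnn_size, num_classes),
        )

    def encode(self, tokens):
        _, h = self.encoder(self.embed(tokens))   # h: (1, batch, rnn_size)
        return h.squeeze(0)

    def forward(self, premise, hypothesis):
        p, h = self.encode(premise), self.encode(hypothesis)
        features = torch.cat([p, h, (p - h).abs(), p * h], dim=-1)
        return self.mlp(features)                 # logits over the 3 NLI labels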

Here are a few steps to run this code:

0. Prepare Data

python preprocess.py -train_src data/src-train.txt -train_tgt data/tgt-train.txt -train_per data/per-train.txt -valid_src data/src-val.txt -valid_tgt data/tgt-val.txt -valid_per data/per-val.txt -train_nli data/nli-train.txt -valid_nli data/nli-valid.txt -save_data data/nli_persona -src_vocab_size 18300 -tgt_vocab_size 18300 -share_vocab

As introduced in the paper, training proceeds in several stages:

1. NLI model Pretrain

cd NLI_pretrain/

python train.py -data ../data/nli_persona -batch_size 32 -save_model saved_model/consistent_dialogue -rnn_size 500 -word_vec_size 300 -dropout 0.2 -epochs 5 -learning_rate_decay 1 -gpu 0
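The nli: value in the log below is read here as the classifier's batch accuracy. A hypothetical pretraining step for the GRU+MLP sketch above, assuming 3-way labels (entailment / neutral / contradiction):

# Hypothetical NLI pretraining step (illustration only, not the repo's training loop).
import torch.nn.functional as F

def nli_pretrain_step(model, premise, hypothesis, labels, optimizer):
    logits = model(premise, hypothesis)           # e.g. the GRUMLPNLI sketch above
    loss = F.cross_entropy(logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    # returned value mirrors how the "nli:" number is interpreted here: batch accuracy
    return (logits.argmax(dim=-1) == labels).float().mean().item()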

You should see something like:

Loading train dataset from ../data/nli_persona.train.1.pt, number of examples: 131432
Epoch  1, nli_step     1/ 4108; nli: 0.28125
Epoch  1, nli_step    11/ 4108; nli: 0.38125
Epoch  1, nli_step    21/ 4108; nli: 0.43438
Epoch  1, nli_step    31/ 4108; nli: 0.48125
Epoch  1, nli_step    41/ 4108; nli: 0.53750
Epoch  1, nli_step    51/ 4108; nli: 0.56250
Epoch  1, nli_step    61/ 4108; nli: 0.49062
...

2. Generator G Pretrain

cd ../G_pretrain/

python train.py -data ../data/nli_persona -batch_size 32 -rnn_size 500 -word_vec_size 300  -dropout 0.2 -epochs 15 -g_optim adam -g_learning_rate 1e-3 -learning_rate_decay 1 -train_from PATH_TO_PRETRAINED_NLI -gpu 0

Here the PATH_TO_PRETRAINED_NLI should be replaced by your model path, e.g., ../NLI_pretrain/saved_model/consistent_dialogue_e3.pt.
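The teacher_force lines in the log below report the generator's word accuracy and perplexity. As a hypothetical illustration (not the repo's code), teacher forcing feeds the gold previous token at every decoding step, and ppl is the exponential of the average per-token negative log-likelihood:

# Hypothetical illustration of teacher forcing and perplexity (not the repo's code).
import math
import torch.nn.functional as F

def teacher_forced_ppl(decoder, memory, tgt):
    """tgt: (batch, len) gold responses starting with BOS; padding ignored for brevity."""
    total_nll, total_tokens = 0.0, 0
    state = decoder.init_state(memory)                 # assumed decoder interface
    for t in range(tgt.size(1) - 1):
        logits, state = decoder(tgt[:, t], state)      # feed the gold token at step t
        total_nll += F.cross_entropy(logits, tgt[:, t + 1], reduction="sum").item()
        total_tokens += tgt.size(0)
    return math.exp(total_nll / total_tokens)          # perplexity, as in the log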

You should see the ppl come down during training, which means the dialogue model is training properly:

Loading train dataset from ../data/nli_persona.train.1.pt, number of examples: 131432
Epoch  4, teacher_force     1/ 4108; acc:   0.00; ppl: 18619.76; 125 src tok/s; 162 tgt tok/s;      3 s elapsed
Epoch  4, teacher_force    11/ 4108; acc:   9.69; ppl: 2816.01; 4159 src tok/s; 5468 tgt tok/s;      3 s elapsed
Epoch  4, teacher_force    21/ 4108; acc:   9.78; ppl: 550.46; 5532 src tok/s; 6116 tgt tok/s;      4 s elapsed
Epoch  4, teacher_force    31/ 4108; acc:  11.15; ppl: 383.06; 5810 src tok/s; 6263 tgt tok/s;      5 s elapsed
...
Epoch  4, teacher_force   941/ 4108; acc:  25.40; ppl:  90.18; 5993 src tok/s; 6645 tgt tok/s;     63 s elapsed
Epoch  4, teacher_force   951/ 4108; acc:  27.49; ppl:  77.07; 5861 src tok/s; 6479 tgt tok/s;     64 s elapsed
Epoch  4, teacher_force   961/ 4108; acc:  26.24; ppl:  83.17; 5473 src tok/s; 6443 tgt tok/s;     64 s elapsed
Epoch  4, teacher_force   971/ 4108; acc:  24.33; ppl:  97.14; 5614 src tok/s; 6685 tgt tok/s;     65 s elapsed
...

3. Discriminator D Pretrain

cd ../D_pretrain/

python train.py -epochs 20 -d_optim adam -d_learning_rate 1e-4 -data ../data/nli_persona -train_from PATH_TO_PRETRAINED_G -batch_size 32 -learning_rate_decay 0.99 -gpu 0

Similarly, replace PATH_TO_PRETRAINED_G with the G Pretrain model path.
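Conceptually, D is pretrained as a binary classifier that separates human responses from responses sampled by G, and the d: values in the log below track its accuracy. A hypothetical sketch of that objective (the G.sample and D interfaces are assumptions, not the repo's API):

# Hypothetical discriminator pretraining step (illustration only): D is trained to tell
# human responses (label 1) from responses sampled by G (label 0).
import torch
import torch.nn.functional as F

def d_pretrain_step(D, G, batch, d_optimizer):
    with torch.no_grad():
        fake = G.sample(batch.src, batch.persona)      # assumed sampling interface on G
    real_logits = D(batch.src, batch.tgt)              # (batch,) logits for human responses
    fake_logits = D(batch.src, fake)                   # (batch,) logits for sampled responses
    loss = (F.binary_cross_entropy_with_logits(real_logits, torch.ones_like(real_logits))
            + F.binary_cross_entropy_with_logits(fake_logits, torch.zeros_like(fake_logits)))
    d_optimizer.zero_grad()
    loss.backward()
    d_optimizer.step()
    # the "d:" value in the log is read here as D's classification accuracy
    acc = ((real_logits > 0).float().mean() + (fake_logits <= 0).float().mean()) / 2
    return loss.item(), acc.item()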

The acc of D will be displayed during training:

Loading train dataset from ../data/nli_persona.train.1.pt, number of examples: 131432
Epoch  5, d_step     1/ 4108; d: 0.49587
Epoch  5, d_step    11/ 4108; d: 0.51580
Epoch  5, d_step    21/ 4108; d: 0.49853
Epoch  5, d_step    31/ 4108; d: 0.55248
Epoch  5, d_step    41/ 4108; d: 0.55168
...

4. Reinforcement Training

cd ../reinforcement_train/

python train.py -epochs 30 -batch_size 32 -d_learning_rate 1e-4 -g_learning_rate 1e-4 -learning_rate_decay 0.9 -data ../data/nli_persona -train_from PATH_TO_PRETRAINED_D -gpu 0

Remember to replace PATH_TO_PRETRAINED_D with the D Pretrain model path.

Note that the -epochs value is global across all training stages, so keep this in mind if you tune it. For example, if the D Pretrain model was trained for 20 epochs in total, this Reinforcement Training stage runs 30 - 20 = 10 epochs.
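During this stage the training loop interleaves self_sample (policy-gradient) updates of G, teacher_force updates, and d_step updates of D, as the log below shows. A hypothetical REINFORCE-style sketch of the self-sample update, with an illustrative reward mixing D's naturalness score and the NLI model's consistency score (the exact reward design follows the paper, not this sketch; all interfaces here are assumptions):

# Hypothetical REINFORCE-style update for G (illustration only; interfaces such as
# sample_with_log_probs and entailment_prob are assumptions, not the repo's API).
import torch

def self_sample_step(G, D, nli_model, batch, g_optimizer):
    # sample: (batch, len) token ids; log_probs: (batch, len) per-token log-probabilities
    sample, log_probs = G.sample_with_log_probs(batch.src, batch.persona)
    with torch.no_grad():
        naturalness = torch.sigmoid(D(batch.src, sample))                 # in [0, 1]
        consistency = nli_model.entailment_prob(batch.persona, sample)    # in [0, 1]
        reward = naturalness * consistency                                # illustrative mix
    # policy gradient: increase the log-probability of samples with high reward
    loss = -(reward * log_probs.sum(dim=1)).mean()
    g_optimizer.zero_grad()
    loss.backward()
    g_optimizer.step()
    return loss.item()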

Loading train dataset from ../data/nli_persona.train.1.pt, number of examples: 131432
Epoch  7, self_sample     1/ 4108; acc:   2.12; ppl:   0.28; 298 src tok/s; 234 tgt tok/s;      2 s elapsed
Epoch  7, teacher_force    11/ 4108; acc:   3.32; ppl:   0.53; 2519 src tok/s; 2772 tgt tok/s;      3 s elapsed
Epoch  7, d_step    21/ 4108; d: 0.98896
Epoch  7, d_step    31/ 4108; d: 0.99906
Epoch  7, self_sample    41/ 4108; acc:   0.00; ppl:   0.27; 1769 src tok/s; 260 tgt tok/s;      7 s elapsed
Epoch  7, teacher_force    51/ 4108; acc:   2.83; ppl:   0.43; 2368 src tok/s; 2910 tgt tok/s;      9 s elapsed
Epoch  7, d_step    61/ 4108; d: 0.75311
Epoch  7, d_step    71/ 4108; d: 0.83919
Epoch  7, self_sample    81/ 4108; acc:   6.20; ppl:   0.33; 1791 src tok/s; 232 tgt tok/s;     12 s elapsed
...

5. Testing Trained Model

Now that we have a trained dialogue model, we can test it.

Still in ./reinforcement_train/, run:

python predict.py -model TRAINED_MODEL_PATH  -src ../data/src-val.txt -tgt ../data/tgt-val.txt -replace_unk -verbose -output ./results.txt -per ../data/per-val.txt -nli nli-val.txt -gpu 0
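predict.py writes the generated responses to ./results.txt, one per line. As an optional illustration (not part of the released code), you can compute simple diversity statistics such as distinct-1/2 over that file:

# Optional illustration: distinct-1 / distinct-2 over the generated responses
# in ./results.txt (one response per line); not part of the released code.
def distinct_n(lines, n):
    ngrams, total = set(), 0
    for line in lines:
        tokens = line.split()
        grams = list(zip(*[tokens[i:] for i in range(n)]))
        ngrams.update(grams)
        total += len(grams)
    return len(ngrams) / total if total else 0.0

with open("./results.txt", encoding="utf-8") as f:
    responses = [line.strip() for line in f if line.strip()]

print("distinct-1: %.4f" % distinct_n(responses, 1))
print("distinct-2: %.4f" % distinct_n(responses, 2))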

MISC

  • Initializing Model Seems Slow?

    This is a legacy issue of PyTorch < 0.4, not introduced by this project, and it does not affect training efficiency.

  • BibTex

     @article{Song_RCDG_2020,
     	title={Generating Persona Consistent Dialogues by Exploiting Natural Language Inference},
     	volume={34},
     	DOI={10.1609/aaai.v34i05.6417},
     	number={05},
     	journal={Proceedings of the AAAI Conference on Artificial Intelligence},
     	author={Song, Haoyu and Zhang, Wei-Nan and Hu, Jingwen and Liu, Ting},
     	year={2020},
     	month={Apr.},
     	pages={8878-8885}
     	}
    