Changing the Mind of Transformers for Topically-Controllable Language Generation

Overview

Changing the Mind of Transformers for Topically-Controllable Language Generation

We will first introduce the how to run the IPython notebook demo by downloading our pretrained models. Then, we will introduce how to run our training and evaluation code.

Image of our model

Requirements and Setup

  • An Unix like OS with at least one GPU
  • To set up the python environment, run pip install -r requirements.txt. I use python 3.7 and pytorch 1.3.1, but I think other python 3 or pytorch > 1.0 versions might also be fine or just require very simple revision of the code. Our codes also use IPython notebook (for running the interactive demo), Spacy (for tokenization), nltk (for running evaluation and pplm), and gensim (for running the LDA baseline).
  • If your python path is not ~/anaconda3/bin/python, change your PY_PATH in the all the scripts in ./bin

Running IPython Notebook Demo

  • Download the pretrained models and dictionary file from here or following the instructions for training code below
  • Use IPython notebook to open ./src/evaluation/test_conditional_LM.ipynb
  • Run the 1st block after putting the models into the corresponding directory or revising the paths of TOPIC_MODEL_DIR, GENERATION_MODEL_DIR, DICT_FILE in the first block.
  • Modify the input context prompt in the 2nd block and run the block to see the generated topics
  • Choose some topics or specify some words and run the 3rd block to see the generated continuations that start with conditional x:. We will also generate the continuation without the condition that start with original x: as a baseline. The topical words that appear in the continuation will be highlighted.
  • You can append a genearted continuation to the 2nd block and repeat the process

Preprocessing Wikipedia for Training and Evaluation

  • First, download only the text from Wikipedia into json format using WikiExtractor
  • Check the path in ./bin/preprocessing_single_proc.sh and run the script. In the preprocessing, we will run Spacy tokenizer and GPT2 tokenizer, heuristically align their resulting tokens, split the corpus into training/validation/testing sets, and store the word indices into tensors.
  • Note that ./bin/preprocessing_single_proc.sh might be slow because it does not parallelize the tokenization processes. If you use job scheduler like slurm in your server, you might want to see the parallized scripts for tokenization in ./bin/old/tokenize_all_wiki_gpt2.sh and ./bin/old/tokenize_all_wiki.sh

Running Training

  • Prepare a word embedding file (e.g., we download the GloVe embedding from here)
  • Train our option generator using ./bin/train_option_generator.sh
  • Train our conditional text generator using ./bin/train_conditional_generator.sh (could train option generator and text generator at the same time)
  • You can start from original GPT2 model or start from our pretrained models. In our paper, we use learning rate = 1e-4. You can also try other values between 1e-4 and 1e-5.

Running Evaluation using Automatic Metrics

  • To evaluate/visualize conditional text generator, update the GENERATION_MODEL_DIR and TOPIC_MODEL_DIR using the model path from the previous step to run ./bin/train_conditional_generator.sh.
  • To evaluate/visualize option generator, update the GENERATION_MODEL_DIR and TOPIC_MODEL_DIR and run ./bin/eval_option_generator.sh. Set VISUALIZATION='Y' to visualize the topics given some randomly selected prompt. Set AUTO_EVAL_TOPICS='Y' to compare the quality of topics from different methods as we did in Table 1 in our EACL paper. Set AUTO_EVAL_GENRATION='Y' to evaluate the topics by the quality of text that is generated given these topics as we did in Table 6 in our paper appendix.
  • Our scores are stored at the end of each OUT_FILE file when AUTO_EVAL*='Y'. Our text generator is called "model condition", and our option generator is called NSD_topic in our code, where NSD stands for neural set decoder.
  • In our code, we also evaluate some globally clustering baselines such as LDA and kmeans. In order to test them, you can train a LDA model by following the steps here. You can also see an example code at ./src/preprocessing/tools/train_LDA_model.py. For kmeans clustering, we use ./src/preprocessing/tools/word_emb_global_clustering.py. If you do not want to test them, just remove LDA_org and global_centers from METHOD_LIST

Running Evaluation using Amazon Mechanical Turk

  • Download STSb dataset from here
  • Preprocessing STS using ./src/evaluation/filter_STS_for_GPT2.py and remove the duplication by sort sts-train_longer.csv | uniq > sts-train_longer_uniq.csv
  • Set OUTPUT_CSV_FOR_MTURK='Y' in ./bin/train_conditional_generator.sh and ./bin/eval_option_generator.sh to generate CSV files for MTurk tasks.
  • Our crowdsourcing templates and responses from workers could be found in ./MTurk_eval

Citation

If you use the code in a publication, please cite our paper.

Haw-Shiuan Chang, Jiaming Yuan, Mohit Iyyer, and Andrew McCallum,
“Changing the Mind of Transformers for Topically-Controllable Language Generation.” 
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2021
Owner
IESL
IESL
Vrcwatch - Supply the local time to VRChat as Avatar Parameters through OSC

English: README-EN.md VRCWatch VRCWatch は、VRChat 内のアバター向けに現在時刻を送信するためのプログラムです。 使

Kosaki Mezumona 17 Nov 30, 2022
TensorFlow (Python API) implementation of Neural Style

neural-style-tf This is a TensorFlow implementation of several techniques described in the papers: Image Style Transfer Using Convolutional Neural Net

Cameron 3.1k Jan 02, 2023
Accompanying code for the paper "A Kernel Test for Causal Association via Noise Contrastive Backdoor Adjustment".

#backdoor-HSIC (bd_HSIC) Accompanying code for the paper "A Kernel Test for Causal Association via Noise Contrastive Backdoor Adjustment". To generate

Robert Hu 0 Nov 25, 2021
Molecular Sets (MOSES): A Benchmarking Platform for Molecular Generation Models

Molecular Sets (MOSES): A benchmarking platform for molecular generation models Deep generative models are rapidly becoming popular for the discovery

MOSES 656 Dec 29, 2022
Implementation of "Large Steps in Inverse Rendering of Geometry"

Large Steps in Inverse Rendering of Geometry ACM Transactions on Graphics (Proceedings of SIGGRAPH Asia), December 2021. Baptiste Nicolet · Alec Jacob

RGL: Realistic Graphics Lab 274 Jan 06, 2023
Repo for my Tensorflow/Keras CV experiments. Mostly revolving around the Danbooru20xx dataset

SW-CV-ModelZoo Repo for my Tensorflow/Keras CV experiments. Mostly revolving around the Danbooru20xx dataset Framework: TF/Keras 2.7 Training SQLite D

20 Dec 27, 2022
The datasets and code of ACL 2021 paper "Aspect-Category-Opinion-Sentiment Quadruple Extraction with Implicit Aspects and Opinions".

Aspect-Category-Opinion-Sentiment (ACOS) Quadruple Extraction This repo contains the data sets and source code of our paper: Aspect-Category-Opinion-S

NUSTM 144 Jan 02, 2023
This is a Python Module For Encryption, Hashing And Other stuff

EnroCrypt This is a Python Module For Encryption, Hashing And Other Basic Stuff You Need, With Secure Encryption And Strong Salted Hashing You Can Do

5 Sep 15, 2022
Non-stationary GP package written from scratch in PyTorch

NSGP-Torch Examples gpytorch model with skgpytorch # Import packages import torch from regdata import NonStat2D from gpytorch.kernels import RBFKernel

Zeel B Patel 1 Mar 06, 2022
Codes for ACL-IJCNLP 2021 Paper "Zero-shot Fact Verification by Claim Generation"

Zero-shot-Fact-Verification-by-Claim-Generation This repository contains code and models for the paper: Zero-shot Fact Verification by Claim Generatio

Liangming Pan 47 Jan 01, 2023
Open source implementation of "A Self-Supervised Descriptor for Image Copy Detection" (SSCD).

A Self-Supervised Descriptor for Image Copy Detection (SSCD) This is the open-source codebase for "A Self-Supervised Descriptor for Image Copy Detecti

Meta Research 68 Jan 04, 2023
Image Restoration Using Swin Transformer for VapourSynth

SwinIR SwinIR function for VapourSynth, based on https://github.com/JingyunLiang/SwinIR. Dependencies NumPy PyTorch, preferably with CUDA. Note that t

Holy Wu 11 Jun 19, 2022
This is the code of using DQN to play Sekiro .

Update for using DQN to play sekiro 2021.2.2(English Version) This is the code of using DQN to play Sekiro . I am very glad to tell that I have writen

144 Dec 25, 2022
MAT: Mask-Aware Transformer for Large Hole Image Inpainting

MAT: Mask-Aware Transformer for Large Hole Image Inpainting (CVPR2022, Oral) Wenbo Li, Zhe Lin, Kun Zhou, Lu Qi, Yi Wang, Jiaya Jia [Paper] News This

254 Dec 29, 2022
Official code base for the poster "On the use of Cortical Magnification and Saccades as Biological Proxies for Data Augmentation" published in NeurIPS 2021 Workshop (SVRHM)

Self-Supervised Learning (SimCLR) with Biological Plausible Image Augmentations Official code base for the poster "On the use of Cortical Magnificatio

Binxu 8 Aug 17, 2022
This is an official implementation for "PlaneRecNet".

PlaneRecNet This is an official implementation for PlaneRecNet: A multi-task convolutional neural network provides instance segmentation for piece-wis

yaxu 50 Nov 17, 2022
Here is the diagnostic tool for BMVC 2021 paper Diagnosing Errors in Video Relation Detectors.

Here is the diagnostic tool for BMVC 2021 paper Diagnosing Errors in Video Relation Detectors. We provide a tiny ground truth file demo_gt.json, and t

Shuo Chen 3 Dec 26, 2022
Free Book about Deep-Learning approaches for Chess (like AlphaZero, Leela Chess Zero and Stockfish NNUE)

Free Book about Deep-Learning approaches for Chess (like AlphaZero, Leela Chess Zero and Stockfish NNUE)

Dominik Klein 189 Dec 21, 2022
Face recognition project by matching the features extracted using SIFT.

MV_FaceDetectionWithSIFT Face recognition project by matching the features extracted using SIFT. By : Aria Radmehr Professor : Ali Amiri Dependencies

Aria Radmehr 4 May 31, 2022
Tensorflow implementation of Human-Level Control through Deep Reinforcement Learning

Human-Level Control through Deep Reinforcement Learning Tensorflow implementation of Human-Level Control through Deep Reinforcement Learning. This imp

Devsisters Corp. 2.4k Dec 26, 2022