Does Pretraining for Summarization Require Knowledge Transfer?

This repository is the official implementation of the paper "Does Pretraining for Summarization Require Knowledge Transfer?", to appear in Findings of EMNLP 2021.
You can find the paper on arXiv here: https://arxiv.org/abs/2109.04953

Requirements

This code requires Python 3 (tested with version 3.6).

To install requirements, run:

pip install -r requirements.txt

Preparing finetuning datasets

To prepare a summarization dataset for finetuning, run the corresponding script in the finetuning_datasetgen folder. For example, to prepare the CNN/DailyMail dataset, run:

cd finetuning_datasetgen
python cnndm.py

Running a finetuning experiment

This section shows how to run the training, prediction, and evaluation steps of a finetuning experiment. We assume that you have downloaded the pretrained models into the pretrained_models folder from the provided Google Drive link (see pretrained_models/README.md). If you want to pretrain models yourself, see the "Pretraining models" section later in this README.

All models in our work are trained using AllenNLP config files, which are in .jsonnet format. To run a finetuning experiment, run:

# for t5-like models
./pipeline_t5.sh

# for pointer-generator models
./pipeline_pg.sh

For example, to finetune a T5 model on the CNN/DailyMail dataset, starting from a model pretrained on the ourtasks-nonsense pretraining dataset, run:

./pipeline_t5.sh finetuning_experiments/cnndm/t5-ourtasks-nonsense

Similarly, to finetune a randomly initialized pointer-generator model, run:

./pipeline_pg.sh finetuning_experiments/cnndm/pg-randominit
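
If you want to queue several experiments, the two commands above can be wrapped in a small Python helper. This is a minimal sketch and not part of the repository: the function names, the "t5"/"pg" keys, and the dry-run flag are assumptions made for illustration. Only the script names and the experiment folder paths come from this README.

```python
import subprocess

# Script names are taken from this README; the dictionary keys are illustrative.
PIPELINE_SCRIPTS = {"t5": "./pipeline_t5.sh", "pg": "./pipeline_pg.sh"}


def build_pipeline_command(model_type, experiment_dir):
    """Return the argument list for running one experiment folder."""
    if model_type not in PIPELINE_SCRIPTS:
        raise ValueError("unknown model type: %r" % (model_type,))
    return [PIPELINE_SCRIPTS[model_type], str(experiment_dir)]


def run_experiments(experiments, dry_run=True):
    """Run (or, with dry_run=True, only print) each experiment in order."""
    commands = [build_pipeline_command(m, d) for m, d in experiments]
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            # check=True stops the queue as soon as one pipeline fails.
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    run_experiments([
        ("t5", "finetuning_experiments/cnndm/t5-ourtasks-nonsense"),
        ("pg", "finetuning_experiments/cnndm/pg-randominit"),
    ])
```

With dry_run=True the helper only prints the commands, which is a cheap way to check the experiment paths before starting a long run from the repository root.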

The trained model and output files are written to a folder that the script creates:

model.tar.gz contains the trained (finetuned) model.

test_outputs.jsonl contains the outputs of the model on the test split.

test_genmetrics.json contains the ROUGE scores of the outputs.
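
To inspect these files from Python, something like the following may help. It is a hedged sketch: the helper names are hypothetical, and it makes no assumption about the keys inside either file, since this README does not document them. It only relies on test_genmetrics.json being a JSON document and test_outputs.jsonl holding one JSON object per line.

```python
import json
from pathlib import Path


def load_json(path):
    """Load a single JSON document, e.g. test_genmetrics.json."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def load_jsonl(path):
    """Load a JSON Lines file, e.g. test_outputs.jsonl, skipping blank lines."""
    records = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


def summarize_run(output_dir):
    """Collect the metrics and the number of test outputs for one run folder."""
    output_dir = Path(output_dir)
    metrics = load_json(output_dir / "test_genmetrics.json")
    outputs = load_jsonl(output_dir / "test_outputs.jsonl")
    return {"metrics": metrics, "num_outputs": len(outputs)}
```

Calling summarize_run on the folder created by the pipeline script returns the metrics as loaded, so you can print them and see which ROUGE keys are actually present.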

Creating pretraining datasets

We provide the nonsense pretraining datasets used in our work via Google Drive (see dataset_root/pretraining_datasets/README.md for instructions).

However, if you want to generate your own pretraining corpus, run:

cd pretraining_datasetgen
# generate a dataset using the ourtasks pretraining tasks
python ourtasks.py
# generate a dataset using the STEP pretraining tasks
python steptasks.py

These commands create pretraining datasets from nonsense text. If you want to create datasets starting from Wikipedia documents instead, see the two scripts, which explain how to do that by commenting/uncommenting two blocks of code.

Pretraining models

Although we provide pretrained model checkpoints via Google Drive, you can pretrain your own models using the corresponding pretraining config file. As an example, we provide a config file that pretrains on the ourtasks-nonsense dataset. Make sure that the pretraining dataset files exist (either created by you or downloaded from Google Drive) before running the pretraining command. Pretraining uses the same shell scripts as the finetuning experiments. For example, to pretrain a model on the ourtasks-nonsense dataset, run:

./pipeline_t5.sh pretraining_experiments/pretraining_t5_ourtasks_nonsense
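
Before starting a long pretraining run, it can be worth confirming that the dataset files are in place. The check below is a minimal sketch under stated assumptions: the function name is hypothetical, and the directory you pass in (for example, somewhere under dataset_root/pretraining_datasets) depends on a layout that this README does not spell out.

```python
from pathlib import Path


def dataset_files_present(dataset_dir):
    """Return True if dataset_dir exists and holds at least one file
    other than a README.md (searched recursively)."""
    dataset_dir = Path(dataset_dir)
    if not dataset_dir.is_dir():
        return False
    return any(
        path.is_file() and path.name != "README.md"
        for path in dataset_dir.rglob("*")
    )
```

A False result means the folder is missing or contains only the README, in which case the dataset still needs to be downloaded or generated.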
Owner
Approximately Correct Machine Intelligence (ACMI) Lab
Research on machine learning, its social impacts, and applications to healthcare. PI—@zackchase