public repo for ESTER dataset and modeling (EMNLP'21)

Overview

Project / Paper Introduction

This is the project repo for our EMNLP'21 paper: https://arxiv.org/abs/2104.08350

Here, we provide brief descriptions of the final data and detailed instructions to reproduce results in our paper. For more details, please refer to the paper.

Data

The final data used in our experiments are saved in the ./data/ folder with train/dev/test splits. Most data fields are self-explanatory; a few notes:

  • question_event: this field is neither provided by annotators nor used in our experiments. We simply apply heuristic rules based on POS tags to extract possible events in the questions. Users are encouraged to try alternative tools such as semantic role labeling.
  • original_events and indices are the annotator-provided event triggers plus their indices in the context.
  • answer_texts and answer_indices (in train and dev) are the annotator-provided answers plus their indices in the context.
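As a quick sanity check after loading a split, one can count how many records carry each of the fields described above; since answer_texts and answer_indices are only available in train and dev, test records should lack them. The sketch below is a minimal illustration, not part of the released code: the helper name field_coverage and the toy records (including their values and value formats) are invented, and the real file layout under ./data/ may differ.

```python
from typing import Dict, Iterable, List

# Field names taken from the notes above.
DOCUMENTED_FIELDS = [
    "question_event",
    "original_events",
    "indices",
    "answer_texts",
    "answer_indices",
]


def field_coverage(records: Iterable[dict],
                   fields: List[str] = DOCUMENTED_FIELDS) -> Dict[str, int]:
    """Count how many records contain each documented field."""
    counts = {name: 0 for name in fields}
    for record in records:
        for name in fields:
            if name in record:
                counts[name] += 1
    return counts


# Toy records invented for illustration only; real entries will look different.
toy_records = [
    {  # shaped like a train/dev record: answers are present
        "question_event": "happened",
        "original_events": ["storm"],
        "indices": [3],
        "answer_texts": ["the storm hit"],
        "answer_indices": [[2, 5]],
    },
    {  # shaped like a test record: no answer fields
        "question_event": "caused",
        "original_events": ["flood"],
        "indices": [7],
    },
]

print(field_coverage(toy_records))
```

To apply this to the real data, replace toy_records with the list of records parsed from one of the split files.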

Please note: the evaluation scripts in Section II below only work for the dev set. For submission to our leaderboard (https://eventqa.github.io), please refer to Section III.

Models

I. Install packages.

We list the packages in our environment in the env.yml file for your reference. Below are a few key packages.

  • python=3.8.5
  • pytorch=1.6.0
  • transformers=3.1.0
  • cudatoolkit=10.1.243
  • apex=0.1

To install apex, you can either follow the official instructions (https://github.com/NVIDIA/apex) or install it via conda (https://anaconda.org/conda-forge/nvidia-apex).
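To confirm that an environment matches the key packages listed above, a small check along the following lines may help. It is only a sketch under assumptions: the keys are pip distribution names (conda calls the PyTorch package "pytorch", pip calls it "torch"), the helper names are hypothetical, and local build tags such as "+cu101" are ignored when comparing.

```python
from importlib import metadata
from typing import Dict, Optional, Tuple

# Versions copied from the list above. The keys are assumed pip distribution
# names; adjust them if your installation registers different names.
EXPECTED = {"torch": "1.6.0", "transformers": "3.1.0", "apex": "0.1"}


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def normalize(version: Optional[str]) -> Optional[str]:
    """Drop a local build tag such as '+cu101' before comparing."""
    return None if version is None else version.split("+")[0]


def find_mismatches(expected: Dict[str, str],
                    installed: Dict[str, Optional[str]]
                    ) -> Dict[str, Tuple[str, Optional[str]]]:
    """Map each differing package to (expected version, found version)."""
    mismatches = {}
    for name, want in expected.items():
        found = installed.get(name)
        if normalize(found) != want:
            mismatches[name] = (want, found)
    return mismatches


if __name__ == "__main__":
    current = {name: installed_version(name) for name in EXPECTED}
    for name, (want, found) in find_mismatches(EXPECTED, current).items():
        print(f"{name}: expected {want}, found {found}")
```

A different version does not necessarily break the scripts; the check only flags where an environment departs from the one we list.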

II. Replicate results in our paper.

1. Download trained models.

For reproducibility, we release all trained models.

  • Download link: https://drive.google.com/drive/folders/1bTCb4gBUCaNrw2chleD4RD9JP1_DOWjj?usp=sharing
  • We only provide models trained with the best hyper-parameters; each comes with three random seeds: 5, 7 and 23.
  • Create the directories ./output/, ./output/facebook/ and ./output/allenai/ to store the models.
  • Download BART models into ./output/facebook/.
  • Download UnifiedQA models into ./output/allenai/.
  • Save all other models directly in ./output/. This layout ensures that the evaluation scripts below run properly.
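The directory layout described above can be created in one step. The following is a minimal sketch; the function name make_output_dirs is hypothetical and the script is not part of the repo.

```python
from pathlib import Path
from typing import List


def make_output_dirs(root: str = ".") -> List[Path]:
    """Create output/, output/facebook/ and output/allenai/ under root."""
    base = Path(root) / "output"
    dirs = [base, base / "facebook", base / "allenai"]
    for directory in dirs:
        # parents=True creates missing parents; exist_ok=True makes reruns safe.
        directory.mkdir(parents=True, exist_ok=True)
    return dirs


if __name__ == "__main__":
    # Run from the repository root so the paths match ./output/...
    for created in make_output_dirs("."):
        print(created)
```

Running it a second time is harmless, since existing directories are left untouched.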

2. Zero-shot performance in Table 3.

Run bash ./code/eval_zero_shot.sh. Model options are provided in the script.

3. Generative QA fine-tuning performance in Table 3.

Run bash ./code/eval_ans_gen.sh. Make sure the following arguments are set correctly in the script.

  • Choose one of the model options provided in the script.
  • Set suffix="".
  • Set lrs and batch according to the chosen model. You can find these values in Appendix G of the paper.

4. Figure 6: UnifiedQA-large model trained with sub-samples.

Run bash ./code/eval_ans_gen.sh. Make sure the following arguments are set correctly in the script.

  • model="allenai/unifiedqa-t5-large"
  • suffix={"_500" | "_1000" | "_2000" | "_3000" | "_4000"}
  • Set lrs and batch accordingly. You can find this information in the name of the folder containing the trained model.

5. Table 4: 500 original annotations vs. completed

  • Run bash ./code/eval_ans_gen.sh with model="allenai/unifiedqa-t5-large" and suffix="_500original".
  • Run bash ./code/eval_ans_gen.sh with model="allenai/unifiedqa-t5-large" and suffix="_500completed".
  • Set lrs and batch according to the model folder names, as above.

6. Extractive QA Fine-tuning performances in Table 3.

Simply run bash ./code/eval_span_pred.sh as is.

7. Figure 8: Extractive QA Fine-tuning performances by changing positive weights.

  • Run bash ./code/eval_span_pred.sh.
  • Set pw, lrs and batch according to the model folder names, as above.

III. Submission to ESTER Leaderboard

  • Set model_dir to your target models
  • Run leaderboard.sh, which outputs pred_dev.json and pred_test.json under ./output
  • If you write your own code to output predictions, make sure they follow our original sample order.
  • Email pred_test.json to us in the format specified here: https://eventqa.github.io

Sample outputs (using one of our UnifiedQA-large models) are provided under ./output.
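If you produce predictions with your own code, one way to keep them in the original sample order is to key them by sample and then emit them by walking the data file's order. The sketch below assumes each sample has some unique identifier and that predictions are held in a dictionary keyed by it; both the identifier and the helper name order_predictions are hypothetical, and the actual pred_test.json format is the one produced by leaderboard.sh and specified on the leaderboard site.

```python
from typing import Dict, List, Sequence


def order_predictions(sample_ids: Sequence[str],
                      predictions: Dict[str, object]) -> List[object]:
    """Return predictions arranged in the order given by sample_ids.

    Raises KeyError if any sample has no prediction, so that gaps are
    caught before submission rather than silently shifting the order.
    """
    missing = [sid for sid in sample_ids if sid not in predictions]
    if missing:
        raise KeyError(
            f"no prediction for {len(missing)} sample(s), e.g. {missing[0]!r}"
        )
    return [predictions[sid] for sid in sample_ids]


# Toy identifiers and predictions, invented for illustration only.
original_order = ["q1", "q2", "q3"]
shuffled_predictions = {"q3": "third", "q1": "first", "q2": "second"}

print(order_predictions(original_order, shuffled_predictions))
```

Failing loudly on a missing prediction is a deliberate choice here: dropping a sample would misalign every prediction that follows it.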

IV. Model Training

We also provide the model training scripts below.

1. Generative QA: Fine-tuning in Table 3.

  • Run bash ./code/run_ans_generation.sh.
  • Model options and hyper-parameter search range are provided in the script.
  • We use the --fp16 argument to activate apex for GPU-memory-efficient training, except for UnifiedQA-t5-large (trained on an A100 GPU).

2. Figure 6: UnifiedQA-large model trained with sub-samples.

  • Run bash ./code/run_ans_gen_subsample.sh.
  • Set the sample_size variable in the script accordingly.

3. Table 4: 500 original annotations vs. completed

  • Run bash ./code/run_ans_gen.sh with model="allenai/unifiedqa-t5-large" and suffix="_500original".
  • Run bash ./code/run_ans_gen.sh with model="allenai/unifiedqa-t5-large" and suffix="_500completed".

4. Extractive QA Fine-tuning in Table 3 + Figure 8

Simply run bash ./code/run_span_pred.sh as is.

Owner
PlusLab
Peng's Language Understanding & Synthesis Lab at UCLA and USC