Code for Generating Disentangled Arguments with Prompts: A Simple Event Extraction Framework that Works

Last update: Oct 29, 2022

Overview

GDAP

Environment

Python (verified: v3.8)
CUDA (verified: v11.1)
Packages (see requirements.txt)

Usage

Preprocessing

We follow dygiepp for data preprocessing.

text2et: Event Type Detection
ettext2tri: Trigger Extraction
etrttext2role: Argument Extraction

# data processed by dygiepp
data/text2target/dyiepp_ace2005_ettext2tri_subtype
├── event.schema
├── test.json
├── train.json
└── val.json

# data processed by data_convert.convert_text_to_target
data/text2target/dyiepp_ace2005_ettext2tri_subtype
├── event.schema
├── test.json
├── train.json
└── val.json

Useful commands:

python -m data_convert.convert_text_to_target # data/raw_data -> data/text2target
python convert_dyiepp_to_sentence.py data/raw_data/dyiepp_ace2005 # doc -> sentence, used in evaluation
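
After conversion, it can help to confirm that a dataset folder has the layout listed above before starting a training run. The following is a minimal sketch, not part of the repository: the helper name `missing_files` is hypothetical, and the file names are taken from the directory listing in this section.

```python
from pathlib import Path

# File names taken from the directory listing in this README.
EXPECTED_FILES = ("event.schema", "train.json", "val.json", "test.json")


def missing_files(dataset_dir):
    """Return the expected files that are absent from dataset_dir (hypothetical helper)."""
    root = Path(dataset_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    # Dataset folder name as used in the training examples of this README.
    target = "data/text2target/dyiepp_ace2005_ettext2tri_subtype"
    missing = missing_files(target)
    print("OK" if not missing else f"Missing in {target}: {missing}")
```

An empty list means all four files are present; the check says nothing about whether their contents are valid.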

Training

Relevant scripts:

run_seq2seq.py: Python entry point, adapted from transformers/examples/seq2seq/run_seq2seq.py.
run_seq2seq_span.bash: Model training script; its output is written to a log file.

Example (see the above two files for more details):

# ACE05 event type detection with t5-base; metric_format is eval_trigger-F1
bash run_seq2seq_span.bash --data=dyiepp_ace2005_text2et_subtype --model=t5-base --format=et --metric_format=eval_trigger-F1

# ACE05 trigger extraction with t5-base
bash run_seq2seq_span.bash --data=dyiepp_ace2005_ettext2tri_subtype --model=t5-base --format=tri --metric_format=eval_trigger-F1

# ACE05 argument extraction with t5-base
bash run_seq2seq_span.bash --data=dyiepp_ace2005_etrttext2role_subtype --model=t5-base --format=role --metric_format=eval_role-F1
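
The three commands above differ only in the dataset name, the format, and the metric. A small driver can keep those combinations in one place. This is a sketch under assumptions: the stage labels (`event_type`, `trigger`, `argument`) and the function `build_command` are hypothetical names, and only the flags shown in the examples above are used.

```python
import shlex

# (dataset, format, metric_format) triples copied from the example commands.
# The dictionary keys are hypothetical labels, not names used by the repository.
STAGES = {
    "event_type": ("dyiepp_ace2005_text2et_subtype", "et", "eval_trigger-F1"),
    "trigger": ("dyiepp_ace2005_ettext2tri_subtype", "tri", "eval_trigger-F1"),
    "argument": ("dyiepp_ace2005_etrttext2role_subtype", "role", "eval_role-F1"),
}


def build_command(stage, model="t5-base"):
    """Return the argument list for training one stage with run_seq2seq_span.bash."""
    data, fmt, metric = STAGES[stage]
    return [
        "bash", "run_seq2seq_span.bash",
        f"--data={data}", f"--model={model}",
        f"--format={fmt}", f"--metric_format={metric}",
    ]


if __name__ == "__main__":
    for stage in STAGES:
        # Print only; pass the list to subprocess.run(..., check=True) to execute.
        print(shlex.join(build_command(stage)))
```

The script only prints the commands, so it can be used to check them before launching any training run.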

Trained models are saved in the models/ folder.

Evaluation

run_tri_predict.bash: trigger extraction evaluation and inference script.
run_arg_predict.bash: argument extraction evaluation and inference script.
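
The metric names used during training (eval_trigger-F1, eval_role-F1) refer to F1 scores. For readers unfamiliar with the metric, here is a generic micro-averaged F1 over sets of predicted and gold tuples. It is an illustrative sketch only: the function `micro_f1` and the toy tuples are assumptions, and the scoring in the evaluation scripts above may differ in matching rules and tuple contents.

```python
def micro_f1(gold, pred):
    """Micro-averaged precision, recall, and F1 over per-sentence sets of tuples."""
    n_gold = sum(len(set(g)) for g in gold)
    n_pred = sum(len(set(p)) for p in pred)
    # A prediction counts as correct only if it matches a gold tuple exactly.
    n_correct = sum(len(set(g) & set(p)) for g, p in zip(gold, pred))
    precision = n_correct / n_pred if n_pred else 0.0
    recall = n_correct / n_gold if n_gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Toy (event type, trigger word) pairs, invented for illustration only.
    gold = [[("Attack", "fired"), ("Die", "killed")], [("Transport", "went")]]
    pred = [[("Attack", "fired")], [("Transport", "arrived")]]
    print(micro_f1(gold, pred))
```

Counts are pooled across sentences before computing precision and recall, which is what makes the average "micro" rather than a mean of per-sentence scores.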

Todo

We aim to expand the codebase to a wider range of tasks, including:

Named Entity Recognition
Keyword Generation
Event Relation Identification

If you find this repo helpful...

Please give us a ⭐ and cite our paper as

@misc{si2021-GDAP,
      title={Generating Disentangled Arguments with Prompts: A Simple Event Extraction Framework that Works}, 
      author={Jinghui Si and Xutan Peng and Chen Li and Haotian Xu and Jianxin Li},
      year={2021},
      eprint={2110.04525},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

This project borrows code from Text2Event.
