Code for coreference-aware machine reading comprehension

Last update: Sep 29, 2022

Overview

Data and code for the paper "Tracing Origins: Coreference-aware Machine Reading Comprehension", published at ACL 2022.

Dataset

The repository contains three folders, one for each model described in the paper:

Coref_additive_spacy: the Coref_additive_attention model
Coref_dgl_spacy: the GNN model
Coref_multiplication_spacy: the Coref_multiplication_attention model

Each folder contains the training set and the dev set under its quoref folder.

Each sample contains the following fields:

context: the paragraph text
context_id: the unique identifier of the context
qas: a group of questions
    question: the question text
    id: the unique identifier of the question
    answers: a group of answers to one question
        text: the answer text
        answer_start: the start position of one answer
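
The field layout above can be walked with a few lines of Python. The following is a minimal sketch, not code from the repository: the helper names `iter_answers` and `answer_matches_context` and the toy sample are made up for illustration, and it assumes that `answer_start` is a character offset into `context` (the SQuAD convention), which the field list does not state explicitly.

```python
from typing import Dict, Iterator, Tuple


def iter_answers(sample: Dict) -> Iterator[Tuple[str, str, str, int]]:
    """Yield (question_id, question, answer_text, answer_start) for one sample."""
    for qa in sample["qas"]:
        for answer in qa["answers"]:
            yield qa["id"], qa["question"], answer["text"], answer["answer_start"]


def answer_matches_context(sample: Dict, answer_text: str, answer_start: int) -> bool:
    """Check that the answer text occurs in the context at the given offset.

    Assumption: answer_start is a character offset into sample["context"].
    """
    end = answer_start + len(answer_text)
    return sample["context"][answer_start:end] == answer_text


# Toy sample invented for illustration; it only mirrors the field names above.
sample = {
    "context": "Anna met Tom. She gave him a book.",
    "context_id": "ctx-0",
    "qas": [
        {
            "question": "Who gave Tom a book?",
            "id": "q-0",
            "answers": [{"text": "Anna", "answer_start": 0}],
        }
    ],
}

rows = list(iter_answers(sample))
for question_id, question, text, start in rows:
    print(question_id, question, text, start, answer_matches_context(sample, text, start))
```

A check like `answer_matches_context` can help spot misaligned offsets before training, provided the character-offset assumption holds for the files in the quoref folders.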

Models

To use our trained model, download it from Google Drive.

Training

python run_quoref.py \
  --train_file "quoref/train.json" \
  --predict_file "quoref/dev.json" \
  --model_type "roberta_multi" \
  --model_name_or_path "roberta-large" \
  --output_dir "out" \
  --do_train \
  --do_eval \
  --eval_all_checkpoints \
  --learning_rate 1e-5 \
  --num_train_epochs 6 \
  --overwrite_output_dir \
  --per_gpu_train_batch_size 4 \
  --save_steps 6000 \
  --coref_weight 0.4

Note

There is an open issue regarding the compatibility between NeuralCoref and spaCy 3.0. If you intend to use the latest spaCy models, please watch the issue.
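
One way to guard against this at runtime is to check the installed spaCy version before loading NeuralCoref. The sketch below is only an illustration: the helper names are hypothetical, and the rule "major version below 3" is an assumption derived from the note above. NeuralCoref's actual version requirements may be narrower, so consult its documentation.

```python
def spacy_major_version(version: str) -> int:
    """Return the major component of a version string such as '2.3.5'."""
    return int(version.split(".")[0])


def supports_neuralcoref(version: str) -> bool:
    """Assumption: only spaCy releases before 3.0 are treated as compatible."""
    return spacy_major_version(version) < 3


for candidate in ("2.3.5", "3.0.0"):
    print(candidate, supports_neuralcoref(candidate))
```

In a real environment the check could be called as `supports_neuralcoref(spacy.__version__)` after `import spacy`.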

Cite

If you extend or use this work, please cite the paper where it was introduced:

@article{Huang2021TracingOC,
  title={Tracing Origins: Coref-aware Machine Reading Comprehension},
  author={Baorong Huang and Zhuosheng Zhang and Hai Zhao},
  journal={ArXiv},
  year={2021},
  volume={abs/2110.07961}
}
