AMRBART
An implementation of the ACL 2022 paper "Graph Pre-training for AMR Parsing and Generation". You can find our paper here (arXiv).
Requirements
- python 3.8
- pytorch 1.8
- transformers 4.8.2
- pytorch-lightning 1.5.0
- Tesla V100 or A100
We recommend using conda to manage virtual environments:
conda env update --name <env> --file requirements.yml
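For example, with an environment named amrbart (the name is just an example), create it and activate it before running any of the scripts below:
# create/update the environment from the provided requirements file
conda env update --name amrbart --file requirements.yml
# activate it before launching any script
conda activate amrbart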
We also provide a Docker image here.
Data Processing
You can download the AMR corpora from LDC.
We follow Spring to preprocess AMR graphs:
# 1. install spring
cd spring && pip install -e .
# 2. preprocess the data
bash run-preprocess.sh
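For reference, the AMR corpora pair each sentence with a graph in PENMAN notation. Below is a toy, hand-written example (not taken from the LDC releases; the exact file layout and naming expected by run-preprocess.sh follow the Spring setup linked above):
# toy AMR annotation in PENMAN notation, written to a file for illustration only
cat > toy-example.amr <<'EOF'
# ::snt The boy wants to go.
(w / want-01
   :ARG0 (b / boy)
   :ARG1 (g / go-02
      :ARG0 b))
EOF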
Pre-training
bash run-posttrain-bart-textinf-joint-denoising-6task-large-unified-V100.sh /path/to/BART/
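The script takes the path to a local BART checkpoint as its argument. As a sketch, assuming pre-training starts from facebook/bart-large on the Hugging Face hub (check the script if you start from the base model instead), one way to obtain it is:
# download the BART-large checkpoint from the Hugging Face hub (requires git-lfs)
git lfs install
git clone https://huggingface.co/facebook/bart-large /path/to/BART/
bash run-posttrain-bart-textinf-joint-denoising-6task-large-unified-V100.sh /path/to/BART/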
Fine-tuning
For AMR Parsing, run
bash finetune_AMRbart_amrparsing.sh /path/to/pre-trained/AMRBART/ gpu_id
For AMR-to-text Generation, run
bash finetune_AMRbart_amr2text.sh /path/to/pre-trained/AMRBART/ gpu_id
Evaluation
For AMR Parsing, run
bash eval_AMRbart_amrparsing.sh /path/to/fine-tuned/AMRBART/ gpu_id
For AMR-to-text Generation, run
bash eval_AMRbart_amr2text.sh /path/to/fine-tuned/AMRBART/ gpu_id
Inference on your own data
If you want to run our code on your own data, first convert it into the format shown here.
For AMR Parsing, run
bash inference_amr.sh /path/to/fine-tuned/AMRBART/ gpu_id
For AMR-to-text Generation, run
bash inference_text.sh /path/to/fine-tuned/AMRBART/ gpu_id
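In all of the scripts above, the second argument is the id of the GPU to use. For example (GPU id 0 is only an illustration; list your devices first if unsure):
# list available GPUs and their ids
nvidia-smi --list-gpus
# run parsing and generation on GPU 0
bash inference_amr.sh /path/to/fine-tuned/AMRBART/ 0
bash inference_text.sh /path/to/fine-tuned/AMRBART/ 0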
Pre-trained Models
Pre-trained AMRBART
Setting | Params | Checkpoint |
---|---|---|
AMRBART-base | 142M | model |
AMRBART-large | 409M | model |
Fine-tuned models on AMR-to-Text Generation
Setting | BLEU (tok.) | BLEU (detok.) | Checkpoint | Output |
---|---|---|---|---|
AMRBART-large (AMR2.0) | 49.8 | 45.7 | model | output |
AMRBART-large (AMR3.0) | 49.2 | 45.0 | model | output |
To obtain the tokenized BLEU score, you need to use the scorer we provide here; we use this script to ensure comparability with previous approaches.
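The detokenized score is a standard corpus BLEU over detokenized text. As a sketch, sacrebleu is a common tool for computing such a score (whether it reproduces the numbers above exactly depends on the tool and version used by the evaluation scripts; ref.txt and hyp.txt are placeholder file names):
pip install sacrebleu
# hypothesis is read from stdin, references are positional arguments
sacrebleu ref.txt < hyp.txt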
Fine-tuned models on AMR Parsing
Setting | Smatch | Checkpoint | Output |
---|---|---|---|
AMRBART-large (AMR2.0) | 85.4 | model | output |
AMRBART-large (AMR3.0) | 84.2 | model | output |
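To re-check the Smatch of the released outputs against gold graphs, one option is the smatch package (a sketch; we assume its smatch.py script ends up on your PATH after installation, and pred.amr / gold.amr are placeholder file names):
pip install smatch
# compare predicted AMR graphs against gold AMR graphs
smatch.py -f pred.amr gold.amr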
Todo
- Clean up the code
References
@inproceedings{bai-etal-2022-graph,
title = "Graph Pre-training for {AMR} Parsing and Generation",
author = "Bai, Xuefeng and
Chen, Yulong and
Zhang, Yue",
booktitle = "Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
month = may,
year = "2022",
address = "Dublin, Ireland",
publisher = "Association for Computational Linguistics",
url = "todo",
doi = "todo",
pages = "todo"
}