PERIN is a Permutation-Invariant Semantic Parser developed for MRP 2020

Last update: Jan 04, 2023

Overview

PERIN: Permutation-invariant Semantic Parsing

David Samuel & Milan Straka

Charles University
Faculty of Mathematics and Physics
Institute of Formal and Applied Linguistics

Paper
Pretrained models
Interactive demo on Google Colab

PERIN is a universal sentence-to-graph neural network architecture modeling semantic representation from input sequences.

The main characteristics of our approach are:

Permutation-invariant model: PERIN is, to the best of our knowledge, the first graph-based semantic parser that predicts all nodes at once in parallel and trains them with a permutation-invariant loss function.
Relative encoding: We present a substantial improvement of relative encoding of node labels, which allows the use of a richer set of encoding rules.
Universal architecture: Our work presents a general sentence-to-graph pipeline adaptable for specific frameworks only by adjusting pre-processing and post-processing steps.
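
To illustrate what "permutation-invariant" means for a loss over a set of predicted nodes, here is a minimal sketch. It is not PERIN's actual loss: the toy vectors, the squared-error cost and the function name `permutation_invariant_loss` are assumptions made for illustration, and the brute-force search over all orderings is only feasible for a handful of nodes (a practical implementation would use a polynomial-time assignment solver instead). See the paper for the loss that PERIN really uses.

```python
from itertools import permutations


def permutation_invariant_loss(predicted, target):
    """Return the smallest total squared error over all pairings of
    predictions with targets, together with the pairing that achieves it.

    Hypothetical helper for illustration only; brute force, O(n!).
    """
    if len(predicted) != len(target):
        raise ValueError("this sketch assumes equally many predictions and targets")

    def pair_cost(p, t):
        # Squared Euclidean distance between one prediction and one target.
        return sum((a - b) ** 2 for a, b in zip(p, t))

    best_cost, best_order = float("inf"), None
    for order in permutations(range(len(target))):
        # order[i] is the index of the target paired with prediction i.
        cost = sum(pair_cost(predicted[i], target[j]) for i, j in enumerate(order))
        if cost < best_cost:
            best_cost, best_order = cost, order
    return best_cost, best_order


predicted = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
target_a = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
target_b = [target_a[2], target_a[0], target_a[1]]  # same nodes, different order

loss_a, order_a = permutation_invariant_loss(predicted, target_a)
loss_b, order_b = permutation_invariant_loss(predicted, target_b)
print(loss_a, order_a)
print(loss_b, order_b)
```

Both calls give the same loss even though the gold nodes are listed in a different order. Only the pairing that achieves it changes, so the model is never penalized for the arbitrary order in which it emits nodes.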

Our model was ranked among the two winning systems in both the cross-framework and the cross-lingual tracks of MRP 2020 and significantly improved on the semantic parsing accuracy of the previous year's MRP 2019.

This repository provides the official PyTorch implementation of our paper "ÚFAL at MRP 2020: Permutation-invariant Semantic Parsing in PERIN" together with pretrained base models for all five frameworks from MRP 2020: AMR, DRG, EDS, PTG and UCCA.

How to run

🐾 Clone repository and install the Python requirements

git clone https://github.com/ufal/perin.git
cd perin

pip3 install -r requirements.txt
pip3 install git+https://github.com/cfmrp/mtool.git#egg=mtool

🐾 Download and pre-process the dataset

Download the treebanks into ${data_dir} and split the cross-lingual datasets into training and validation parts by running:

./scripts/split_dataset.sh "path_to_a_dataset.mrp"
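
For readers who want to see what such a split amounts to, the sketch below divides the lines of an .mrp file (which stores one JSON-encoded graph per line) into training and validation parts. It is not a reimplementation of scripts/split_dataset.sh: the function name `split_mrp_lines`, the 10% validation fraction, the shuffling and the seed are all assumptions, so use the provided script if you want to reproduce our setup.

```python
import json
import random


def split_mrp_lines(lines, validation_fraction=0.1, seed=0):
    """Split MRP graphs (one JSON object per line) into (train, validation).

    Hypothetical helper; the fraction and seed are illustrative defaults.
    """
    graphs = [line for line in lines if line.strip()]
    for line in graphs:
        json.loads(line)  # fail early if a line is not valid JSON

    shuffled = graphs[:]
    random.Random(seed).shuffle(shuffled)  # deterministic for a fixed seed

    n_validation = max(1, int(len(shuffled) * validation_fraction))
    return shuffled[n_validation:], shuffled[:n_validation]


# Toy input standing in for the contents of "path_to_a_dataset.mrp".
lines = [json.dumps({"id": str(i), "nodes": [], "edges": []}) for i in range(20)]
train, validation = split_mrp_lines(lines)
print(len(train), len(validation))
```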

Preprocess and cache the dataset (computing the relative encodings can take up to several hours):

python3 preprocess.py --config config/base_amr.yaml --data_directory ${data_dir}
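
As a rough intuition for what a relative encoding is, the sketch below expresses a node label as an edit rule relative to its surface token rather than as an absolute string. It covers only one simple kind of rule (cut some characters from the end, then append a suffix); the rule set used by PERIN is richer and is described in the paper. The function names `encode_relative` and `apply_rule` are hypothetical and do not correspond to functions in this repository.

```python
def encode_relative(token, label):
    """Encode `label` relative to `token` as (chars_to_cut, suffix_to_append)."""
    prefix = 0
    while prefix < min(len(token), len(label)) and token[prefix] == label[prefix]:
        prefix += 1
    return len(token) - prefix, label[prefix:]


def apply_rule(token, rule):
    """Recover a label from a token by applying a (cut, suffix) rule."""
    cut, suffix = rule
    return token[: len(token) - cut] + suffix


rule = encode_relative("looked", "look")
print(rule)                        # (2, '')
print(apply_rule("walked", rule))  # walk
print(encode_relative("wolves", "wolf"))  # (3, 'f')
```

The same rule that maps "looked" to "look" also maps "walked" to "walk", so a classifier over a small set of rules can produce labels it has never seen as whole strings.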

You should also download CzEngVallex if you are going to parse PTG:

curl -O https://lindat.mff.cuni.cz/repository/xmlui/bitstream/handle/11234/1-1512/czengvallex.zip
unzip czengvallex.zip
rm frames_pairs.xml czengvallex.zip

🐾 Train

To train a shared model for the English and Chinese AMR, run the following script. Other configurations are located in the config folder.

python3 train.py --config config/base_amr.yaml --data_directory ${data_dir} --save_checkpoints --log_wandb

Note that the companion file is needed only to provide the lemmatized forms, so it is also possible to train without it by setting the companion paths to None, although this will most likely reduce the accuracy of label prediction.

🐾 Inference

You can run the inference on the validation and test datasets by running:

python3 inference.py --checkpoint "path_to_pretrained_model.h5" --data_directory ${data_dir}

Citation

@inproceedings{Sam:Str:20,
  author = {Samuel, David and Straka, Milan},
  title = {{{\'U}FAL} at {MRP}~2020:
           {P}ermutation-Invariant Semantic Parsing in {PERIN}},
  booktitle = CONLL:20:U,
  address = L:CONLL:20,
  pages = {\pages{--}{53}{64}},
  year = 2020
}
