One implementation of the paper "DMRST: A Joint Framework for Document-Level Multilingual RST Discourse Segmentation and Parsing".

Last update: Dec 11, 2022


Introduction

This repository is an implementation of the paper "DMRST: A Joint Framework for Document-Level Multilingual RST Discourse Segmentation and Parsing".
It parses raw input text from scratch and returns both the EDU segmentation and the parsed tree structure.
The model supports both sentence-level and document-level RST discourse parsing.
This repo and the pre-trained model are for research use only.

Package Requirements

pytorch==1.7.1
transformers==4.8.2
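
To confirm that an environment matches the pinned versions above, a small check along these lines may help. It is a sketch, not part of the repository: the helper name `check_versions` is hypothetical, and it assumes the packages were installed with pip, where PyTorch is distributed under the name `torch`.

```python
from importlib import metadata

# Pinned versions from the requirements list above.
# Assumption: PyTorch is installed under its pip distribution name "torch".
PINNED = {"torch": "1.7.1", "transformers": "4.8.2"}


def check_versions(pinned=PINNED, lookup=metadata.version):
    """Return {package: installed_version_or_None} for every mismatch."""
    problems = {}
    for name, wanted in pinned.items():
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed at all
        # CUDA builds of torch may carry a local suffix such as "1.7.1+cu110".
        if found is None or found.split("+")[0] != wanted:
            problems[name] = found
    return problems
```

Calling `check_versions()` returns an empty dictionary when both packages match the pins. Other versions may still work; the pins are simply what the README lists.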

Supported Languages

We trained and evaluated the model with the multilingual collection of RST discourse treebanks, and it natively supports 6 languages: English, Portuguese, Spanish, German, Dutch, Basque. Interested users can also try other languages.

Data Format

[Input] InputSentence: The input document or sentence. The raw text is tokenized and encoded by the xlm-roberta-base language backbone. '||' denotes the EDU boundary positions.
- Although the report, || which has released || before the stock market opened, || didn't trigger the 190.58 point drop in the Dow Jones Industrial Average, || analysts said || it did play a role in the market's decline. ||
[Output] EDU_Breaks: The indices of the EDU boundary tokens, including the last word of the sentence.
- [2, 5, 10, 22, 24, 33]
[Output] tree_parsing_output: The model outputs of the discourse parsing tree follow this format.
- (1:Satellite=Contrast:4,5:Nucleus=span:6) (1:Nucleus=Same-Unit:3,4:Nucleus=Same-Unit:4) (5:Satellite=Attribution:5,6:Nucleus=span:6) (1:Satellite=span:1,2:Nucleus=Elaboration:3) (2:Nucleus=span:2,3:Satellite=Temporal:3)
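
The formats above can be read back into Python structures with a short helper. The sketch below is not part of the repository: the names `Span`, `Split`, `split_edus`, and `parse_tree` are hypothetical, and it assumes each parenthesized group describes one binary split as `start:Nuclearity=Relation:end` for the left and right child, with `start` and `end` being 1-based EDU indices. That reading is inferred from the example, not confirmed by the script.

```python
import re
from typing import List, NamedTuple


class Span(NamedTuple):
    start: int       # first EDU covered (assumed 1-based)
    end: int         # last EDU covered (inclusive)
    nuclearity: str  # "Nucleus" or "Satellite"
    relation: str    # e.g. "Contrast", "span"


class Split(NamedTuple):
    left: Span
    right: Span


# One group looks like "(1:Satellite=Contrast:4,5:Nucleus=span:6)".
_NODE = re.compile(
    r"\((\d+):(\w+)=([\w-]+):(\d+),(\d+):(\w+)=([\w-]+):(\d+)\)"
)


def split_edus(marked_text: str) -> List[str]:
    """Split text annotated with '||' boundary markers into EDU strings."""
    return [part.strip() for part in marked_text.split("||") if part.strip()]


def parse_tree(output: str) -> List[Split]:
    """Parse a tree_parsing_output string into a list of binary splits."""
    splits = []
    for m in _NODE.finditer(output):
        ls, ln, lr, le, rs, rn, rr, re_ = m.groups()
        splits.append(
            Split(Span(int(ls), int(le), ln, lr), Span(int(rs), int(re_), rn, rr))
        )
    return splits
```

For the example above, the first split would pair EDUs 1 to 4 (Satellite, Contrast) with EDUs 5 to 6 (Nucleus, span). Note that EDU_Breaks holds token indices produced by the model's tokenizer, so mapping them back to text needs the same tokenizer and is not attempted here.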

How to use it for parsing

Put the input text in the file ./data/text_for_inference.txt.
Run the script MUL_main_Infer.py to obtain the RST parsing result. See the script for details of the model output.
We recommend running the parser in a GPU-equipped environment.
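
The two steps above can also be driven from Python. The following is a minimal sketch, not the repository's own tooling: the helper names `write_inference_input` and `run_parser` are hypothetical, and it assumes the script takes no required command-line arguments and reads one paragraph per line from the input file. Check MUL_main_Infer.py to confirm both assumptions.

```python
import subprocess
import sys
from pathlib import Path
from typing import Iterable


def write_inference_input(
    paragraphs: Iterable[str],
    path: str = "./data/text_for_inference.txt",
) -> Path:
    """Write paragraphs to the inference file, one per line (assumed layout)."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    # Collapse internal whitespace so each paragraph stays on a single line.
    lines = [" ".join(p.split()) for p in paragraphs]
    lines = [line for line in lines if line]
    target.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return target


def run_parser(script: str = "MUL_main_Infer.py") -> None:
    """Run the inference script with the current Python interpreter."""
    # Assumption: the script needs no arguments and is run from the repo root.
    subprocess.run([sys.executable, script], check=True)
```

A typical call would be `write_inference_input(["First paragraph.", "Second paragraph."])` followed by `run_parser()` from the repository root.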

Citation

@article{liu2021dmrst,
  title={DMRST: A Joint Framework for Document-Level Multilingual RST Discourse Segmentation and Parsing},
  author={Liu, Zhengyuan and Shi, Ke and Chen, Nancy F},
  journal={arXiv preprint arXiv:2110.04518},
  year={2021}
}

@inproceedings{liu2020multilingual,
  title={Multilingual Neural RST Discourse Parsing},
  author={Liu, Zhengyuan and Shi, Ke and Chen, Nancy},
  booktitle={Proceedings of the 28th International Conference on Computational Linguistics},
  pages={6730--6738},
  year={2020}
}

Owner

seq-to-mind
