The repository for the paper: Multilingual Translation via Grafting Pre-trained Language Models

Last update: Dec 14, 2022

Graformer

Graformer (named BridgeTransformer in the code) is a sequence-to-sequence model intended mainly for neural machine translation. It improves multilingual translation by grafting separately pre-trained language models: a pre-trained masked-language-model encoder (BERT) and a pre-trained decoder (GPT). The code is based on Fairseq.

Examples

To get started, use run/run.sh after minor modifications for your environment. The individual scripts correspond to the following tasks:

pre-train the multilingual BERT (mBERT):
    run_arnold_multilingual_masked_lm_6e6d.sh

pre-train the multilingual GPT (mGPT):
    run_arnold_multilingual_lm_6e6d.sh

train a Graformer:
    run_arnold_multilingual_graft_transformer_12e12d_ted.sh

run inference with a trained Graformer:
    run_arnold_multilingual_graft_inference_ted.sh
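
The scripts listed above suggest a natural order: pre-train the two language models, train Graformer on top of them, then run inference. The following is a minimal dry-run sketch of that order in Python, not part of the repository. It only builds the commands and does not execute anything. The `run` directory, the stage names, the stage order, and invoking each script with `bash` and no arguments are all assumptions; the actual scripts may need paths or environment settings edited first.

```python
from pathlib import PurePosixPath

# Assumed location of the scripts (the README only mentions run/run.sh).
RUN_DIR = PurePosixPath("run")

# Stage names are hypothetical labels; script names come from the list above.
# The order (pre-train, then graft, then infer) is an assumption.
STAGES = [
    ("pretrain_bert", "run_arnold_multilingual_masked_lm_6e6d.sh"),
    ("pretrain_gpt", "run_arnold_multilingual_lm_6e6d.sh"),
    ("train_graformer", "run_arnold_multilingual_graft_transformer_12e12d_ted.sh"),
    ("inference", "run_arnold_multilingual_graft_inference_ted.sh"),
]


def build_commands(stages=STAGES, run_dir=RUN_DIR, start_from=None):
    """Return one command (as an argument list) per stage, in order.

    Nothing is executed. If start_from is given, earlier stages are skipped,
    which is useful when released checkpoints replace the pre-training steps.
    """
    names = [name for name, _ in stages]
    if start_from is not None and start_from not in names:
        raise ValueError(f"unknown stage {start_from!r}; expected one of {names}")
    first = names.index(start_from) if start_from is not None else 0
    return [["bash", str(run_dir / script)] for _, script in stages[first:]]


if __name__ == "__main__":
    # Print the commands for review instead of running them.
    for command in build_commands():
        print(" ".join(command))
```

Calling `build_commands(start_from="train_graformer")` returns only the last two commands, for example when starting from already pre-trained models. Returning argument lists rather than shell strings keeps the sketch easy to pass to `subprocess.run` once the scripts have been adapted.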

Released Models

We release our pre-trained mBERT and mGPT, along with the trained Graformer model, here.

TensorFlow Version

We will provide a TensorFlow version in NeurST, a popular toolkit for sequence processing.

Citation

Please cite as:

@inproceedings{sun2021mulilingual,
    title = "Multilingual Translation via Grafting Pre-trained Language Models",
    author = "Sun, Zewei and Wang, Mingxuan and Li, Lei",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2021",
    year = "2021"
}

Contact

If you have any questions, please feel free to contact me: [email protected]
