The code for two papers: Feedback Transformer and Expire-Span.

Last update: Dec 25, 2022

transformer-sequential

This repo contains the code for two papers:

Feedback Transformer
Expire-Span

The training code is designed for modeling long sequences with Transformer-like architectures.

Requirements

You will need a CUDA-enabled GPU to run the code.
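Before launching an experiment, it can help to confirm that a CUDA device is actually visible. The snippet below is a minimal sketch, not part of the repo: the helper name `cuda_status` is hypothetical, and it assumes that PyTorch is the framework installed by `requirements.txt`. If `torch` is not importable, it falls back to checking whether `nvidia-smi` is on the `PATH`.

```python
import importlib.util
import shutil


def cuda_status() -> str:
    """Return a short, human-readable description of CUDA availability.

    Hypothetical helper for illustration; it is not part of this repo.
    """
    # Assumption: PyTorch is the framework installed by requirements.txt.
    if importlib.util.find_spec("torch") is None:
        # Without torch, only check that the NVIDIA driver utility is on PATH.
        has_smi = shutil.which("nvidia-smi") is not None
        return "torch not installed; nvidia-smi " + ("found" if has_smi else "not found")

    import torch

    if torch.cuda.is_available():
        # Report the name of the first visible device.
        return f"CUDA available: {torch.cuda.get_device_name(0)}"
    return "torch installed, but no CUDA device is visible"


if __name__ == "__main__":
    print(cuda_status())
```

If the last message is printed on a machine that does have a GPU, the usual suspects are a CPU-only PyTorch build or a driver mismatch.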

Setup

Run the following:

pip install -r requirements.txt

Feedback Transformer

Introduced in Addressing Some Limitations of Transformers with Feedback Memory.

Running Experiments from the Paper

enwik8

Model	Params	Valid	Test
Feedback Transformer	77M	0.984	0.962

Numbers are bits per character (BPC); lower is better.

bash experiments/feedback/enwik8.sh
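The enwik8 results are reported in bits per character. If a training run reports cross-entropy in nats (the natural-log convention used by most deep learning loss functions), dividing by ln 2 gives the value in bits. The sketch below shows that conversion; the function names are hypothetical, and whether this repo's scripts log nats or bits is an assumption to check against their actual output.

```python
import math


def nats_to_bpc(loss_nats: float) -> float:
    """Convert a per-character cross-entropy loss in nats to bits per character."""
    # 1 nat = 1 / ln(2) bits.
    return loss_nats / math.log(2)


def bpc_to_perplexity(bpc: float) -> float:
    """Per-character perplexity implied by a bits-per-character value."""
    return 2.0 ** bpc


if __name__ == "__main__":
    # Illustrative input only: a loss of ln(2) nats is exactly one bit.
    print(nats_to_bpc(math.log(2)))
```

Under this conversion, the 0.962 test BPC in the table above corresponds to roughly 0.667 nats per character.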

Algorithmic

Model	3 Variable	5 Variable
Transformer	33.7	37.5
Feedback Transformer	99.1	92.6

Numbers are test accuracy (%).

bash experiments/feedback/algorithmic_3var.sh
bash experiments/feedback/algorithmic_5var.sh

Expire-Span

Introduced in Not All Memories are Created Equal: Learning to Expire.

Running Experiments from the Paper

enwik8

Model	Params	Valid	Test
Expire-Span 12L	38M	1.014	0.994

Numbers are bits per character (BPC); lower is better.

bash experiments/expire_span/enwik8.sh

Object Collision

Model	Maximum Span	Test Error (%)
Expire-Span	16k	52.2
Expire-Span	32k	36.7
Expire-Span	64k	26.7

bash experiments/expire_span/object_collision_16k.sh
bash experiments/expire_span/object_collision_32k.sh
bash experiments/expire_span/object_collision_64k.sh

License

The code is licensed under the CC-BY-NC license. See the LICENSE file for more details.

Owner

Meta Research
