Official code repository of the paper Linear Transformers Are Secretly Fast Weight Programmers.

Overview

Linear Transformers Are Secretly Fast Weight Programmers

This repository contains the code accompanying the paper Linear Transformers Are Secretly Fast Weight Programmers which is published at ICML'21. It also contains the logs of all synthetic experiments.

Synthetic Experiments

Requirements

$ cat req.txt 
jupyter==1.0.0
pandas==1.0.1
seaborn==0.10.0
torch==1.6.0
matplotlib==3.1.3
numpy==1.17.2
pip3 install -r req.txt

Rerun Experiments

Logs are provided in the synthetic/logs folder. The files in that folder are a result of running the following commands:

Setting 1 (capacity):

python3 main.py --begin=20 --end=600 --step=20 --attn_name=softmax --update_rule=sum
python3 main.py --begin=20 --end=600 --step=20 --attn_name=linear --update_rule=sum
python3 main.py --begin=20 --end=600 --step=20 --attn_name=dpfp --attn_arg=1 --update_rule=sum
python3 main.py --begin=20 --end=600 --step=20 --attn_name=dpfp --attn_arg=2 --update_rule=sum

python3 main.py --begin=20 --end=600 --step=20 --attn_name=dpfp --attn_arg=3 --update_rule=sum
python3 main.py --begin=20 --end=600 --step=20 --attn_name=favor --attn_arg=64 --update_rule=sum
python3 main.py --begin=20 --end=600 --step=20 --attn_name=favor --attn_arg=128 --update_rule=sum
python3 main.py --begin=20 --end=600 --step=20 --attn_name=favor --attn_arg=512 --update_rule=sum

Setting 2 (update rule):

python3 main.py --begin=20 --end=200 --step=20 --attn_name=dpfp --attn_arg=1 --update_rule=sum --replace
python3 main.py --begin=20 --end=200 --step=20 --attn_name=dpfp --attn_arg=1 --update_rule=ours --replace
python3 main.py --begin=20 --end=200 --step=20 --attn_name=tanh --update_rule=fwm --replace
python3 main.py --begin=20 --end=200 --step=20 --attn_name=dpfp --attn_arg=1 --update_rule=fwm --replace

python3 main.py --begin=20 --end=200 --step=20 --attn_name=dpfp --attn_arg=2 --update_rule=ours --replace
python3 main.py --begin=20 --end=200 --step=20 --attn_name=linear --update_rule=ours --replace
python3 main.py --begin=20 --end=200 --step=20 --attn_name=favor --attn_arg=64 --update_rule=ours --replace
python3 main.py --begin=20 --end=200 --step=20 --attn_name=favor --attn_arg=128 --update_rule=ours --replace

Generate figures from the logs using the following notebooks:

synthetic/setting1_generate_figure.ipynb
synthetic/setting2_generate_figure.ipynb

Language Modelling & Machine Translation

The toolkit and scripts for language modeling experiments can be found at IDSIA/lmtool-fwms.

For machine translation experiments, we ported the different attention functions implemented in the language modeling toolkit to the multi-head attention implementation in FAIRSEQ.

Citation

@inproceedings{schlag2021linear,
      title={Linear Transformers Are Secretly Fast Weight Programmers}, 
      author={Imanol Schlag and Kazuki Irie and J\"urgen Schmidhuber},
      booktitle={Proc. Int. Conf. on Machine Learning (ICML)},
      address = {Virtual only},
      month = jul,
      year={2021}
}
Owner
Imanol Schlag
Imanol Schlag
A Python wrapper for simple offline real-time dictation (speech-to-text) and speaker-recognition using Vosk.

Simple-Vosk A Python wrapper for simple offline real-time dictation (speech-to-text) and speaker-recognition using Vosk. Check out the official Vosk G

2 Jun 19, 2022
A Python/Pytorch app for easily synthesising human voices

Voice Cloning App A Python/Pytorch app for easily synthesising human voices Documentation Discord Server Video guide Voice Sharing Hub FAQ's System Re

Ben Andrew 840 Jan 04, 2023
Code from the paper "High-Performance Brain-to-Text Communication via Handwriting"

Code from the paper "High-Performance Brain-to-Text Communication via Handwriting"

Francis R. Willett 305 Dec 22, 2022
Simple Python library, distributed via binary wheels with few direct dependencies, for easily using wav2vec 2.0 models for speech recognition

Wav2Vec2 STT Python Beta Software Simple Python library, distributed via binary wheels with few direct dependencies, for easily using wav2vec 2.0 mode

David Zurow 22 Dec 29, 2022
To classify the News into Real/Fake using Features from the Text Content of the article

Hoax-Detector Authenticity of news has now become a major problem. The Idea is to classify the News into Real/Fake using Features from the Text Conten

Aravindhan 1 Feb 09, 2022
Turn clang-tidy warnings and fixes to comments in your pull request

clang-tidy pull request comments A GitHub Action to post clang-tidy warnings and suggestions as review comments on your pull request. What platisd/cla

Dimitris Platis 30 Dec 13, 2022
Under the hood working of transformers, fine-tuning GPT-3 models, DeBERTa, vision models, and the start of Metaverse, using a variety of NLP platforms: Hugging Face, OpenAI API, Trax, and AllenNLP

Transformers-for-NLP-2nd-Edition @copyright 2022, Packt Publishing, Denis Rothman Contact me for any question you have on LinkedIn Get the book on Ama

Denis Rothman 150 Dec 23, 2022
Sinkhorn Transformer - Practical implementation of Sparse Sinkhorn Attention

Sinkhorn Transformer This is a reproduction of the work outlined in Sparse Sinkhorn Attention, with additional enhancements. It includes a parameteriz

Phil Wang 217 Nov 25, 2022
Addon for adding subtitle files to blender VSE as Text sequences. Using pysub2 python module.

Import Subtitles for Blender VSE Addon for adding subtitle files to blender VSE as Text sequences. Using pysub2 python module. Supported formats by py

4 Feb 27, 2022
The ability of computer software to identify words and phrases in spoken language and convert them to human-readable text

speech-recognition-py Speech recognition is the ability of computer software to identify words and phrases in spoken language and convert them to huma

Deepangshi 1 Apr 03, 2022
Text-Based zombie apocalyptic decision-making game in Python

Inspiration We shared university first year game coursework.[to gauge previous experience and start brainstorming] Adapted a particular nuclear fallou

Amin Sabbagh 2 Feb 17, 2022
Finetune gpt-2 in google colab

gpt-2-colab finetune gpt-2 in google colab sample result (117M) from retraining on A Tale of Two Cities by Charles Di

212 Jan 02, 2023
DELTA is a deep learning based natural language and speech processing platform.

DELTA - A DEep learning Language Technology plAtform What is DELTA? DELTA is a deep learning based end-to-end natural language and speech processing p

DELTA 1.5k Dec 26, 2022
HiFi DeepVariant + WhatsHap workflowHiFi DeepVariant + WhatsHap workflow

HiFi DeepVariant + WhatsHap workflow Workflow steps align HiFi reads to reference with pbmm2 call small variants with DeepVariant, using two-pass meth

William Rowell 2 May 14, 2022
A Streamlit web app that generates Rick and Morty stories using GPT2.

Rick and Morty Story Generator This project uses a pre-trained GPT2 model, which was fine-tuned on Rick and Morty transcripts, to generate new stories

₸ornike 33 Oct 13, 2022
Code Generation using a large neural network called GPT-J

CodeGenX is a Code Generation system powered by Artificial Intelligence! It is delivered to you in the form of a Visual Studio Code Extension and is Free and Open-source!

DeepGenX 389 Dec 31, 2022
[ICCV 2021] Counterfactual Attention Learning for Fine-Grained Visual Categorization and Re-identification

Counterfactual Attention Learning Created by Yongming Rao*, Guangyi Chen*, Jiwen Lu, Jie Zhou This repository contains PyTorch implementation for ICCV

Yongming Rao 89 Dec 18, 2022
DeepPavlov Tutorials

DeepPavlov tutorials DeepPavlov: Sentence Classification with Word Embeddings DeepPavlov: Transfer Learning with BERT. Classification, Tagging, QA, Ze

Neural Networks and Deep Learning lab, MIPT 28 Sep 13, 2022
Official PyTorch implementation of "Dual Path Learning for Domain Adaptation of Semantic Segmentation".

Dual Path Learning for Domain Adaptation of Semantic Segmentation Official PyTorch implementation of "Dual Path Learning for Domain Adaptation of Sema

27 Dec 22, 2022
A simple Streamlit App to classify swahili news into different categories.

Swahili News Classifier Streamlit App A simple app to classify swahili news into different categories. Installation Install all streamlit requirements

Davis David 4 May 01, 2022