Official repository for the paper "Going Beyond Linear Transformers with Recurrent Fast Weight Programmers"

Last update: Nov 15, 2022

Overview

Recurrent Fast Weight Programmers

This is the official repository containing the code we used to produce the experimental results reported in the paper:

Going Beyond Linear Transformers with Recurrent Fast Weight Programmers

algorithmic: directory for code execution and ListOps
language_modeling: directory for language modeling
reinforcement_learning: directory for reinforcement learning (RL)

Separate license files can be found under each directory.

General instructions

Please refer to the readme file in each directory for further instructions.

In all tasks, our custom CUDA kernels are compiled automatically. To avoid recompiling the code multiple times, we recommend specifying a directory in which to store the compiled code:

export TORCH_EXTENSIONS_DIR="/home/me/torch_extensions/lm"

Such a line is already included in the example scripts we provide. Please change the path to a safe directory of your choice.

Important: use a separate path for each task (here, one each for language modeling, code execution, ListOps, and RL).
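The per-task separation can be sketched as a small shell helper. This is only an illustration, not part of the provided scripts: the function name `set_extensions_dir`, the `TORCH_EXTENSIONS_BASE` variable, and the task names in the comments are assumptions made for this example. Only `TORCH_EXTENSIONS_DIR` itself is the variable read by PyTorch when it builds extensions.

```shell
# Hypothetical helper (not in the repository): point TORCH_EXTENSIONS_DIR
# at a task-specific subdirectory so that compiled kernels from different
# tasks never share a cache directory.
set_extensions_dir() {
    # $1: task name, e.g. lm, code_execution, listops, rl (illustrative names)
    # TORCH_EXTENSIONS_BASE is an assumed variable; it defaults to a
    # directory under $HOME.
    _ext_base="${TORCH_EXTENSIONS_BASE:-$HOME/torch_extensions}"
    mkdir -p "$_ext_base/$1" || return 1
    export TORCH_EXTENSIONS_DIR="$_ext_base/$1"
}

# Example (run before launching the language modeling scripts):
#   set_extensions_dir lm
```

Calling the helper with a different task name before each experiment keeps the caches apart, which has the same effect as editing the `export` line by hand in each example script.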

BibTeX

@article{irie2021going,
      title={Going Beyond Linear Transformers with Recurrent Fast Weight Programmers}, 
      author={Kazuki Irie and Imanol Schlag and R\'obert Csord\'as and J\"urgen Schmidhuber},
      journal={Preprint arXiv:2106.06295},
      year={2021}
}


Owner

IDSIA
