Skyformer: Remodel Self-Attention with Gaussian Kernel and Nystr\"om Method (NeurIPS 2021)

Overview

Skyformer

This repository is the official implementation of Skyformer: Remodel Self-Attention with Gaussian Kernel and Nystr"om Method (NeurIPS 2021).

image

Requirements

To install requirements in a conda environment:

conda create -n skyformer python=3.6
conda activate skyformer
pip install -r requirements.txt

Note: Specific requirements for data preprocessing are not included here.

Data Preparation

Processed files can be downloaded here, or processed with the following steps:

  1. Requirements
tensorboard>=2.3.0
tensorflow>=2.3.1
tensorflow-datasets>=4.0.1
  1. Download the TFDS files for pathfinder and then set _PATHFINER_TFDS_PATH to the unzipped directory (following https://github.com/google-research/long-range-arena/issues/11)
  2. Download lra_release.gz (7.7 GB).
  3. Unzip lra-release and put under ./data/.
cd data
wget https://storage.googleapis.com/long-range-arena/lra_release.gz
tar zxvf lra-release.gz 
  1. Create a directory lra_processed under ./data/.
mkdir lra_processed
cd ..

6.The directory structure would be (assuming the root dir is code)

./data/lra-processed
./data/long-range-arena-main
./data/lra_release
  1. Create train, dev, and test dataset pickle files for each task.
cd preprocess
python create_pathfinder.py
python create_listops.py
python create_retrieval.py
python create_text.py
python create_cifar10.py

Note: most source code comes from LRA repo.

Run

Modify the configuration in config.py and run

python main.py --mode train --attn skyformer --task lra-text
  • mode: train, eval
  • attn: softmax, nystrom, linformer, reformer, perfromer, informer, bigbird, kernelized, skyformer
  • task: lra-listops, lra-pathfinder, lra-retrieval, lra-text, lra-image

Reference

@inproceedings{Skyformer,
    title={Skyformer: Remodel Self-Attention with Gaussian Kernel and Nystr\"om Method}, 
    author={Yifan Chen and Qi Zeng and Heng Ji and Yun Yang},
    booktitle={NeurIPS},
    year={2021}
}
Owner
Qi Zeng
CS Ph.D. candidate at UIUC
Qi Zeng
This repository contains the code and models for the following paper.

DC-ShadowNet Introduction This is an implementation of the following paper DC-ShadowNet: Single-Image Hard and Soft Shadow Removal Using Unsupervised

AuAgCu 65 Dec 27, 2022
A Python Library for Graph Outlier Detection (Anomaly Detection)

PyGOD is a Python library for graph outlier detection (anomaly detection). This exciting yet challenging field has many key applications, e.g., detect

PyGOD Team 757 Jan 04, 2023
LRBoost is a scikit-learn compatible approach to performing linear residual based stacking/boosting.

LRBoost is a sckit-learn compatible package for linear residual boosting. LRBoost combines a linear estimator and a non-linear estimator to leverage t

Andrew Patton 5 Nov 23, 2022
Deep learning with dynamic computation graphs in TensorFlow

TensorFlow Fold TensorFlow Fold is a library for creating TensorFlow models that consume structured data, where the structure of the computation graph

1.8k Dec 28, 2022
ML model to classify between cats and dogs

Cats-and-dogs-classifier This is my first ML model which can classify between cats and dogs. Here the accuracy is around 75%, however , the accuracy c

Sharath V 4 Aug 20, 2021
Face Mesh is a face geometry solution that estimates 468 3D face landmarks in real-time even on mobile devices

Face-Mesh Face Mesh is a face geometry solution that estimates 468 3D face landmarks in real-time even on mobile devices. It employs machine learning

Farnam Javadi 9 Dec 21, 2022
Newt - a Gaussian process library in JAX.

Newt __ \/_ (' \`\ _\, \ \\/ /`\/\ \\ \ \\

AaltoML 0 Nov 02, 2021
Source code for CIKM 2021 paper for Relation-aware Heterogeneous Graph for User Profiling

RHGN Source code for CIKM 2021 paper for Relation-aware Heterogeneous Graph for User Profiling Dependencies torch==1.6.0 torchvision==0.7.0 dgl==0.7.1

Big Data and Multi-modal Computing Group, CRIPAC 6 Nov 29, 2022
Semi-Supervised Learning with Ladder Networks in Keras. Get 98% test accuracy on MNIST with just 100 labeled examples !

Semi-Supervised Learning with Ladder Networks in Keras This is an implementation of Ladder Network in Keras. Ladder network is a model for semi-superv

Divam Gupta 101 Sep 07, 2022
Survival analysis in Python

What is survival analysis and why should I learn it? Survival analysis was originally developed and applied heavily by the actuarial and medical commu

Cameron Davidson-Pilon 2k Jan 08, 2023
PyTorch implementation of MLP-Mixer

PyTorch implementation of MLP-Mixer MLP-Mixer: an all-MLP architecture composed of alternate token-mixing and channel-mixing operations. The token-mix

Duo Li 33 Nov 27, 2022
Official repo for AutoInt: Automatic Integration for Fast Neural Volume Rendering in CVPR 2021

AutoInt: Automatic Integration for Fast Neural Volume Rendering CVPR 2021 Project Page | Video | Paper PyTorch implementation of automatic integration

Stanford Computational Imaging Lab 149 Dec 22, 2022
Official code for 'Weakly-supervised Video Anomaly Detection with Robust Temporal Feature Magnitude Learning' [ICCV 2021]

RTFM This repo contains the Pytorch implementation of our paper: Weakly-supervised Video Anomaly Detection with Robust Temporal Feature Magnitude Lear

Yu Tian 242 Jan 08, 2023
the code for our CVPR 2021 paper Bilateral Grid Learning for Stereo Matching Network [BGNet]

BGNet This repository contains the code for our CVPR 2021 paper Bilateral Grid Learning for Stereo Matching Network [BGNet] Environment Python 3.6.* C

3DCV developer 87 Nov 29, 2022
PyTorch implementation of MulMON

MulMON This repository contains a PyTorch implementation of the paper: Learning Object-Centric Representations of Multi-object Scenes from Multiple Vi

NanboLi 16 Nov 03, 2022
PyTorch implementation of PSPNet

PSPNet with PyTorch Unofficial implementation of "Pyramid Scene Parsing Network" (https://arxiv.org/abs/1612.01105). This repository is just for caffe

Kazuto Nakashima 52 Nov 16, 2022
ZEBRA: Zero Evidence Biometric Recognition Assessment

ZEBRA: Zero Evidence Biometric Recognition Assessment license: LGPLv3 - please reference our paper version: 2020-06-11 author: Andreas Nautsch (EURECO

Voice Privacy Challenge 2 Dec 12, 2021
Source code for EquiDock: Independent SE(3)-Equivariant Models for End-to-End Rigid Protein Docking (ICLR 2022)

Source code for EquiDock: Independent SE(3)-Equivariant Models for End-to-End Rigid Protein Docking (ICLR 2022) Please cite "Independent SE(3)-Equivar

Octavian Ganea 154 Jan 02, 2023
OntoProtein: Protein Pretraining With Ontology Embedding

OntoProtein This is the implement of the paper "OntoProtein: Protein Pretraining With Ontology Embedding". OntoProtein is an effective method that mak

ZJUNLP 80 Dec 14, 2022
Deeply Supervised, Layer-wise Prediction-aware (DSLP) Transformer for Non-autoregressive Neural Machine Translation

Non-Autoregressive Translation with Layer-Wise Prediction and Deep Supervision Training Efficiency We show the training efficiency of our DSLP model b

Chenyang Huang 36 Oct 31, 2022