Fastformer

Notes from the authors

PyTorch and Keras implementations of Fastformer. The Keras version includes only the core Fastformer attention module. The PyTorch version is written in the style of Hugging Face Transformers. The Jupyter notebooks contain quickstart code for text classification on AG's News (without pretrained word embeddings, for simplicity) and can be run directly.

In our experiments, we noticed that NOT all tasks need the FFNN, residual connections, layer normalization, or even position embeddings. For example, in news recommendation we find it better to use Fastformer directly, without layer normalization or position embeddings. In Ad CVR prediction, however, both position embeddings and layer normalization are needed.
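For readers who want to see the data flow of the core attention part before opening the notebooks, here is a minimal, dependency-free sketch of single-head additive attention as we understand it from the paper. It is not the code in this repository: the function names (`additive_pool`, `fastformer_attention`, `init_params`) and the parameter dictionary are hypothetical, multi-head splitting is omitted, and the FFNN, layer normalization and position embedding are deliberately left out, since they are optional per the notes above.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(x, weight):
    """Return x @ weight for a vector x (d_in) and a d_in x d_out matrix."""
    return [sum(xi * row[j] for xi, row in zip(x, weight))
            for j in range(len(weight[0]))]


def additive_pool(vectors, w):
    """Pool a sequence of d-dim vectors into one vector.

    Each vector v gets the scalar score w . v / sqrt(d); the softmax of
    these scores weights the sum of the vectors.
    """
    d = len(w)
    scores = [sum(wi * vi for wi, vi in zip(w, v)) / math.sqrt(d)
              for v in vectors]
    weights = softmax(scores)
    return [sum(a * v[j] for a, v in zip(weights, vectors)) for j in range(d)]


def fastformer_attention(xs, params):
    """Single-head sketch (assumed layout, not the repository's module)."""
    q = [linear(x, params["W_q"]) for x in xs]
    k = [linear(x, params["W_k"]) for x in xs]
    v = [linear(x, params["W_v"]) for x in xs]

    # 1. Summarize all queries into one global query vector.
    global_q = additive_pool(q, params["w_q"])
    # 2. Mix the global query into every key by element-wise product.
    p = [[gq * ki for gq, ki in zip(global_q, k_i)] for k_i in k]
    # 3. Summarize the mixed keys into one global key vector.
    global_k = additive_pool(p, params["w_k"])
    # 4. Mix the global key into every value, then project.
    u = [[gk * vi for gk, vi in zip(global_k, v_i)] for v_i in v]
    r = [linear(u_i, params["W_r"]) for u_i in u]
    # 5. Add the query back to each projected vector.
    return [[ri + qi for ri, qi in zip(r_i, q_i)] for r_i, q_i in zip(r, q)]


def init_params(d, seed=0):
    """Small random parameters, for illustration only."""
    rng = random.Random(seed)

    def mat():
        return [[rng.gauss(0, 0.1) for _ in range(d)] for _ in range(d)]

    def vec():
        return [rng.gauss(0, 0.1) for _ in range(d)]

    return {"W_q": mat(), "W_k": mat(), "W_v": mat(), "W_r": mat(),
            "w_q": vec(), "w_k": vec()}


params = init_params(d=4)
data_rng = random.Random(1)
tokens = [[data_rng.gauss(0, 1) for _ in range(4)] for _ in range(6)]
out = fastformer_attention(tokens, params)
print(len(out), len(out[0]))  # 6 4
```

Every step touches each token a constant number of times, so the cost grows linearly with sequence length. Whether to wrap this in layer normalization or add position embeddings is a per-task choice, as described above.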

Keras version: 2.2.4 (may not be compatible with higher versions)

TensorFlow version: from 1.12 to 1.15 (may be compatible with lower versions)

PyTorch version: 1.6.0 (may be compatible with higher/lower versions)
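If you are unsure whether your environment matches the versions listed above, a small check like the following may help. It is only a sketch: the `TESTED` table and the helpers `parse` and `in_tested_range` are hypothetical names introduced here, and a version outside the range is not necessarily broken, just untested. The installed version string can usually be read from the package's `__version__` attribute (for example `torch.__version__`).

```python
import re

# Versions listed above as tested (inclusive lower and upper bounds).
TESTED = {
    "keras": ("2.2.4", "2.2.4"),
    "tensorflow": ("1.12", "1.15"),
    "torch": ("1.6.0", "1.6.0"),
}


def parse(version):
    """Turn '1.15.2' or '1.6.0+cu101' into a tuple of ints."""
    core = version.split("+")[0]  # drop local build tags such as '+cu101'
    parts = []
    for piece in core.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def in_tested_range(package, installed):
    """True if `installed` falls inside the tested range for `package`.

    Bounds are compared only on the components they specify, so
    TensorFlow 1.15.2 counts as inside the 1.12 to 1.15 range.
    """
    low, high = (parse(bound) for bound in TESTED[package])
    version = parse(installed)
    return low <= version[:len(low)] and version[:len(high)] <= high


print(in_tested_range("tensorflow", "1.15.2"))  # True
print(in_tested_range("keras", "2.3.1"))        # False
print(in_tested_range("torch", "1.6.0+cu101"))  # True
```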

Citation

@article{wu2021fastformer,
  title={Fastformer: Additive Attention Can Be All You Need},
  author={Wu, Chuhan and Wu, Fangzhao and Qi, Tao and Huang, Yongfeng},
  journal={arXiv preprint arXiv:2108.09084},
  year={2021}
}
