My implementation of DeepMind's Perceiver

Overview

DeepMind Perceiver (in PyTorch)

Disclaimer: This is not official and I'm not affiliated with DeepMind.

My implementation of the Perceiver: General Perception with Iterative Attention. You can read more about the model on DeepMind's website.

I trained an MNIST model which you can find in models/mnist.pkl or by using perceiver.load_mnist_model(). It gets 96.02% on the test-data.

Getting started

To run this you need PyTorch installed:

pip3 install torch

From perceiver you can import Perceiver or PerceiverLogits.

Then you can use it as such (or look in examples.ipynb):

from perceiver import Perceiver

model = Perceiver(
    input_channels, # <- How many channels in the input? E.g. 3 for RGB.
    input_shape, # <- How big is the input in the different dimensions? E.g. (28, 28) for MNIST
    fourier_bands=4, # <- How many bands should the positional encoding have?
    latents=64, # <- How many latent vectors?
    d_model=32, # <- Model dimensionality. Every pixel/token/latent vector will have this size.
    heads=8, # <- How many heads in self-attention? Cross-attention always has 1 head.
    latent_blocks=6, # <- How much latent self-attention for each cross attention with the input?
    dropout=0.1, # <- Dropout
    layers=8, # <- This will become two unique layer-blocks: layer 1 and layer 2-8 (using weight sharing).
)

The above model outputs the latents after the final layer. If you want logits instead, use the following model:

from perceiver import PerceiverLogits

model = PerceiverLogits(
    input_channels, # <- How many channels in the input? E.g. 3 for RGB.
    input_shape, # <- How big is the input in the different dimensions? E.g. (28, 28) for MNIST
    output_features, # <- How many different classes? E.g. 10 for MNIST.
    fourier_bands=4, # <- How many bands should the positional encoding have?
    latents=64, # <- How many latent vectors?
    d_model=32, # <- Model dimensionality. Every pixel/token/latent vector will have this size.
    heads=8, # <- How many heads in self-attention? Cross-attention always has 1 head.
    latent_blocks=6, # <- How much latent self-attention for each cross attention with the input?
    dropout=0.1, # <- Dropout
    layers=8, # <- This will become two unique layer-blocks: layer 1 and layer 2-8 (using weight sharing).
)

To use my pre-trained MNIST model (not very good):

from perceiver import load_mnist_model

model = load_mnist_model()

TODO:

  • Positional embedding generalized to n dimensions (with fourier features)
  • Train other models (like CIFAR-100 or something not in the image domain)
  • Type indication
  • Unit tests for components of model
  • Package
Owner
Louis Arge
Experienced full-stack developer. Self-studying machine learning.
Louis Arge
Code to accompany our paper "Continual Learning Through Synaptic Intelligence" ICML 2017

Continual Learning Through Synaptic Intelligence This repository contains code to reproduce the key findings of our path integral approach to prevent

Ganguli Lab 82 Nov 03, 2022
[ECCV2020] Content-Consistent Matching for Domain Adaptive Semantic Segmentation

[ECCV20] Content-Consistent Matching for Domain Adaptive Semantic Segmentation This is a PyTorch implementation of CCM. News: GTA-4K list is available

Guangrui Li 88 Aug 25, 2022
Unofficial implementation of PatchCore anomaly detection

PatchCore anomaly detection Unofficial implementation of PatchCore(new SOTA) anomaly detection model Original Paper : Towards Total Recall in Industri

Changwoo Ha 268 Dec 22, 2022
Implementation of ConvMixer for "Patches Are All You Need? 🤷"

Patches Are All You Need? 🤷 This repository contains an implementation of ConvMixer for the ICLR 2022 submission "Patches Are All You Need?" by Asher

CMU Locus Lab 934 Jan 08, 2023
Using deep actor-critic model to learn best strategies in pair trading

Deep-Reinforcement-Learning-in-Stock-Trading Using deep actor-critic model to learn best strategies in pair trading Abstract Partially observed Markov

281 Dec 09, 2022
Learning embeddings for classification, retrieval and ranking.

StarSpace StarSpace is a general-purpose neural model for efficient learning of entity embeddings for solving a wide variety of problems: Learning wor

Facebook Research 3.8k Dec 22, 2022
Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion (CVPR'2021, Oral)

DSA^2 F: Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion (CVPR'2021, Oral) This repo is the official imp

如今我已剑指天涯 46 Dec 21, 2022
A Dataset for Direct Quotation Extraction and Attribution in News Articles.

DirectQuote - A Dataset for Direct Quotation Extraction and Attribution in News Articles DirectQuote is a corpus containing 19,760 paragraphs and 10,3

THUNLP-MT 9 Sep 23, 2022
Training a Resilient Q-Network against Observational Interference, Causal Inference Q-Networks

Obs-Causal-Q-Network AAAI 2022 - Training a Resilient Q-Network against Observational Interference Preprint | Slides | Colab Demo | Environment Setup

23 Nov 21, 2022
Code for "Unsupervised Layered Image Decomposition into Object Prototypes" paper

DTI-Sprites Pytorch implementation of "Unsupervised Layered Image Decomposition into Object Prototypes" paper Check out our paper and webpage for deta

40 Dec 22, 2022
Tensorflow-Project-Template - A best practice for tensorflow project template architecture.

Tensorflow Project Template A simple and well designed structure is essential for any Deep Learning project, so after a lot of practice and contributi

Mahmoud G. Salem 3.6k Dec 22, 2022
Bayesian inference for Permuton-induced Chinese Restaurant Process (NeurIPS2021).

Permuton-induced Chinese Restaurant Process Note: Currently only the Matlab version is available, but a Python version will be available soon! This is

NTT Communication Science Laboratories 3 Dec 17, 2022
Jax/Flax implementation of Variational-DiffWave.

jax-variational-diffwave Jax/Flax implementation of Variational-DiffWave. (Zhifeng Kong et al., 2020, Diederik P. Kingma et al., 2021.) DiffWave with

YoungJoong Kim 37 Dec 16, 2022
HeatNet is a python package that provides tools to build, train and evaluate neural networks designed to predict extreme heat wave events globally on daily to subseasonal timescales.

HeatNet HeatNet is a python package that provides tools to build, train and evaluate neural networks designed to predict extreme heat wave events glob

Google Research 6 Jul 07, 2022
Implementation of the SUMO (Slim U-Net trained on MODA) model

SUMO - Slim U-Net trained on MODA Implementation of the SUMO (Slim U-Net trained on MODA) model as described in: TODO: add reference to paper once ava

6 Nov 19, 2022
Ranger deep learning optimizer rewrite to use newest components

Ranger21 - integrating the latest deep learning components into a single optimizer Ranger deep learning optimizer rewrite to use newest components Ran

Less Wright 266 Dec 28, 2022
A Review of Deep Learning Techniques for Markerless Human Motion on Synthetic Datasets

HOW TO USE THIS PROJECT A Review of Deep Learning Techniques for Markerless Human Motion on Synthetic Datasets Based on DeepLabCut toolbox, we run wit

1 Jan 10, 2022
Code Repo for the ACL21 paper "Common Sense Beyond English: Evaluating and Improving Multilingual LMs for Commonsense Reasoning"

Common Sense Beyond English: Evaluating and Improving Multilingual LMs for Commonsense Reasoning This is the Github repository of our paper, "Common S

INK Lab @ USC 19 Nov 30, 2022
SegNet including indices pooling for Semantic Segmentation with tensorflow and keras

SegNet SegNet is a model of semantic segmentation based on Fully Comvolutional Network. This repository contains the implementation of learning and te

Yuta Kamikawa 172 Dec 23, 2022
HyperCube: Implicit Field Representations of Voxelized 3D Models

HyperCube: Implicit Field Representations of Voxelized 3D Models Authors: Magdalena Proszewska, Marcin Mazur, Tomasz Trzcinski, Przemysław Spurek [Pap

Magdalena Proszewska 3 Mar 09, 2022