Implementation of Uniformer, a simple attention and 3d convolutional net that achieved SOTA in a number of video classification tasks

Last update: Nov 24, 2022

Overview

Uniformer - Pytorch

Implementation of Uniformer, a simple attention and 3d convolutional net that achieved SOTA in a number of video classification tasks

Install

$ pip install uniformer-pytorch

Usage

Uniformer-S

import torch
from uniformer_pytorch import Uniformer

model = Uniformer(
    num_classes = 1000,                 # number of output classes
    dims = (64, 128, 256, 512),         # feature dimensions per stage (4 stages)
    depths = (3, 4, 8, 3),              # depth at each stage
    mhsa_types = ('l', 'l', 'g', 'g')   # aggregation type at each stage, 'l' stands for local, 'g' stands for global
)

video = torch.randn(1, 3, 8, 224, 224)  # (batch, channels, time, height, width)

logits = model(video) # (1, 1000)

Uniformer-B

import torch
from uniformer_pytorch import Uniformer

model = Uniformer(
    num_classes = 1000
    depths = (5, 8, 20, 7)
)

Citations

@inproceedings{anonymous2022uniformer,
    title   = {UniFormer: Unified Transformer for Efficient Spatial-Temporal Representation Learning},
    author  = {Anonymous},
    booktitle = {Submitted to The Tenth International Conference on Learning Representations },
    year    = {2022},
    url     = {https://openreview.net/forum?id=nBU_u6DLvoK},
    note    = {under review}
}

Comments

About the re-implementation of Layernorm

Hi @lucidrains ,

Thanks for the implementation of Uniformer. After checking the code, I found that you have re-implemented the layernorm operation in the following. https://github.com/lucidrains/uniformer-pytorch/blob/78397000e647fd4035a7fa2bede999d59d4c3ded/uniformer_pytorch/uniformer_pytorch.py#L13-L23

I wonder if there are any differences between your re-implementation and the official code provided by PyTorch (i.e., torch.nn.Layernorm(dim) ).

Looking forward to your reply. Thx.

opened by ChongjianGE 2
Pretrained weights

Did you pretrain the uniformer on any of the benchmark datasets? (Kinetics, something-something, ...) If you did, would you be able to share the state dicts?

Thanks in advance!

opened by Fritskee 3

A PyTorch implementation of Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks

SVHNClassifier-PyTorch A PyTorch implementation of Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks If

182 Jan 3, 2023

U^2-Net - Portrait matting This repository explores possibilities of using the original u^2-net model for portrait matting.

104 Nov 25, 2022

U-2-Net: U Square Net - Modified for paired image training of style transfer

U2-Net: U Square Net Modified for paired image training of style transfer This is an unofficial repo making use of the code which was made available b

43 Oct 3, 2022

Implementation of TimeSformer, a pure attention-based solution for video classification

TimeSformer - Pytorch Implementation of TimeSformer, a pure and simple attention-based solution for reaching SOTA on video classification.

602 Jan 3, 2023

A PyTorch implementation for V-Net: Fully Convolutional Neural Networks for Volumetric Medical Image Segmentation

A PyTorch implementation of V-Net Vnet is a PyTorch implementation of the paper V-Net: Fully Convolutional Neural Networks for Volumetric Medical Imag

606 Dec 21, 2022

An implementation of the research paper "Retina Blood Vessel Segmentation Using A U-Net Based Convolutional Neural Network"

Retina Blood Vessels Segmentation This is an implementation of the research paper "Retina Blood Vessel Segmentation Using A U-Net Based Convolutional

23 Aug 20, 2022

U-Net Implementation: Convolutional Networks for Biomedical Image Segmentation" using the Carvana Image Masking Dataset in PyTorch

U-Net Implementation By Christopher Ley This is my interpretation and implementation of the famous paper "U-Net: Convolutional Networks for Biomedical

1 Jan 6, 2022

RETRO-pytorch - Implementation of RETRO, Deepmind's Retrieval based Attention net, in Pytorch

RETRO - Pytorch (wip) Implementation of RETRO, Deepmind's Retrieval based Attent

556 Jan 4, 2023

Implementation of 🦩 Flamingo, state-of-the-art few-shot visual question answering attention net out of Deepmind, in Pytorch

🦩 Flamingo - Pytorch Implementation of Flamingo, state-of-the-art few-shot visual question answering attention net, in Pytorch. It will include the p

630 Dec 28, 2022

Implementation of Uniformer, a simple attention and 3d convolutional net that achieved SOTA in a number of video classification tasks

Related tags

Overview

Uniformer - Pytorch

Install

Usage

Citations

You might also like...

A PyTorch implementation of Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks

U^2-Net - Portrait matting This repository explores possibilities of using the original u^2-net model for portrait matting.

U-2-Net: U Square Net - Modified for paired image training of style transfer

Implementation of TimeSformer, a pure attention-based solution for video classification

A PyTorch implementation for V-Net: Fully Convolutional Neural Networks for Volumetric Medical Image Segmentation

An implementation of the research paper "Retina Blood Vessel Segmentation Using A U-Net Based Convolutional Neural Network"

U-Net Implementation: Convolutional Networks for Biomedical Image Segmentation" using the Carvana Image Masking Dataset in PyTorch

RETRO-pytorch - Implementation of RETRO, Deepmind's Retrieval based Attention net, in Pytorch

Implementation of 🦩 Flamingo, state-of-the-art few-shot visual question answering attention net out of Deepmind, in Pytorch

Comments

About the re-implementation of Layernorm

Pretrained weights

Releases(0.0.4)

0.0.4(Apr 22, 2022)

0.0.3(Nov 17, 2021)

0.0.2(Nov 16, 2021)

0.0.1(Nov 16, 2021)

Owner

Phil Wang

Half Instance Normalization Network for Image Restoration

Syllabus del curso IIC2115 - Programación como Herramienta para la Ingeniería 2022/I

SPTAG: A library for fast approximate nearest neighbor search

Official implementation of ACTION-Net: Multipath Excitation for Action Recognition (CVPR'21).

Implementation of CSRL from the AAAI2022 paper: Constraint Sampling Reinforcement Learning: Incorporating Expertise For Faster Learning

PyTorch implementation of DeepDream algorithm

Unified Pre-training for Self-Supervised Learning and Supervised Learning for ASR

HiddenMarkovModel implements hidden Markov models with Gaussian mixtures as distributions on top of TensorFlow

BASH - Biomechanical Animated Skinned Human

git《FSCE: Few-Shot Object Detection via Contrastive Proposal Encoding》(CVPR 2021) GitHub: [fig8]

Example-custom-ml-block-keras - Custom Keras ML block example for Edge Impulse

Multi-task Self-supervised Object Detection via Recycling of Bounding Box Annotations (CVPR, 2019)

Perform zero-order Hankel Transform for an 1D array (float or real valued).

[ICRA 2022] An opensource framework for cooperative detection. Official implementation for OPV2V.

RaceBERT -- A transformer based model to predict race and ethnicty from names

StarGAN - Official PyTorch Implementation (CVPR 2018)

this is a lite easy to use virtual keyboard project for anyone to use

This is a GUI interface which can process forest fire detection, smoke detection and fire segmentation

Pytorch implementation of Supporting Clustering with Contrastive Learning, NAACL 2021

Categorical Depth Distribution Network for Monocular 3D Object Detection