Implementation of Deformable Attention in Pytorch from the paper "Vision Transformer with Deformable Attention"

Overview

Deformable Attention

Implementation of Deformable Attention from this paper in Pytorch, which appears to be an improvement over what was proposed in DETR. The relative positional embedding has also been modified for better extrapolation, using the Continuous Positional Embedding proposed in SwinV2.
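For intuition, here is a minimal 2D sketch of the core mechanism: a small convolutional offset network predicts offsets for a downsampled grid of reference points, and the feature map is bilinearly sampled at the shifted points (via F.grid_sample) to produce the deformed keys and values that attention then operates on. Everything below is illustrative only and does not reflect this package's internals; the class name DeformableSamplingSketch is made up for the example.

import torch
import torch.nn.functional as F
from torch import nn

class DeformableSamplingSketch(nn.Module):
    # Illustrative only: predict offsets from the feature map, then bilinearly
    # sample it at the shifted reference points to obtain deformed keys / values.
    def __init__(self, dim, downsample_factor = 4, offset_scale = 4):
        super().__init__()
        self.offset_scale = offset_scale
        self.to_offsets = nn.Sequential(
            nn.Conv2d(dim, dim, downsample_factor, stride = downsample_factor, groups = dim),
            nn.GELU(),
            nn.Conv2d(dim, 2, 1),  # one (dx, dy) pair per reference point
            nn.Tanh()              # bound raw offsets to [-1, 1] before scaling
        )

    def forward(self, x):
        b, c, h, w = x.shape
        offsets = self.to_offsets(x) * self.offset_scale       # (b, 2, h/r, w/r), in pixels
        oh, ow = offsets.shape[-2:]

        # uniform reference grid over the downsampled positions, in pixel coordinates
        ys, xs = torch.meshgrid(
            torch.linspace(0, h - 1, oh, device = x.device),
            torch.linspace(0, w - 1, ow, device = x.device),
            indexing = 'ij')
        grid = torch.stack((xs, ys), dim = -1) + offsets.permute(0, 2, 3, 1)  # (b, h/r, w/r, 2)

        # normalize to [-1, 1] and sample the deformed key / value feature map
        grid = 2 * grid / torch.tensor([w - 1, h - 1], device = x.device) - 1
        return F.grid_sample(x, grid, mode = 'bilinear', align_corners = True)

DeformableSamplingSketch(dim = 512)(torch.randn(1, 512, 64, 64)) # (1, 512, 16, 16)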

Install

$ pip install deformable-attention

Usage

import torch
from deformable_attention import DeformableAttention

attn = DeformableAttention(
    dim = 512,                   # feature dimensions
    dim_head = 64,               # dimension per head
    heads = 8,                   # attention heads
    dropout = 0.,                # dropout
    downsample_factor = 4,       # downsample factor (r in paper)
    offset_scale = 4,            # scale of offset, maximum offset
    offset_groups = None,        # number of offset groups, should be multiple of heads
    offset_kernel_size = 6,      # offset kernel size
)

x = torch.randn(1, 512, 64, 64)
attn(x) # (1, 512, 64, 64)

3d deformable attention

import torch
from deformable_attention import DeformableAttention3D

attn = DeformableAttention3D(
    dim = 512,                          # feature dimensions
    dim_head = 64,                      # dimension per head
    heads = 8,                          # attention heads
    dropout = 0.,                       # dropout
    downsample_factor = (2, 8, 8),      # downsample factor (r in paper)
    offset_scale = (2, 8, 8),           # scale of offset, maximum offset
    offset_kernel_size = (4, 10, 10),   # offset kernel size
)

x = torch.randn(1, 512, 10, 32, 32) # (batch, dimension, frames, height, width)
attn(x) # (1, 512, 10, 32, 32)

1d deformable attention for good measure

import torch
from deformable_attention import DeformableAttention1D

attn = DeformableAttention1D(
    dim = 128,
    downsample_factor = 4,
    offset_scale = 2,
    offset_kernel_size = 6
)

x = torch.randn(1, 128, 512)
attn(x) # (1, 128, 512)
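Each module is shape preserving (a (batch, dim, ...) feature map goes in and the same shape comes out), so it can be dropped into an ordinary residual block. A minimal sketch of such a wrapper, using the same hyperparameters as the 2D example above (the DeformableBlock class is hypothetical, not part of this package):

import torch
from torch import nn
from deformable_attention import DeformableAttention

class DeformableBlock(nn.Module):
    # Hypothetical wrapper: residual deformable attention followed by a
    # pointwise feedforward, both on channel-first feature maps.
    def __init__(self, dim):
        super().__init__()
        self.attn = DeformableAttention(
            dim = dim,
            dim_head = 64,
            heads = 8,
            downsample_factor = 4,
            offset_scale = 4,
            offset_kernel_size = 6
        )
        self.ff = nn.Sequential(
            nn.Conv2d(dim, dim * 4, 1),
            nn.GELU(),
            nn.Conv2d(dim * 4, dim, 1)
        )

    def forward(self, x):
        x = self.attn(x) + x
        return self.ff(x) + x

block = DeformableBlock(dim = 512)
block(torch.randn(1, 512, 64, 64)) # (1, 512, 64, 64)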

Citation

@misc{xia2022vision,
    title   = {Vision Transformer with Deformable Attention}, 
    author  = {Zhuofan Xia and Xuran Pan and Shiji Song and Li Erran Li and Gao Huang},
    year    = {2022},
    eprint  = {2201.00520},
    archivePrefix = {arXiv},
    primaryClass = {cs.CV}
}

@misc{liu2021swin,
    title   = {Swin Transformer V2: Scaling Up Capacity and Resolution},
    author  = {Ze Liu and Han Hu and Yutong Lin and Zhuliang Yao and Zhenda Xie and Yixuan Wei and Jia Ning and Yue Cao and Zheng Zhang and Li Dong and Furu Wei and Baining Guo},
    year    = {2021},
    eprint  = {2111.09883},
    archivePrefix = {arXiv},
    primaryClass = {cs.CV}
}

Comments
  • The relationship between 'dim' and 'inner_dim'

    Hi, I have a question about the DeformableAttention module.

    I calculated the output shapes of the forward pass step by step, and according to my calculation the code only works when 'dim' and 'inner_dim' are the same.

    Is there any reason why you implemented it this way?

    Best regards, Hankyu

    opened by hanq0212 4
  • TypeError: meshgrid() got an unexpected keyword argument 'indexing'

    @lucidrains, while trying to run the following:

    import torch
    from deformable_attention import DeformableAttention

    attn = DeformableAttention(
        dim = 512,                   # feature dimensions
        dim_head = 64,               # dimension per head
        heads = 8,                   # attention heads
        dropout = 0.,                # dropout
        downsample_factor = 4,       # downsample factor (r in paper)
        offset_scale = 4,            # scale of offset, maximum offset
        offset_groups = None,        # number of offset groups, should be multiple of heads
        offset_kernel_size = 6,      # offset kernel size
    )

    x = torch.randn(1, 512, 64, 64)
    attn(x)

    I got the error below from the line linked here. Kindly help.

    https://github.com/lucidrains/deformable-attention/blob/9f3c0ae35652ce54687e0db409921018bfca3310/deformable_attention/deformable_attention_2d.py#L26

    opened by ChidanandKumarKS 1
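
A note on the error above: torch.meshgrid only gained the indexing keyword argument in PyTorch 1.10, so this TypeError usually means an older PyTorch is installed, and upgrading resolves it. Alternatively, a small compatibility shim can paper over the difference (illustrative, not part of this package):

import torch

def meshgrid_ij(*tensors):
    # the indexing keyword was added to torch.meshgrid in PyTorch 1.10;
    # on older versions the default behaviour already matches indexing = 'ij'
    try:
        return torch.meshgrid(*tensors, indexing = 'ij')
    except TypeError:
        return torch.meshgrid(*tensors)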