Awesome-Attention-Mechanism-in-cv

Introduction
Attention Mechanism
Plug and Play Module
Evaluation
Paper List
Contribute

Introduction

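PyTorch implementations of a variety of attention mechanisms used in network design for computer vision, along with a collection of plug-and-play modules. Because time and expertise are limited, many modules are probably missing. If you have any suggestions or improvements, please open an issue or submit a PR.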

Attention Mechanism

Paper	Venue	Link	Main Idea	Blog
Global Second-order Pooling Convolutional Networks	CVPR19	GSoPNet	Combines higher-order statistics with attention in the intermediate layers of the network
Neural Architecture Search for Lightweight Non-Local Networks	CVPR20	AutoNL	NAS+LightNL
Squeeze and Excitation Network	CVPR18	SENet	The classic channel attention	zhihu
Selective Kernel Network	CVPR19	SKNet	SE + dynamic selection	zhihu
Convolutional Block Attention Module	ECCV18	CBAM	Spatial + channel attention in series	zhihu
BottleNeck Attention Module	BMVC18	BAM	Spatial + channel attention in parallel	zhihu
Concurrent Spatial and Channel ‘Squeeze & Excitation’ in Fully Convolutional Networks	MICCAI18	scSE	Spatial + channel attention in parallel	zhihu
Non-local Neural Networks	CVPR18	Non-Local(NL)	self-attention	zhihu
GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond	ICCVW19	GCNet	Improves on NL	zhihu
CCNet: Criss-Cross Attention for Semantic Segmentation	ICCV19	CCNet	Improves on NL
SA-Net: Shuffle Attention for Deep Convolutional Neural Networks	ICASSP 21	SANet	SGE + channel shuffle	zhihu
ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks	CVPR20	ECANet	Improvement on SE
Spatial Group-wise Enhance: Improving Semantic Feature Learning in Convolutional Networks	CoRR19	SGENet	Group + spatial + channel
FcaNet: Frequency Channel Attention Networks	CoRR20	FcaNet	SE in the frequency domain
$A^2\text{-}Nets$: Double Attention Networks	NeurIPS18	DANet	Applies the NL idea to the spatial and channel dimensions
Asymmetric Non-local Neural Networks for Semantic Segmentation	ICCV19	APNB	SPP + NL
Efficient Attention: Attention with Linear Complexities	CoRR18	EfficientAttention	NL with reduced computation
Image Restoration via Residual Non-local Attention Networks	ICLR19	RNAN
Exploring Self-attention for Image Recognition	CVPR20	SAN	Theoretically well grounded and simple to implement
An Empirical Study of Spatial Attention Mechanisms in Deep Networks	ICCV19	None	MSRA survey of self-attention
Object-Contextual Representations for Semantic Segmentation	ECCV20	OCRNet	Complex interaction mechanism that works well in practice
IAUnet: Global Context-Aware Feature Learning for Person Re-Identification	TNNLS20	IAUNet	Introduces temporal information
ResNeSt: Split-Attention Networks	CoRR20	ResNeSt	SK+ResNeXt
Gather-Excite: Exploiting Feature Context in Convolutional Neural Networks	NeurIPS18	GENet	Follow-up to SE
Improving Convolutional Networks with Self-calibrated Convolutions	CVPR20	SCNet	Self-calibrated convolution
Rotate to Attend: Convolutional Triplet Attention Module	WACV21	TripletAttention	Pairwise interaction among the C, H, and W dimensions
Dual Attention Network for Scene Segmentation	CVPR19	DANet	self-attention
Relation-Aware Global Attention for Person Re-identification	CVPR20	RGA	For person re-identification
Attentional Feature Fusion	WACV21	AFF	Attention-based feature fusion
An Attentive Survey of Attention Models	CoRR19	None	Covers attention mechanisms in NLP, CV, recommender systems, and other areas
Stand-Alone Self-Attention in Vision Models	NeurIPS19	FullAttention	Replaces all convolutions with self-attention
BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation	ECCV18	BiSeNet	FPN-like feature fusion	zhihu
DCANet: Learning Connected Attentions for Convolutional Neural Networks	CoRR20	DCANet	Improves information flow between attention modules
An Empirical Study of Spatial Attention Mechanisms in Deep Networks	ICCV19	None	Targeted analysis of spatial attention
Look Closer to See Better: Recurrent Attention Convolutional Neural Network for Fine-grained Image Recognition	CVPR17 Oral	RA-CNN	Fine-grained recognition
Guided Attention Network for Object Detection and Counting on Drones	ACM MM20	GANet	Targets object detection
Attention Augmented Convolutional Networks	ICCV19	AANet	Multi-head attention + additional feature maps
Global Self-Attention Networks for Image Recognition	ICLR21	GSA	A new global attention module
Attention-Guided Hierarchical Structure Aggregation for Image Matting	CVPR20	HAttMatting	Image matting application: channel attention on high-level features, followed by spatial attention to guide low-level features
Weight Excitation: Built-in Attention Mechanisms in Convolutional Neural Networks	ECCV20	None	Weight excitation mechanism complementary to SE
Expectation-Maximization Attention Networks for Semantic Segmentation	ICCV19 Oral	EMANet	EM + Attention
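
To make the "channel attention" idea in the Squeeze and Excitation row concrete, here is a minimal PyTorch sketch. It is not the code from this repository: the class name `SEBlock` and the `max(..., 1)` guard are assumptions for illustration, and the reduction ratio of 16 follows the default reported in the SENet paper.

```python
import torch
import torch.nn as nn


class SEBlock(nn.Module):
    """Illustrative Squeeze-and-Excitation style channel attention."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Hypothetical guard so very small channel counts still get one hidden unit.
        hidden = max(channels // reduction, 1)
        # Squeeze: global average pooling collapses each HxW map to one value.
        self.pool = nn.AdaptiveAvgPool2d(1)
        # Excitation: bottleneck MLP producing one gate in (0, 1) per channel.
        self.fc = nn.Sequential(
            nn.Linear(channels, hidden, bias=False),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, channels, bias=False),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        gates = self.fc(self.pool(x).view(b, c))
        # Rescale every channel of the input by its learned gate.
        return x * gates.view(b, c, 1, 1)


x = torch.randn(2, 64, 8, 8)
y = SEBlock(64)(x)
print(y.shape)  # same shape as the input
```

Because the block preserves the input shape, it can in principle be dropped after any convolutional stage, which is what makes this family of modules plug-and-play.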

Plug and Play Module

ACBlock
Swish, Mish Activation
ASPP Block
DepthWise Convolution
Fused Conv & BN
MixedDepthwise Convolution
PSP Module
RFBModule
SematicEmbbedBlock
SSH Context Module
Other useful tools, such as concatenating feature maps and flattening feature maps
WeightedFeatureFusion: the feature fusion scheme used in the FPN of EfficientDet
StripPooling: core code of StripPooling (CVPR2020)
GhostModule: core module of GhostNet (CVPR2020)
SlimConv: SlimConv3x3
Context Gating: video classification
EffNetBlock: EffNet
ECCV2020 BorderDet: Border alignment module
CVPR2019 DANet: Dual Attention
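Object Contextual Representation for semantic segmentation: OCRModule
FPT: includes Self Transform, Grounding Transform, and Rendering Transform
DOConv: Depthwise Over-parameterized Convolution, proposed by Alibaba
PyConv: pyramidal convolution, proposed by the Inception Institute of Artificial Intelligence
ULSAM: ultra-lightweight subspace attention module for compact CNNs
DGC: dynamic group convolution for accelerating convolutional neural networks (ECCV 2020)
DCANet: learning connected attentions for convolutional neural networks (ECCV 2020)
PSConv: squeezing a feature pyramid into a compact multi-scale convolutional layer (ECCV 2020)
Dynamic Convolution: dynamic filter convolution (CVPR2020, unofficial implementation)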
CondConv: Conditionally Parameterized Convolutions for Efficient Inference
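
As an illustration of one entry in the list, the WeightedFeatureFusion item can be sketched as follows, assuming it refers to the "fast normalized fusion" described in the EfficientDet paper: each input gets a learnable non-negative weight, and the weights are normalized by their sum plus a small epsilon. The class name `FastNormalizedFusion` is hypothetical and this is not the repository's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class FastNormalizedFusion(nn.Module):
    """Illustrative weighted fusion of same-shaped feature maps."""

    def __init__(self, num_inputs: int, eps: float = 1e-4):
        super().__init__()
        # One learnable scalar weight per input feature map, initialized to 1.
        self.weights = nn.Parameter(torch.ones(num_inputs))
        self.eps = eps

    def forward(self, features):
        # ReLU keeps the weights non-negative; dividing by the sum
        # (plus eps for stability) makes them add up to roughly 1.
        w = F.relu(self.weights)
        w = w / (w.sum() + self.eps)
        return sum(wi * f for wi, f in zip(w, features))


fuse = FastNormalizedFusion(num_inputs=2)
a = torch.ones(1, 8, 4, 4)
b = 3 * torch.ones(1, 8, 4, 4)
out = fuse([a, b])
print(out.shape)  # same shape as each input
```

The inputs must already share a shape, so in an FPN-style network any resizing has to happen before the fusion step.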

Evaluation

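Modules are given a preliminary evaluation using CIFAR10 + ResNet + the module under test. The evaluation code comes from another repository: https://github.com/kuangliu/pytorch-cifar/ No pretrained weights are used in the experiments; all models are randomly initialized.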

Model	top1 acc	time	params
SENet18	95.28%	1:27:50	11,260,354
ResNet18	95.16%	1:13:03	11,173,962
ResNet50	95.50%	4:24:38	23,520,842
ShuffleNetV2	91.90%	1:02:50	1,263,854
GoogLeNet	91.90%	1:02:50	6,166,250
MobileNetV2	92.66%	2:04:57	2,296,922
SA-ResNet50	89.83%	2:10:07	23,528,758
SA-ResNet18	95.07%	1:39:38	11,171,394
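
The values in the last column of the table are parameter counts. To relate them to memory size, here is a rough sketch that converts a few of the counts to MiB, assuming every parameter is stored as a 32-bit float. That storage assumption and the helper name `params_to_mib` are illustrative and do not come from the evaluation code.

```python
BYTES_PER_FLOAT32 = 4  # assumption: weights stored as float32

# Parameter counts copied from the table above.
param_counts = {
    "SENet18": 11_260_354,
    "ResNet18": 11_173_962,
    "ResNet50": 23_520_842,
    "MobileNetV2": 2_296_922,
}


def params_to_mib(n_params: int, bytes_per_param: int = BYTES_PER_FLOAT32) -> float:
    """Convert a parameter count to mebibytes (1 MiB = 1024 * 1024 bytes)."""
    return n_params * bytes_per_param / 1024**2


for name, n in param_counts.items():
    print(f"{name}: {n:,} params, about {params_to_mib(n):.1f} MiB")
```

In PyTorch, the count for a model can be obtained with `sum(p.numel() for p in model.parameters())`.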

Paper List

SENet paper: https://arxiv.org/abs/1709.01507 Explanation: https://zhuanlan.zhihu.com/p/102035721

Contribute

Feel free to open an issue suggesting additional papers, along with links to the corresponding code.

PyTorch implementation collection of attention modules and other plug-and-play modules used in computer vision.

Owner

PJDong