Summary of related papers on visual attention

Overview

This repo accompanies the paper "Attention Mechanisms in Computer Vision: A Survey".

🔥 marks papers with more than 200 citations.

  • TODO: Code about different attention mechanisms will come soon.
  • TODO: Code link will come soon.
  • TODO: Collect more related papers. Contributions are welcome.

Channel attention

  • Squeeze-and-Excitation Networks(CVPR2018) pdf, (PAMI2019 version) pdf 🔥
  • Image super-resolution using very deep residual channel attention networks(ECCV2018) pdf 🔥
  • Context encoding for semantic segmentation(CVPR2018) pdf 🔥
  • Spatio-temporal channel correlation networks for action classification(ECCV2018) pdf
  • Global second-order pooling convolutional networks(CVPR2019) pdf
  • Srm: A style-based recalibration module for convolutional neural networks(ICCV2019) pdf
  • You look twice: Gaternet for dynamic filter selection in cnns(CVPR2019) pdf
  • Second-order attention network for single image super-resolution(CVPR2019) pdf 🔥
  • Spsequencenet: Semantic segmentation network on 4d point clouds(CVPR2020) pdf
  • Ecanet: Efficient channel attention for deep convolutional neural networks (CVPR2020) pdf 🔥
  • Gated channel transformation for visual recognition(CVPR2020) pdf
  • Fcanet: Frequency channel attention networks(ICCV2021) pdf

Spatial attention

  • Recurrent models of visual attention(NeurIPS2014) pdf 🔥
  • Show, attend and tell: Neural image caption generation with visual attention(PMLR2015) pdf 🔥
  • Draw: A recurrent neural network for image generation(ICML2015) pdf 🔥
  • Spatial transformer networks(NeurIPS2015) pdf 🔥
  • Multiple object recognition with visual attention(ICLR2015) pdf 🔥
  • Action recognition using visual attention(arXiv2015) pdf 🔥
  • Videolstm convolves, attends and flows for action recognition(arXiv2016) pdf 🔥
  • Look closer to see better: Recurrent attention convolutional neural network for fine-grained image recognition(CVPR2017) pdf 🔥
  • Learning multi-attention convolutional neural network for fine-grained image recognition(ICCV2017) pdf 🔥
  • Diversified visual attention networks for fine-grained object classification(TMM2017) pdf 🔥
  • Attentional pooling for action recognition(NeurIPS2017) pdf 🔥
  • Non-local neural networks(CVPR2018) pdf 🔥
  • Attentional shapecontextnet for point cloud recognition(CVPR2018) pdf
  • Relation networks for object detection(CVPR2018) pdf 🔥
  • a2-nets: Double attention networks(NeurIPS2018) pdf 🔥
  • Attention-aware compositional network for person re-identification(CVPR2018) pdf 🔥
  • Tell me where to look: Guided attention inference network(CVPR2018) pdf 🔥
  • Pedestrian alignment network for large-scale person re-identification(TCSVT2018) pdf 🔥
  • Learn to pay attention(ICLR2018) pdf 🔥
  • Attention U-Net: Learning Where to Look for the Pancreas(MIDL2018) pdf 🔥
  • Psanet: Point-wise spatial attention network for scene parsing(ECCV2018) pdf 🔥
  • Self attention generative adversarial networks(ICML2019) pdf 🔥
  • Attentional pointnet for 3d-object detection in point clouds(CVPRW2019) pdf
  • Co-occurrent features in semantic segmentation(CVPR2019) pdf
  • Attention augmented convolutional networks(ICCV2019) pdf 🔥
  • Local relation networks for image recognition(ICCV2019) pdf
  • Latentgnn: Learning efficient nonlocal relations for visual recognition(ICML2019) pdf
  • Graph-based global reasoning networks(CVPR2019) pdf 🔥
  • Gcnet: Non-local networks meet squeeze-excitation networks and beyond(ICCVW2019) pdf 🔥
  • Asymmetric non-local neural networks for semantic segmentation(ICCV2019) pdf 🔥
  • Looking for the devil in the details: Learning trilinear attention sampling network for fine-grained image recognition(CVPR2019) pdf
  • Second-order non-local attention networks for person re-identification(ICCV2019) pdf 🔥
  • End-to-end comparative attention networks for person re-identification(TIP2017) pdf 🔥
  • Modeling point clouds with self-attention and gumbel subset sampling(CVPR2019) pdf
  • Diagnose like a radiologist: Attention guided convolutional neural network for thorax disease classification(arXiv 2019) pdf
  • L2g autoencoder: Understanding point clouds by local-to-global reconstruction with hierarchical self-attention(arXiv 2019) pdf
  • Generative pretraining from pixels(PMLR2020) pdf
  • Exploring self-attention for image recognition(CVPR2020) pdf
  • Cf-sis: Semantic-instance segmentation of 3d point clouds by context fusion with self attention(MM20) pdf
  • Disentangled non-local neural networks(ECCV2020) pdf
  • Relation-aware global attention for person re-identification(CVPR2020) pdf
  • Segmentation transformer: Object-contextual representations for semantic segmentation(ECCV2020) pdf 🔥
  • Spatial pyramid based graph reasoning for semantic segmentation(CVPR2020) pdf
  • Self-supervised Equivariant Attention Mechanism for Weakly Supervised Semantic Segmentation(CVPR2020) pdf
  • End-to-end object detection with transformers(ECCV2020) pdf 🔥
  • Pointasnl: Robust point clouds processing using nonlocal neural networks with adaptive sampling(CVPR2020) pdf
  • Rethinking semantic segmentation from a sequence-to-sequence perspective with transformers(CVPR2021) pdf
  • An image is worth 16x16 words: Transformers for image recognition at scale(ICLR2021) pdf 🔥
  • Ocnet: Object context network for scene parsing(IJCV 2021) pdf 🔥
  • Point transformer(ICCV 2021) pdf
  • PCT: Point Cloud Transformer (CVMJ 2021) pdf
  • Pre-trained image processing transformer(CVPR 2021) pdf
  • An empirical study of training self-supervised vision transformers(ICCV 2021) pdf
  • Segformer: Simple and efficient design for semantic segmentation with transformers(arXiv 2021) pdf
  • Beit: Bert pre-training of image transformers(arXiv 2021) pdf
  • Beyond self-attention: External attention using two linear layers for visual tasks(arXiv 2021) pdf
  • Query2label: A simple transformer way to multi-label classification(arXiv 2021) pdf
  • Transformer in transformer(arXiv 2021) pdf

Temporal attention

  • Jointly attentive spatial-temporal pooling networks for video-based person re-identification (ICCV 2017) pdf 🔥
  • Video person re-identification with competitive snippet-similarity aggregation and co-attentive snippet embedding(CVPR 2018) pdf
  • Scan: Self-and-collaborative attention network for video person re-identification (TIP 2019) pdf

Branch attention

  • Training very deep networks (NeurIPS 2015) pdf 🔥
  • Selective kernel networks (CVPR 2019) pdf 🔥
  • CondConv: Conditionally Parameterized Convolutions for Efficient Inference (NeurIPS 2019) pdf
  • Dynamic convolution: Attention over convolution kernels (CVPR 2020) pdf
  • ResNest: Split-attention networks (arXiv 2020) pdf 🔥

Channel and spatial attention

  • Residual attention network for image classification (CVPR 2017) pdf 🔥
  • SCA-CNN: spatial and channel-wise attention in convolutional networks for image captioning (CVPR 2017) pdf 🔥
  • CBAM: convolutional block attention module (ECCV 2018) pdf 🔥
  • Harmonious attention network for person re-identification (CVPR 2018) pdf 🔥
  • Recalibrating fully convolutional networks with spatial and channel “squeeze and excitation” blocks (TMI 2018) pdf
  • Mancs: A multi-task attentional network with curriculum sampling for person re-identification (ECCV 2018) pdf 🔥
  • Bam: Bottleneck attention module(BMVC 2018) pdf 🔥
  • Pvnet: A joint convolutional network of point cloud and multi-view for 3d shape recognition (ACM MM 2018) pdf
  • Learning what and where to attend (ICLR 2019) pdf
  • Dual attention network for scene segmentation (CVPR 2019) pdf 🔥
  • Abd-net: Attentive but diverse person re-identification (ICCV 2019) pdf
  • Mixed high-order attention network for person re-identification (ICCV 2019) pdf
  • Mlcvnet: Multi-level context votenet for 3d object detection (CVPR 2020) pdf
  • Improving convolutional networks with self-calibrated convolutions (CVPR 2020) pdf
  • Relation-aware global attention for person re-identification (CVPR 2020) pdf
  • Strip Pooling: Rethinking spatial pooling for scene parsing (CVPR 2020) pdf
  • Rotate to attend: Convolutional triplet attention module (WACV 2021) pdf
  • Coordinate attention for efficient mobile network design (CVPR 2021) pdf
  • Simam: A simple, parameter-free attention module for convolutional neural networks (ICML 2021) pdf

Spatial and temporal attention

  • An end-to-end spatio-temporal attention model for human action recognition from skeleton data(AAAI 2017) pdf 🔥
  • Diversity regularized spatiotemporal attention for video-based person re-identification (arXiv 2018) 🔥
  • Interpretable spatio-temporal attention for video action recognition (ICCVW 2019) pdf
  • Hierarchical lstms with adaptive attention for visual captioning (TPAMI 2020) pdf
  • Stat: Spatial-temporal attention mechanism for video captioning (TMM 2020) pdf
  • Gta: Global temporal attention for video action understanding (arXiv 2020) pdf
  • Multi-granularity reference-aided attentive feature aggregation for video-based person re-identification (CVPR 2020) pdf
  • Read: Reciprocal attention discriminator for image-to-video re-identification (ECCV 2020) pdf
  • Decoupled spatial-temporal transformer for video inpainting (arXiv 2021) pdf

Owner

MenghaoGuo: second-year Ph.D. candidate at G2 group, Tsinghua University.