Locally Enhanced Self-Attention: Rethinking Self-Attention as Local and Context Terms

Last update: Dec 31, 2021

LESA

Introduction

This repository contains the official implementation of Locally Enhanced Self-Attention: Rethinking Self-Attention as Local and Context Terms. The image classification code is based on axial-deeplab, and the object detection code is based on mmdetection.

Citing LESA

If you find LESA helpful in your project, please consider citing our paper:

@article{yang2021locally,
  title={Locally Enhanced Self-Attention: Rethinking Self-Attention as Local and Context Terms},
  author={Yang, Chenglin and Qiao, Siyuan and Kortylewski, Adam and Yuille, Alan},
  journal={arXiv preprint arXiv:2107.05637},
  year={2021}
}

Main Results on ImageNet

Please refer to LESA_classification for details.

Method	Model	Top-1 Acc. (%)	Top-5 Acc. (%)
LESA_ResNet50	Download	79.55	94.79
LESA_WRN50	Download	80.18	95.07

Main Results on COCO test-dev

Please refer to LESA_detection for details.

Method	Backbone	Pretrained	Model	Box AP	Mask AP
Mask-RCNN	LESA_ResNet50	Download	Download	44.2	39.6
HTC	LESA_WRN50	Download	Download	50.5	44.4

Credits

This project is based on axial-deeplab and mmdetection.

The relative position embedding is based on bottleneck-transformer-pytorch.

ResNet is based on pytorch/vision. Classification helper functions are based on pytorch-classification.

Owner

Chenglin Yang
