[CVPR 2021] Involution: Inverting the Inherence of Convolution for Visual Recognition, a brand new neural operator

Overview

involution

Official implementation of a neural operator as described in Involution: Inverting the Inherence of Convolution for Visual Recognition (CVPR'21)

By Duo Li, Jie Hu, Changhu Wang, Xiangtai Li, Qi She, Lei Zhu, Tong Zhang, and Qifeng Chen

TL;DR. involution is a general-purpose neural primitive that is versatile across a spectrum of deep learning models on different vision tasks. It bridges convolution and self-attention in design, while being more efficient and effective than convolution and simpler in form than self-attention.

Getting Started

This repository is fully built upon the OpenMMLab toolkits. For each individual task, the config and model files follow the same directory organization as mmcls, mmdet, and mmseg respectively, so just copy-and-paste them to the corresponding locations to get started.

For example, to evaluate detectors:

git clone https://github.com/open-mmlab/mmdetection # and install

cp det/mmdet/models/backbones/* mmdetection/mmdet/models/backbones
cp det/mmdet/models/necks/* mmdetection/mmdet/models/necks
cp det/mmdet/models/utils/* mmdetection/mmdet/models/utils

cp det/configs/_base_/models/* mmdetection/configs/_base_/models
cp det/configs/_base_/schedules/* mmdetection/configs/_base_/schedules
cp -r det/configs/involution mmdetection/configs

cd mmdetection
# evaluate checkpoints
bash tools/dist_test.sh ${CONFIG_FILE} ${CHECKPOINT_FILE} ${GPU_NUM} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}]

For more detailed guidance, please refer to the original mmcls, mmdet, and mmseg tutorials.

Currently, we provide a memory-efficient implementation of the involution operator based on CuPy. Please install this library in advance. A customized CUDA kernel would bring further acceleration on the hardware. Any contribution from the community in this regard is welcome!

Model Zoo

The reductions in parameters/FLOPs (↓) and the gains in performance (↑) relative to the convolution baselines are given in parentheses. Some of these checkpoints were obtained in our reimplementation runs, so their performance may differ slightly from the numbers reported in our paper. Models are trained with 64 GPUs on ImageNet, 8 GPUs on COCO, and 4 GPUs on Cityscapes.
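As a rough reading aid, the parenthesized values can be turned back into approximate baseline numbers. The sketch below assumes that each percentage is a relative reduction with respect to the convolution baseline and that each performance delta is an absolute difference in points. The helper names are hypothetical and not part of this repository.

```python
def baseline_from_reduction(value, pct_down):
    """Recover the baseline from a value and its relative reduction in percent."""
    return value / (1.0 - pct_down / 100.0)


def baseline_from_gain(score, points_up):
    """Recover the baseline score from a score and its absolute gain in points."""
    return score - points_up


# RedNet-50 on ImageNet is listed as 15.54(39.5%↓) params and 2.71(34.1%↓) FLOPs.
params_base = baseline_from_reduction(15.54, 39.5)
flops_base = baseline_from_reduction(2.71, 34.1)

# Faster R-CNN with the involution neck is listed as 40.2(2.5↑) box AP.
ap_base = baseline_from_gain(40.2, 2.5)

print(f"baseline params ~ {params_base:.1f} M")
print(f"baseline FLOPs  ~ {flops_base:.1f} G")
print(f"baseline box AP ~ {ap_base:.1f}")
```

Because the table entries are rounded, the recovered baselines are only approximate.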

Image Classification on ImageNet

| Model | Params(M) | FLOPs(G) | Top-1 (%) | Top-5 (%) | Config | Download |
| --- | --- | --- | --- | --- | --- | --- |
| RedNet-26 | 9.23(32.8%↓) | 1.73(29.2%↓) | 75.96 | 93.19 | config | model / log |
| RedNet-38 | 12.39(36.7%↓) | 2.22(31.3%↓) | 77.48 | 93.57 | config | model / log |
| RedNet-50 | 15.54(39.5%↓) | 2.71(34.1%↓) | 78.35 | 94.13 | config | model / log |
| RedNet-101 | 25.65(42.6%↓) | 4.74(40.5%↓) | 78.92 | 94.35 | config | model / log |
| RedNet-152 | 33.99(43.5%↓) | 6.79(41.4%↓) | 79.12 | 94.38 | config | model / log |

Before finetuning on the following downstream tasks, download the ImageNet pre-trained RedNet-50 weights and set the pretrained argument in det/configs/_base_/models/*.py or seg/configs/_base_/models/*.py to your local path.
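
A minimal sketch of what that edit might look like, assuming the base model config is an MMDetection-style Python file that defines a top-level `model` dict with a `pretrained` key, as the instruction above suggests. The checkpoint path is a placeholder, and all other keys of the real config are omitted here and should be left untouched.

```python
# Fragment of a base model config such as det/configs/_base_/models/*.py.
# Only the `pretrained` entry is shown; the path below is a hypothetical
# placeholder for wherever the downloaded RedNet-50 checkpoint was saved.
model = dict(
    pretrained='/path/to/rednet50_imagenet.pth',
)
```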

Object Detection and Instance Segmentation on COCO

Faster R-CNN

| Backbone | Neck | Style | Lr schd | Params(M) | FLOPs(G) | box AP | Config | Download |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| RedNet-50-FPN | convolution | pytorch | 1x | 31.6(23.9%↓) | 177.9(14.1%↓) | 39.5(1.8↑) | config | model / log |
| RedNet-50-FPN | involution | pytorch | 1x | 29.5(28.9%↓) | 135.0(34.8%↓) | 40.2(2.5↑) | config | model / log |

Mask R-CNN

| Backbone | Neck | Style | Lr schd | Params(M) | FLOPs(G) | box AP | mask AP | Config | Download |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| RedNet-50-FPN | convolution | pytorch | 1x | 34.2(22.6%↓) | 224.2(11.5%↓) | 39.9(1.5↑) | 35.7(0.8↑) | config | model / log |
| RedNet-50-FPN | involution | pytorch | 1x | 32.2(27.1%↓) | 181.3(28.5%↓) | 40.8(2.4↑) | 36.4(1.3↑) | config | model / log |

RetinaNet

| Backbone | Neck | Style | Lr schd | Params(M) | FLOPs(G) | box AP | Config | Download |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| RedNet-50-FPN | convolution | pytorch | 1x | 27.8(26.3%↓) | 210.1(12.2%↓) | 38.2(1.6↑) | config | model / log |
| RedNet-50-FPN | involution | pytorch | 1x | 26.3(30.2%↓) | 199.9(16.5%↓) | 38.2(1.6↑) | config | model / log |

Semantic Segmentation on Cityscapes

| Method | Backbone | Neck | Crop Size | Lr schd | Params(M) | FLOPs(G) | mIoU | Config | Download |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| FPN | RedNet-50 | convolution | 512x1024 | 80000 | 18.5(35.1%↓) | 293.9(19.0%↓) | 78.0(3.6↑) | config | model / log |
| FPN | RedNet-50 | involution | 512x1024 | 80000 | 16.4(42.5%↓) | 205.2(43.4%↓) | 79.1(4.7↑) | config | model / log |
| UPerNet | RedNet-50 | convolution | 512x1024 | 80000 | 56.4(15.1%↓) | 1825.6(3.6%↓) | 80.6(2.4↑) | config | model / log |

Citation

If you find our work useful in your research, please cite:

@InProceedings{Li_2021_CVPR,
author = {Li, Duo and Hu, Jie and Wang, Changhu and Li, Xiangtai and She, Qi and Zhu, Lei and Zhang, Tong and Chen, Qifeng},
title = {Involution: Inverting the Inherence of Convolution for Visual Recognition},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2021}
}