Official PyTorch implementation of the paper "Inception Convolution with Efficient Dilation Search" (CVPR 2021 Oral).

Last update: Dec 31, 2022

Overview

IC-Conv

This repository is an official implementation of the paper Inception Convolution with Efficient Dilation Search.

Getting Started

Download the ImageNet pre-trained checkpoints.

Extract the file to get the following directory tree:

|-- README.md
|-- ckpt
|   |-- detection
|   |-- human_pose
|   |-- segmentation
|-- config
|-- model
|-- pattern_zoo
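
To confirm that the archive was extracted where the rest of this README expects it, a small check along the following lines may help. This is only a sketch: `EXPECTED_DIRS` and `missing_dirs` are hypothetical names introduced here, and the list simply mirrors the directories in the tree above.

```python
from pathlib import Path

# Directories taken from the tree above; adjust if your checkout differs.
EXPECTED_DIRS = [
    "ckpt/detection",
    "ckpt/human_pose",
    "ckpt/segmentation",
    "config",
    "model",
    "pattern_zoo",
]


def missing_dirs(repo_root):
    """Return the expected sub-directories that are absent under repo_root."""
    root = Path(repo_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs(".")  # run from the repository root
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("Directory layout matches the expected tree.")
```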

Easy Use

The current implementation is coupled to specific downstream tasks. OpenMMLab users can get started with IC-Conv as follows.

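import torch
from models import IC_ResNet  # adjust if your checkout uses the `model` directory shown above

# Build an IC-ResNet-50 backbone from the pattern file in pattern_zoo/detection.
net = IC_ResNet(depth=50, pattern_path='pattern_zoo/detection/ic_r50_k9.json')
net.eval()

inputs = torch.rand(1, 3, 32, 32)  # dummy batch: one 3-channel 32x32 image
with torch.no_grad():  # inference only, so gradients are not tracked
    outputs = net(inputs)  # call the module instead of .forward() directly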
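
The `pattern_path` argument above points to a JSON file under `pattern_zoo`. The schema of these files is not described in this README, so the sketch below makes no assumption about it: it only loads a file with the standard `json` module and reports its top-level structure. `summarize_pattern` is a hypothetical helper, not part of this repository.

```python
import json


def summarize_pattern(pattern_path):
    """Load a pattern JSON file and describe its top-level structure.

    No schema is assumed; this only reports what the file contains.
    """
    with open(pattern_path, "r") as f:
        pattern = json.load(f)

    summary = {
        "type": type(pattern).__name__,
        # Only containers have a meaningful length.
        "num_entries": len(pattern) if isinstance(pattern, (dict, list)) else None,
    }
    if isinstance(pattern, dict):
        summary["first_keys"] = list(pattern)[:5]  # peek at a few keys
    return summary
```

For example, `summarize_pattern('pattern_zoo/detection/ic_r50_k9.json')` can be run from the repository root before passing the same path to `IC_ResNet`.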

For 2d Human Pose Estimation using MMPose

Copy the config file to the MMPose config path, for example:

cp config/human_pose/ic_res50_k13_coco_640x640.py your_mmpose_path/mmpose/configs/bottom_up/resnet/coco/ic_res50_k13_coco_640x640.py

Copy the Inception convolution files to the MMPose model path:

cp model/ic_conv2d.py your_mmpose_path/mmpose/mmpose/models/backbones/ic_conv2d.py
cp model/ic_resnet.py your_mmpose_path/mmpose/mmpose/models/backbones/ic_resnet.py

Then run it as you would any other MMPose config.
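
The two copy steps above can also be scripted, which is convenient when setting up several MMPose checkouts. The following is a minimal sketch rather than a tool shipped with this repository: `FILES_TO_COPY` and `copy_ic_conv_files` are hypothetical names, and the paths are the ones from the `cp` commands, expressed relative to the IC-Conv checkout and to the MMPose repository root (`your_mmpose_path/mmpose`).

```python
import shutil
from pathlib import Path

# (source relative to the IC-Conv checkout, destination relative to the MMPose root)
FILES_TO_COPY = [
    ("config/human_pose/ic_res50_k13_coco_640x640.py",
     "configs/bottom_up/resnet/coco/ic_res50_k13_coco_640x640.py"),
    ("model/ic_conv2d.py", "mmpose/models/backbones/ic_conv2d.py"),
    ("model/ic_resnet.py", "mmpose/models/backbones/ic_resnet.py"),
]


def copy_ic_conv_files(ic_conv_root, mmpose_root):
    """Copy the IC-Conv config and backbone files into an MMPose checkout."""
    copied = []
    for src_rel, dst_rel in FILES_TO_COPY:
        src = Path(ic_conv_root) / src_rel
        dst = Path(mmpose_root) / dst_rel
        # Unlike plain cp, create the destination folder if it is missing.
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
        copied.append(dst)
    return copied
```

A call such as `copy_ic_conv_files(".", "your_mmpose_path/mmpose")` from the IC-Conv repository root would mirror the commands above.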

Model Zoo

We provide pre-trained weights for IC-ResNet-50, IC-ResNet-101, and IC-ResNeXt-101 (32x4d) on ImageNet, as well as weights trained on specific tasks.

Users with limited computing power can directly reuse the provided IC-Conv and ImageNet pre-trained weights for detection, segmentation, and 2d human pose estimation tasks on other datasets.

Note: The links in the tables below are relative paths, so you need to clone the repository and download the checkpoints for them to resolve.

Object Detection

Detector	Backbone	Lr schedule	AP	dilation_pattern	checkpoint
Faster-RCNN-FPN	IC-R50	1x	38.9	pattern	ckpt/imagenet_retrain_ckpt
Faster-RCNN-FPN	IC-R101	1x	41.9	pattern	ckpt/imagenet_retrain_ckpt
Faster-RCNN-FPN	IC-X101-32x4d	1x	42.1	pattern	ckpt/imagenet_retrain_ckpt
Cascade-RCNN-FPN	IC-R50	1x	42.4	pattern	ckpt/imagenet_retrain_ckpt
Cascade-RCNN-FPN	IC-R101	1x	45.0	pattern	ckpt/imagenet_retrain_ckpt
Cascade-RCNN-FPN	IC-X101-32x4d	1x	45.7	pattern	ckpt/imagenet_retrain_ckpt

Instance Segmentation

Detector	Backbone	Lr schedule	box AP	mask AP	dilation_pattern	checkpoint
Mask-RCNN-FPN	IC-R50	1x	40.0	35.9	pattern	ckpt/imagenet_retrain_ckpt
Mask-RCNN-FPN	IC-R101	1x	42.6	37.9	pattern	ckpt/imagenet_retrain_ckpt
Mask-RCNN-FPN	IC-X101-32x4d	1x	43.4	38.4	pattern	ckpt/imagenet_retrain_ckpt
Cascade-RCNN-FPN	IC-R50	1x	43.4	36.8	pattern	ckpt/imagenet_retrain_ckpt
Cascade-RCNN-FPN	IC-R101	1x	45.7	38.7	pattern	ckpt/imagenet_retrain_ckpt
Cascade-RCNN-FPN	IC-X101-32x4d	1x	46.4	39.1	pattern	ckpt/imagenet_retrain_ckpt

2d Human Pose Estimation

We adjusted the learning rate of the ResNet backbone in MMPose and obtained better baseline results. See the config files in config/human_pose/ for details.

Results on COCO val2017 without multi-scale test

Backbone	Input Size	AP	dilation_pattern	checkpoint
R50(mmpose)	640x640	47.9	~	~
R50	640x640	51.0	~	~
IC-R50	640x640	62.2	pattern	ckpt/imagenet_retrain_ckpt
R101	640x640	55.5	~	~
IC-R101	640x640	63.3	pattern	ckpt/imagenet_retrain_ckpt

Results on COCO val2017 with multi-scale test. The 3 default scales ([2, 1, 0.5]) are used.

Backbone	Input Size	AP
R50(mmpose)	640x640	52.5
R50	640x640	55.8
IC-R50	640x640	65.8
R101	640x640	60.2
IC-R101	640x640	68.5

Acknowledgement

The human pose estimation experiments are built upon MMPose.

Citation

If our paper helps your research, please cite it in your publications:

@article{liu2020inception,
 title={Inception Convolution with Efficient Dilation Search},
 author={Liu, Jie and Li, Chuming and Liang, Feng and Lin, Chen and Sun, Ming and Yan, Junjie and Ouyang, Wanli and Xu, Dong},
 journal={arXiv preprint arXiv:2012.13587},
 year={2020}
}

Owner

Jie Liu