Deep Networks with Recurrent Layer Aggregation

Related tags

Deep LearningRLANet
Overview

RLA-Net: Recurrent Layer Aggregation

Recurrence along Depth: Deep Networks with Recurrent Layer Aggregation

This is an implementation of RLA-Net (accept by NeurIPS-2021, paper).

RLANet

Introduction

This paper introduces a concept of layer aggregation to describe how information from previous layers can be reused to better extract features at the current layer. While DenseNet is a typical example of the layer aggregation mechanism, its redundancy has been commonly criticized in the literature. This motivates us to propose a very light-weighted module, called recurrent layer aggregation (RLA), by making use of the sequential structure of layers in a deep CNN. Our RLA module is compatible with many mainstream deep CNNs, including ResNets, Xception and MobileNetV2, and its effectiveness is verified by our extensive experiments on image classification, object detection and instance segmentation tasks. Specifically, improvements can be uniformly observed on CIFAR, ImageNet and MS COCO datasets, and the corresponding RLA-Nets can surprisingly boost the performances by 2-3% on the object detection task. This evidences the power of our RLA module in helping main CNNs better learn structural information in images.

RLA module

RLA_module

Changelog

  • 2021/04/06 Upload RLA-ResNet model.
  • 2021/04/16 Upload RLA-MobileNetV2 (depthwise separable conv version) model.
  • 2021/09/29 Upload all the ablation study on ImageNet.
  • 2021/09/30 Upload mmdetection files.
  • 2021/10/01 Upload pretrained weights.

Installation

Requirements

Our environments

  • OS: Linux Red Hat 4.8.5
  • CUDA: 10.2
  • Toolkit: Python 3.8.5, PyTorch 1.7.0, torchvision 0.8.1
  • GPU: Tesla V100

Please refer to get_started.md for more details about installation.

Quick Start

Train with ResNet

- Use single node or multi node with multiple GPUs

Use multi-processing distributed training to launch N processes per node, which has N GPUs. This is the fastest way to use PyTorch for either single node or multi node data parallel training.

python train.py -a {model_name} --b {batch_size} --multiprocessing-distributed --world-size 1 --rank 0 {imagenet-folder with train and val folders}

- Specify single GPU or multiple GPUs

CUDA_VISIBLE_DEVICES={device_ids} python train.py -a {model_name} --b {batch_size} --multiprocessing-distributed --world-size 1 --rank 0 {imagenet-folder with train and val folders}

Testing

To evaluate the best model

python train.py -a {model_name} --b {batch_size} --multiprocessing-distributed --world-size 1 --rank 0 --resume {path to the best model} -e {imagenet-folder with train and val folders}

Visualizing the training result

To generate acc_plot, loss_plot

python eval_visual.py --log-dir {log_folder}

Train with MobileNet_v2

It is same with above ResNet replace train.py by train_light.py.

Compute the parameters and FLOPs

If you have install thop, you can paras_flops.py to compute the parameters and FLOPs of our models. The usage is below:

python paras_flops.py -a {model_name}

More examples are shown in examples.md.

MMDetection

After installing MMDetection (see get_started.md), then do the following steps:

  • put the file resnet_rla.py in the folder './mmdetection/mmdet/models/backbones/', and do not forget to import the model in the init.py file.
  • put the config files (e.g. faster_rcnn_r50rla_fpn.py) in the folder './mmdetection/configs/base/models/'
  • put the config files (e.g. faster_rcnn_r50rla_fpn_1x_coco.py) in the folder './mmdetection/configs/faster_rcnn'

Note that the config files of the latest version of MMDetection are a little different, please modify the config files according to the latest format.

Experiments

ImageNet

Model Param. FLOPs Top-1 err.(%) Top-5 err.(%) BaiduDrive(models) Extract code GoogleDrive
RLA-ResNet50 24.67M 4.17G 22.83 6.58 resnet50_rla_2283 5lf1 resnet50_rla_2283
RLA-ECANet50 24.67M 4.18G 22.15 6.11 ecanet50_rla_2215 xrfo ecanet50_rla_2215
RLA-ResNet101 42.92M 7.79G 21.48 5.80 resnet101_rla_2148 zrv5 resnet101_rla_2148
RLA-ECANet101 42.92M 7.80G 21.00 5.51 ecanet101_rla_2100 vhpy ecanet101_rla_2100
RLA-MobileNetV2 3.46M 351.8M 27.62 9.18 dsrla_mobilenetv2_k32_2762 g1pm dsrla_mobilenetv2_k32_2762
RLA-ECA-MobileNetV2 3.46M 352.4M 27.07 8.89 dsrla_mobilenetv2_k32_eca_2707 9orl dsrla_mobilenetv2_k32_eca_2707

COCO 2017

Model AP AP_50 AP_75 BaiduDrive(models) Extract code GoogleDrive
Fast_R-CNN_resnet50_rla 38.8 59.6 42.0 faster_rcnn_r50rla_fpn_1x_coco_388 q5c8 faster_rcnn_r50rla_fpn_1x_coco_388
Fast_R-CNN_ecanet50_rla 39.8 61.2 43.2 faster_rcnn_r50rlaeca_fpn_1x_coco_398 f5xs faster_rcnn_r50rlaeca_fpn_1x_coco_398
Fast_R-CNN_resnet101_rla 41.2 61.8 44.9 faster_rcnn_r101rla_fpn_1x_coco_412 0ri3 faster_rcnn_r101rla_fpn_1x_coco_412
Fast_R-CNN_ecanet101_rla 42.1 63.3 46.1 faster_rcnn_r101rlaeca_fpn_1x_coco_421 cpug faster_rcnn_r101rlaeca_fpn_1x_coco_421
RetinaNet_resnet50_rla 37.9 57.0 40.8 retinanet_r50rla_fpn_1x_coco_379 lahj retinanet_r50rla_fpn_1x_coco_379
RetinaNet_ecanet50_rla 39.0 58.7 41.7 retinanet_r50rlaeca_fpn_1x_coco_390 adyd retinanet_r50rlaeca_fpn_1x_coco_390
RetinaNet_resnet101_rla 40.3 59.8 43.5 retinanet_r101rla_fpn_1x_coco_403 p8y0 retinanet_r101rla_fpn_1x_coco_403
RetinaNet_ecanet101_rla 41.5 61.6 44.4 retinanet_r101rlaeca_fpn_1x_coco_415 hdqx retinanet_r101rlaeca_fpn_1x_coco_415
Mask_R-CNN_resnet50_rla 39.5 60.1 43.3 mask_rcnn_r50rla_fpn_1x_coco_395 j1x6 mask_rcnn_r50rla_fpn_1x_coco_395
Mask_R-CNN_ecanet50_rla 40.6 61.8 44.0 mask_rcnn_r50rlaeca_fpn_1x_coco_406 c08r mask_rcnn_r50rlaeca_fpn_1x_coco_406
Mask_R-CNN_resnet101_rla 41.8 62.3 46.2 mask_rcnn_r101rla_fpn_1x_coco_418 8bsn mask_rcnn_r101rla_fpn_1x_coco_418
Mask_R-CNN_ecanet101_rla 42.9 63.6 46.9 mask_rcnn_r101rlaeca_fpn_1x_coco_429 3kmz mask_rcnn_r101rlaeca_fpn_1x_coco_429

Citation

@misc{zhao2021recurrence,
      title={Recurrence along Depth: Deep Convolutional Neural Networks with Recurrent Layer Aggregation}, 
      author={Jingyu Zhao and Yanwen Fang and Guodong Li},
      year={2021},
      eprint={2110.11852},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Questions

Please contact '[email protected]' or '[email protected]'.

Owner
Joy Fang
Joy Fang
NBEATSx: Neural basis expansion analysis with exogenous variables

NBEATSx: Neural basis expansion analysis with exogenous variables We extend the NBEATS model to incorporate exogenous factors. The resulting method, c

Cristian Challu 100 Dec 31, 2022
The Most Efficient Temporal Difference Learning Framework for 2048

moporgic/TDL2048+ TDL2048+ is a highly optimized temporal difference (TD) learning framework for 2048. Features Many common methods related to 2048 ar

Hung Guei 5 Nov 23, 2022
🤗 Paper Style Guide

🤗 Paper Style Guide (Work in progress, send a PR!) Libraries to Know booktabs natbib cleveref Either seaborn, plotly or altair for graphs algorithmic

Hugging Face 66 Dec 12, 2022
Cupytorch - A small framework mimics PyTorch using CuPy or NumPy

CuPyTorch CuPyTorch是一个小型PyTorch,名字来源于: 不同于已有的几个使用NumPy实现PyTorch的开源项目,本项目通过CuPy支持

Xingkai Yu 23 Aug 17, 2022
Dewarping Document Image By Displacement Flow Estimation with Fully Convolutional Network.

Dewarping Document Image By Displacement Flow Estimation with Fully Convolutional Network

111 Dec 27, 2022
Intel® Nervana™ reference deep learning framework committed to best performance on all hardware

DISCONTINUATION OF PROJECT. This project will no longer be maintained by Intel. Intel will not provide or guarantee development of or support for this

Nervana 3.9k Dec 20, 2022
A Dataset of Python Challenges for AI Research

Python Programming Puzzles (P3) This repo contains a dataset of python programming puzzles which can be used to teach and evaluate an AI's programming

Microsoft 850 Dec 24, 2022
Java and SHACL code commented in the paper "Towards compliance checking in reified I/O logic via SHACL" submitted to ICAIL 2021

shRIOL The subfolder shRIOL contains Java files to execute the SHACL files on the OWL ontology. To compile the Java files: "javac -cp ./src/;./lib/* -

1 Dec 06, 2022
GeDML is an easy-to-use generalized deep metric learning library

GeDML is an easy-to-use generalized deep metric learning library

Borui Zhang 32 Dec 05, 2022
Official repository for the ISBI 2021 paper Transformer Assisted Convolutional Neural Network for Cell Instance Segmentation

SegPC-2021 This is the official repository for the ISBI 2021 paper Transformer Assisted Convolutional Neural Network for Cell Instance Segmentation by

Datascience IIT-ISM 13 Dec 14, 2022
Head2Toe: Utilizing Intermediate Representations for Better OOD Generalization

Head2Toe: Utilizing Intermediate Representations for Better OOD Generalization Code for reproducing our results in the Head2Toe paper. Paper: arxiv.or

Google Research 62 Dec 12, 2022
A framework for GPU based high-performance medical image processing and visualization

FAST is an open-source cross-platform framework with the main goal of making it easier to do high-performance processing and visualization of medical images on heterogeneous systems utilizing both mu

Erik Smistad 315 Dec 30, 2022
Network Pruning That Matters: A Case Study on Retraining Variants (ICLR 2021)

Network Pruning That Matters: A Case Study on Retraining Variants (ICLR 2021)

Duong H. Le 18 Jun 13, 2022
Implementation of ICCV21 paper: PnP-DETR: Towards Efficient Visual Analysis with Transformers

Implementation of ICCV 2021 paper: PnP-DETR: Towards Efficient Visual Analysis with Transformers arxiv This repository is based on detr Recently, DETR

twang 113 Dec 27, 2022
A list of all papers and resoureces on Semantic Segmentation

Semantic-Segmentation A list of all papers and resoureces on Semantic Segmentation. Dataset importance SemanticSegmentation_DL Some implementation of

Alan Tang 1.1k Dec 12, 2022
A PaddlePaddle version of Neural Renderer, refer to its PyTorch version

Neural 3D Mesh Renderer in PadddlePaddle A PaddlePaddle version of Neural Renderer, refer to its PyTorch version Install Run: pip install neural-rende

AgentMaker 13 Jul 12, 2022
CLIP+FFT text-to-image

Aphantasia This is a text-to-image tool, part of the artwork of the same name. Based on CLIP model, with FFT parameterizer from Lucent library as a ge

vadim epstein 690 Jan 02, 2023
The codes and models in 'Gaze Estimation using Transformer'.

GazeTR We provide the code of GazeTR-Hybrid in "Gaze Estimation using Transformer". We recommend you to use data processing codes provided in GazeHub.

65 Dec 27, 2022
This is the 3D Implementation of 《Inconsistency-aware Uncertainty Estimation for Semi-supervised Medical Image Segmentation》

CoraNet This is the 3D Implementation of 《Inconsistency-aware Uncertainty Estimation for Semi-supervised Medical Image Segmentation》 Environment pytor

25 Nov 08, 2022
An OpenAI Gym environment for Super Mario Bros

gym-super-mario-bros An OpenAI Gym environment for Super Mario Bros. & Super Mario Bros. 2 (Lost Levels) on The Nintendo Entertainment System (NES) us

Andrew Stelmach 1 Jan 05, 2022