Deep Networks with Recurrent Layer Aggregation

Related tags

Deep LearningRLANet
Overview

RLA-Net: Recurrent Layer Aggregation

Recurrence along Depth: Deep Networks with Recurrent Layer Aggregation

This is an implementation of RLA-Net (accept by NeurIPS-2021, paper).

RLANet

Introduction

This paper introduces a concept of layer aggregation to describe how information from previous layers can be reused to better extract features at the current layer. While DenseNet is a typical example of the layer aggregation mechanism, its redundancy has been commonly criticized in the literature. This motivates us to propose a very light-weighted module, called recurrent layer aggregation (RLA), by making use of the sequential structure of layers in a deep CNN. Our RLA module is compatible with many mainstream deep CNNs, including ResNets, Xception and MobileNetV2, and its effectiveness is verified by our extensive experiments on image classification, object detection and instance segmentation tasks. Specifically, improvements can be uniformly observed on CIFAR, ImageNet and MS COCO datasets, and the corresponding RLA-Nets can surprisingly boost the performances by 2-3% on the object detection task. This evidences the power of our RLA module in helping main CNNs better learn structural information in images.

RLA module

RLA_module

Changelog

  • 2021/04/06 Upload RLA-ResNet model.
  • 2021/04/16 Upload RLA-MobileNetV2 (depthwise separable conv version) model.
  • 2021/09/29 Upload all the ablation study on ImageNet.
  • 2021/09/30 Upload mmdetection files.
  • 2021/10/01 Upload pretrained weights.

Installation

Requirements

Our environments

  • OS: Linux Red Hat 4.8.5
  • CUDA: 10.2
  • Toolkit: Python 3.8.5, PyTorch 1.7.0, torchvision 0.8.1
  • GPU: Tesla V100

Please refer to get_started.md for more details about installation.

Quick Start

Train with ResNet

- Use single node or multi node with multiple GPUs

Use multi-processing distributed training to launch N processes per node, which has N GPUs. This is the fastest way to use PyTorch for either single node or multi node data parallel training.

python train.py -a {model_name} --b {batch_size} --multiprocessing-distributed --world-size 1 --rank 0 {imagenet-folder with train and val folders}

- Specify single GPU or multiple GPUs

CUDA_VISIBLE_DEVICES={device_ids} python train.py -a {model_name} --b {batch_size} --multiprocessing-distributed --world-size 1 --rank 0 {imagenet-folder with train and val folders}

Testing

To evaluate the best model

python train.py -a {model_name} --b {batch_size} --multiprocessing-distributed --world-size 1 --rank 0 --resume {path to the best model} -e {imagenet-folder with train and val folders}

Visualizing the training result

To generate acc_plot, loss_plot

python eval_visual.py --log-dir {log_folder}

Train with MobileNet_v2

It is same with above ResNet replace train.py by train_light.py.

Compute the parameters and FLOPs

If you have install thop, you can paras_flops.py to compute the parameters and FLOPs of our models. The usage is below:

python paras_flops.py -a {model_name}

More examples are shown in examples.md.

MMDetection

After installing MMDetection (see get_started.md), then do the following steps:

  • put the file resnet_rla.py in the folder './mmdetection/mmdet/models/backbones/', and do not forget to import the model in the init.py file.
  • put the config files (e.g. faster_rcnn_r50rla_fpn.py) in the folder './mmdetection/configs/base/models/'
  • put the config files (e.g. faster_rcnn_r50rla_fpn_1x_coco.py) in the folder './mmdetection/configs/faster_rcnn'

Note that the config files of the latest version of MMDetection are a little different, please modify the config files according to the latest format.

Experiments

ImageNet

Model Param. FLOPs Top-1 err.(%) Top-5 err.(%) BaiduDrive(models) Extract code GoogleDrive
RLA-ResNet50 24.67M 4.17G 22.83 6.58 resnet50_rla_2283 5lf1 resnet50_rla_2283
RLA-ECANet50 24.67M 4.18G 22.15 6.11 ecanet50_rla_2215 xrfo ecanet50_rla_2215
RLA-ResNet101 42.92M 7.79G 21.48 5.80 resnet101_rla_2148 zrv5 resnet101_rla_2148
RLA-ECANet101 42.92M 7.80G 21.00 5.51 ecanet101_rla_2100 vhpy ecanet101_rla_2100
RLA-MobileNetV2 3.46M 351.8M 27.62 9.18 dsrla_mobilenetv2_k32_2762 g1pm dsrla_mobilenetv2_k32_2762
RLA-ECA-MobileNetV2 3.46M 352.4M 27.07 8.89 dsrla_mobilenetv2_k32_eca_2707 9orl dsrla_mobilenetv2_k32_eca_2707

COCO 2017

Model AP AP_50 AP_75 BaiduDrive(models) Extract code GoogleDrive
Fast_R-CNN_resnet50_rla 38.8 59.6 42.0 faster_rcnn_r50rla_fpn_1x_coco_388 q5c8 faster_rcnn_r50rla_fpn_1x_coco_388
Fast_R-CNN_ecanet50_rla 39.8 61.2 43.2 faster_rcnn_r50rlaeca_fpn_1x_coco_398 f5xs faster_rcnn_r50rlaeca_fpn_1x_coco_398
Fast_R-CNN_resnet101_rla 41.2 61.8 44.9 faster_rcnn_r101rla_fpn_1x_coco_412 0ri3 faster_rcnn_r101rla_fpn_1x_coco_412
Fast_R-CNN_ecanet101_rla 42.1 63.3 46.1 faster_rcnn_r101rlaeca_fpn_1x_coco_421 cpug faster_rcnn_r101rlaeca_fpn_1x_coco_421
RetinaNet_resnet50_rla 37.9 57.0 40.8 retinanet_r50rla_fpn_1x_coco_379 lahj retinanet_r50rla_fpn_1x_coco_379
RetinaNet_ecanet50_rla 39.0 58.7 41.7 retinanet_r50rlaeca_fpn_1x_coco_390 adyd retinanet_r50rlaeca_fpn_1x_coco_390
RetinaNet_resnet101_rla 40.3 59.8 43.5 retinanet_r101rla_fpn_1x_coco_403 p8y0 retinanet_r101rla_fpn_1x_coco_403
RetinaNet_ecanet101_rla 41.5 61.6 44.4 retinanet_r101rlaeca_fpn_1x_coco_415 hdqx retinanet_r101rlaeca_fpn_1x_coco_415
Mask_R-CNN_resnet50_rla 39.5 60.1 43.3 mask_rcnn_r50rla_fpn_1x_coco_395 j1x6 mask_rcnn_r50rla_fpn_1x_coco_395
Mask_R-CNN_ecanet50_rla 40.6 61.8 44.0 mask_rcnn_r50rlaeca_fpn_1x_coco_406 c08r mask_rcnn_r50rlaeca_fpn_1x_coco_406
Mask_R-CNN_resnet101_rla 41.8 62.3 46.2 mask_rcnn_r101rla_fpn_1x_coco_418 8bsn mask_rcnn_r101rla_fpn_1x_coco_418
Mask_R-CNN_ecanet101_rla 42.9 63.6 46.9 mask_rcnn_r101rlaeca_fpn_1x_coco_429 3kmz mask_rcnn_r101rlaeca_fpn_1x_coco_429

Citation

@misc{zhao2021recurrence,
      title={Recurrence along Depth: Deep Convolutional Neural Networks with Recurrent Layer Aggregation}, 
      author={Jingyu Zhao and Yanwen Fang and Guodong Li},
      year={2021},
      eprint={2110.11852},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Questions

Please contact '[email protected]' or '[email protected]'.

Owner
Joy Fang
Joy Fang
Assessing the Influence of Models on the Performance of Reinforcement Learning Algorithms applied on Continuous Control Tasks

Assessing the Influence of Models on the Performance of Reinforcement Learning Algorithms applied on Continuous Control Tasks This is the master thesi

Giacomo Arcieri 1 Mar 21, 2022
Open source implementation of "A Self-Supervised Descriptor for Image Copy Detection" (SSCD).

A Self-Supervised Descriptor for Image Copy Detection (SSCD) This is the open-source codebase for "A Self-Supervised Descriptor for Image Copy Detecti

Meta Research 68 Jan 04, 2023
Stereo Radiance Fields (SRF): Learning View Synthesis for Sparse Views of Novel Scenes

Stereo Radiance Fields (SRF): Learning View Synthesis for Sparse Views of Novel Scenes

111 Dec 29, 2022
CvT-ASSD: Convolutional vision-Transformerbased Attentive Single Shot MultiBox Detector (ICTAI 2021 CCF-C 会议)The 33rd IEEE International Conference on Tools with Artificial Intelligence

CvT-ASSD including extra CvT, CvT-SSD, VGG-ASSD models original-code-website: https://github.com/albert-jin/CvT-SSD new-code-website: https://github.c

金伟强 -上海大学人工智能小渣渣~ 5 Mar 07, 2022
A PyTorch Implementation of SphereFace.

SphereFace A PyTorch Implementation of SphereFace. The code can be trained on CASIA-Webface and the best accuracy on LFW is 99.22%. SphereFace: Deep H

carwin 685 Dec 09, 2022
A Context-aware Visual Attention-based training pipeline for Object Detection from a Webpage screenshot!

CoVA: Context-aware Visual Attention for Webpage Information Extraction Abstract Webpage information extraction (WIE) is an important step to create k

Keval Morabia 41 Jan 01, 2023
Predicting Price of house by considering ,house age, Distance from public transport

House-Price-Prediction Predicting Price of house by considering ,house age, Distance from public transport, No of convenient stores around house etc..

Musab Jaleel 1 Jan 08, 2022
Our implementation used for the MICCAI 2021 FLARE Challenge titled 'Efficient Multi-Organ Segmentation Using SpatialConfiguartion-Net with Low GPU Memory Requirements'.

Efficient Multi-Organ Segmentation Using SpatialConfiguartion-Net with Low GPU Memory Requirements Our implementation used for the MICCAI 2021 FLARE C

Franz Thaler 3 Sep 27, 2022
This tool uses Deep Learning to help you draw and write with your hand and webcam.

This tool uses Deep Learning to help you draw and write with your hand and webcam. A Deep Learning model is used to try to predict whether you want to have 'pencil up' or 'pencil down'.

lmagne 169 Dec 10, 2022
Near-Optimal Sparse Allreduce for Distributed Deep Learning (published in PPoPP'22)

Near-Optimal Sparse Allreduce for Distributed Deep Learning (published in PPoPP'22) Ok-Topk is a scheme for distributed training with sparse gradients

Shigang Li 9 Oct 29, 2022
A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.

About This repository provides data and code for the paper: Scalable Data Annotation Pipeline for High-Quality Large Speech Datasets Development (subm

Appen Repos 86 Dec 07, 2022
Repo for "TableParser: Automatic Table Parsing with Weak Supervision from Spreadsheets" at [email protected]

TableParser Repo for "TableParser: Automatic Table Parsing with Weak Supervision from Spreadsheets" at DS3 Lab 11 Dec 13, 2022

FasterAI: A library to make smaller and faster models with FastAI.

Fasterai fasterai is a library created to make neural network smaller and faster. It essentially relies on common compression techniques for networks

Nathan Hubens 193 Jan 01, 2023
Few-shot Learning of GPT-3

Few-shot Learning With Language Models This is a codebase to perform few-shot "in-context" learning using language models similar to the GPT-3 paper.

Tony Z. Zhao 224 Dec 28, 2022
LBK 26 Dec 28, 2022
Code and models for "Rethinking Deep Image Prior for Denoising" (ICCV 2021)

DIP-denosing This is a code repo for Rethinking Deep Image Prior for Denoising (ICCV 2021). Addressing the relationship between Deep image prior and e

Computer Vision Lab. @ GIST 36 Dec 29, 2022
An implementation of the methods presented in Causal-BALD: Deep Bayesian Active Learning of Outcomes to Infer Treatment-Effects from Observational Data.

An implementation of the methods presented in Causal-BALD: Deep Bayesian Active Learning of Outcomes to Infer Treatment-Effects from Observational Data.

Andrew Jesson 9 Apr 04, 2022
Stable Neural ODE with Lyapunov-Stable Equilibrium Points for Defending Against Adversarial Attacks

Stable Neural ODE with Lyapunov-Stable Equilibrium Points for Defending Against Adversarial Attacks Stable Neural ODE with Lyapunov-Stable Equilibrium

Kang Qiyu 8 Dec 12, 2022
A hybrid framework (neural mass model + ML) for SC-to-FC prediction

The current workflow simulates brain functional connectivity (FC) from structural connectivity (SC) with a neural mass model. Gradient descent is applied to optimize the parameters in the neural mass

Yilin Liu 1 Jan 26, 2022
Brain Tumor Detection with Tensorflow Neural Networks.

Brain-Tumor-Detection A convolutional neural network model built with Tensorflow & Keras to detect brain tumor and its different variants. Data of the

404ErrorNotFound 5 Aug 23, 2022