Scale-aware Automatic Augmentation for Object Detection (CVPR 2021)

Last update: Dec 29, 2022

Overview

SA-AutoAug

Scale-aware Automatic Augmentation for Object Detection

Yukang Chen, Yanwei Li, Tao Kong, Lu Qi, Ruihang Chu, Lei Li, Jiaya Jia

[Paper] [BibTeX]

This project provides the implementation of the CVPR 2021 paper "Scale-aware Automatic Augmentation for Object Detection". Scale-aware AutoAug introduces a new search space and a new search metric for finding effective data augmentation policies for object detection. It is implemented on maskrcnn-benchmark and FCOS, and both the search and training code have been released. To make the searched policies easier to adopt, we have also re-implemented the training code on Detectron2.

Installation

For the maskrcnn-benchmark code, please follow the instructions in INSTALL.md.

For the FCOS code, please follow the instructions in INSTALL.md.

For the Detectron2 code, please follow the instructions in INSTALL.md.

Search

(You can skip this step and directly train on our searched policies.)

To search with 8 GPUs, run:

cd /path/to/SA-AutoAug/maskrcnn-benchmark
export NGPUS=8
python3 -m torch.distributed.launch --nproc_per_node=$NGPUS tools/search.py --config-file configs/SA_AutoAug/retinanet_R-50-FPN_search.yaml OUTPUT_DIR /path/to/searchlog_dir

Because the search fine-tunes an existing baseline model, a baseline checkpoint is required. You can download this model for the search, or use a RetinaNet baseline model that you trained yourself.
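If your baseline checkpoint lives at a custom path, one option is to pass it as a trailing KEY VALUE override, the same way OUTPUT_DIR is passed above. The sketch below only prints the resulting command so it can be inspected first. It assumes the search config reads the baseline from MODEL.WEIGHT, which is the key upstream maskrcnn-benchmark uses; please check the YAML before relying on it. build_search_cmd is a hypothetical helper and is not part of this repository.

```shell
#!/bin/sh
# Sketch: assemble the search command with an explicit baseline checkpoint.
# ASSUMPTION: the search config takes the baseline from MODEL.WEIGHT (the key
# used by upstream maskrcnn-benchmark). Verify this in the YAML file.
# build_search_cmd is a hypothetical helper, not part of SA-AutoAug.

build_search_cmd() {
  # $1 = number of GPUs, $2 = baseline checkpoint, $3 = output directory
  printf 'python3 -m torch.distributed.launch --nproc_per_node=%s ' "$1"
  printf 'tools/search.py --config-file %s ' "configs/SA_AutoAug/retinanet_R-50-FPN_search.yaml"
  printf 'MODEL.WEIGHT %s OUTPUT_DIR %s\n' "$2" "$3"
}

# Print the command; run the printed line from SA-AutoAug/maskrcnn-benchmark.
build_search_cmd 8 /path/to/baseline.pth /path/to/searchlog_dir
```

Printing rather than executing keeps the helper safe to try outside the repository; once the output looks right, it can be pasted into a shell or piped to sh.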

Training

To train with the searched policies on maskrcnn-benchmark (FCOS), run:

cd /path/to/SA-AutoAug/maskrcnn-benchmark
export NGPUS=8
python3 -m torch.distributed.launch --nproc_per_node=$NGPUS tools/train_net.py --config-file configs/SA_AutoAug/CONFIG_FILE  OUTPUT_DIR /path/to/traininglog_dir

For example, to train the RetinaNet ResNet-50 model with our searched data augmentation policies on the 6x schedule:

cd /path/to/SA-AutoAug/maskrcnn-benchmark
export NGPUS=8
python3 -m torch.distributed.launch --nproc_per_node=$NGPUS tools/train_net.py --config-file configs/SA_AutoAug/retinanet_R-50-FPN_6x.yaml  OUTPUT_DIR models/retinanet_R-50-FPN_6x_SAAutoAug
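To queue several training runs, the command above can be generated once per config file. The following is a minimal sketch under a few assumptions: it globs whatever YAML files exist in the config directory, skips files ending in _search.yaml (those belong to tools/search.py), and derives each output directory from the config name using the models/<name>_SAAutoAug pattern of the example above. print_train_cmds is a hypothetical helper, not part of this repository, and it only prints the commands.

```shell
#!/bin/sh
# Sketch: print one training command per YAML config in a directory.
# print_train_cmds is a hypothetical helper, not part of SA-AutoAug.
# The output directory naming mirrors the README example and is an assumption.

print_train_cmds() {
  # $1 = directory containing training configs, $2 = number of GPUs
  for cfg in "$1"/*.yaml; do
    [ -e "$cfg" ] || continue                      # glob matched nothing
    case "$cfg" in *_search.yaml) continue ;; esac # search configs are not trained
    name=$(basename "$cfg" .yaml)
    printf 'python3 -m torch.distributed.launch --nproc_per_node=%s ' "$2"
    printf 'tools/train_net.py --config-file %s ' "$cfg"
    printf 'OUTPUT_DIR models/%s_SAAutoAug\n' "$name"
  done
}

# Run from SA-AutoAug/maskrcnn-benchmark; prints nothing if the directory is absent.
print_train_cmds configs/SA_AutoAug 8
```

Each printed line is a complete command, so the output can be reviewed, trimmed to the models you want, and then executed line by line.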

To train with the searched policies on Detectron2, run:

cd /path/to/SA-AutoAug/detectron2
python3 ./tools/train_net.py --num-gpus 8 --config-file ./configs/COCO-Detection/SA_AutoAug/CONFIG_FILE OUTPUT_DIR /path/to/traininglog_dir

For example, to train the RetinaNet ResNet-50 model with our searched data augmentation policies on the 6x schedule:

cd /path/to/SA-AutoAug/detectron2
python3 ./tools/train_net.py --num-gpus 8 --config-file ./configs/COCO-Detection/SA_AutoAug/retinanet_R_50_FPN_6x.yaml OUTPUT_DIR output_retinanet_R_50_FPN_6x_SAAutoAug

Results

We report results on the COCO val2017 set and provide the corresponding pretrained models.

Based on maskrcnn-benchmark

Method	Backbone	AP_bbox	Download
Faster R-CNN	ResNet-50	41.8	Model
Faster R-CNN	ResNet-101	44.2	Model
RetinaNet	ResNet-50	41.4	Model
RetinaNet	ResNet-101	42.8	Model
Mask R-CNN	ResNet-50	42.8	Model
Mask R-CNN	ResNet-101	45.3	Model

Based on FCOS

Method	Backbone	AP_bbox	Download
FCOS	ResNet-50	42.6	Model
FCOS	ResNet-101	44.0	Model
ATSS	ResNeXt-101-32x8d-dcnv2	48.5	Model
ATSS	ResNeXt-101-32x8d-dcnv2 (1200 size)	49.6	Model

Based on Detectron2

Method	Backbone	AP_bbox	Download
Faster R-CNN	ResNet-50	41.9	Model - Metrics
Faster R-CNN	ResNet-101	44.2	Model - Metrics
RetinaNet	ResNet-50	40.8	Model - Metrics
RetinaNet	ResNet-101	43.1	Model - Metrics
Mask R-CNN	ResNet-50	42.9	Model - Metrics
Mask R-CNN	ResNet-101	45.6	Model - Metrics
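
To check a downloaded Detectron2 model against the numbers above, an evaluation-only run is one option. The sketch below prints such a command. It assumes the train_net.py in this repository keeps the --eval-only flag and the MODEL.WEIGHTS override of upstream Detectron2's tools/train_net.py; that has not been verified against this codebase. build_eval_cmd is a hypothetical helper, and /path/to/downloaded_model.pth is a placeholder.

```shell
#!/bin/sh
# Sketch: assemble an evaluation-only command for a downloaded checkpoint.
# ASSUMPTION: --eval-only and MODEL.WEIGHTS behave as in upstream Detectron2.
# build_eval_cmd is a hypothetical helper, not part of SA-AutoAug.

build_eval_cmd() {
  # $1 = config file, $2 = checkpoint path, $3 = number of GPUs
  printf 'python3 ./tools/train_net.py --num-gpus %s --config-file %s --eval-only ' "$3" "$1"
  printf 'MODEL.WEIGHTS %s\n' "$2"
}

# Print the command; run the printed line from SA-AutoAug/detectron2.
build_eval_cmd ./configs/COCO-Detection/SA_AutoAug/retinanet_R_50_FPN_6x.yaml \
  /path/to/downloaded_model.pth 8
```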

Citing SA-AutoAug

Please consider citing SA-AutoAug in your publications if it helps your research.

@inproceedings{saautoaug,
  title={Scale-aware Automatic Augmentation for Object Detection},
  author={Yukang Chen and Yanwei Li and Tao Kong and Lu Qi and Ruihang Chu and Lei Li and Jiaya Jia},
  booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2021}
}

Acknowledgments

The training code of this project is built on maskrcnn-benchmark, Detectron2, FCOS, and ATSS. The search code is modified from DetNAS. Some augmentation code and settings follow AutoAug-Det. We thank the authors of these projects.

Note that:

(1) We also provide script files for search and training in maskrcnn-benchmark, FCOS, and Detectron2.

(2) Issues and pull requests on this project are welcome. In addition, if you run into problems when applying the augmentations to other datasets or codebases, feel free to contact Yukang Chen ([email protected]).


Owner

DV Lab
