SWA Object Detection

Overview

SWA Object Detection

This project hosts the scripts for training SWA object detectors, as presented in our paper:

@article{zhang2020swa,
  title={SWA Object Detection},
  author={Zhang, Haoyang and Wang, Ying and Dayoub, Feras and S{\"u}nderhauf, Niko},
  journal={arXiv preprint arXiv:2012.12645},
  year={2020}
}

The full paper is available at: https://arxiv.org/abs/2012.12645.

Introduction

Do you want to improve 1.0 AP for your object detector without any inference cost and any change to your detector? Let us tell you such a recipe. It is surprisingly simple: train your detector for an extra 12 epochs using cyclical learning rates and then average these 12 checkpoints as your final detection model. This potent recipe is inspired by Stochastic Weights Averaging (SWA), which is proposed in [1] for improving generalization in deep neural networks. We found it also very effective in object detection. In this work, we systematically investigate the effects of applying SWA to object detection as well as instance segmentation. Through extensive experiments, we discover a good policy of performing SWA in object detection, and we consistently achieve ~1.0 AP improvement over various popular detectors on the challenging COCO benchmark. We hope this work will make more researchers in object detection know this technique and help them train better object detectors.

SWA Object Detection: averaging multiple detection models leads to a better one.

Updates

  • 2020.01.08 Reimplement the code and now it is more convenient, more flexible and easier to perform both the conventional training and SWA training. See Instructions.
  • 2020.01.07 Update to MMDetection v2.8.0.
  • 2020.12.24 Release the code.

Installation

  • This project is based on MMDetection. Therefore the installation is the same as original MMDetection.

  • Please check get_started.md for installation. Note that you should change the version of PyTorch and CUDA to yours when installing mmcv in step 3 and clone this repo instead of MMdetection in step 4.

  • If you run into problems with pycocotools, please install it by:

    pip install "git+https://github.com/open-mmlab/cocoapi.git#subdirectory=pycocotools"
    

Usage of MMDetection

MMDetection provides colab tutorial, and full guidance for quick run with existing dataset and with new dataset for beginners. There are also tutorials for finetuning models, adding new dataset, designing data pipeline, customizing models, customizing runtime settings and useful tools.

Please refer to FAQ for frequently asked questions.

Instructions

We add a SWA training phase to the object detector training process, implement a SWA hook that helps process averaged models, and write a SWA config for conveniently deploying SWA training in training various detectors. We also provide many config files for reproducing the results in the paper.

By including the SWA config in detector config files and setting related parameters, you can have different SWA training modes.

  1. Two-pahse mode. In this mode, the training will begin with the traditional training phase, and it continues for epochs. After that, SWA training will start, with loading the best model on the validation from the previous training phase (becasue swa_load_from = 'best_bbox_mAP.pth'in the SWA config).

    As shown in swa_vfnet_r50 config, the SWA config is included at line 4 and only the SWA optimizer is reset at line 118 in this script. Note that configuring parameters in local scripts will overwrite those values inherited from the SWA config.

    You can change those parameters that are included in the SWA config to use different optimizers or different learning rate schedules for the SWA training. For example, to use a different initial learning rate, say 0.02, you just need to set swa_optimizer = dict(type='SGD', lr=0.02, momentum=0.9, weight_decay=0.0001) in the SWA config (global effect) or in the swa_vfnet_r50 config (local effect).

    To start the training, run:

    ./tools/dist_train.sh configs/swa/swa_vfnet_r50_fpn_1x_coco.py 8
    
    
  2. Only-SWA mode. In this mode, the traditional training is skipped and only the SWA training is performed. In general, this mode should work with a pre-trained detection model which you can download from the MMDetection model zoo.

    Have a look at the swa_mask_rcnn_r101 config. By setting only_swa_training = True and swa_load_from = mask_rcnn_pretraind_model, this script conducts only SWA training, starting from a pre-trained detection model. To start the training, run:

    ./tools/dist_train.sh configs/swa/swa_mask_rcnn_r101_fpn_2x_coco.py 8
    
    

In both modes, we have implemented the validation stage and saving functions for the SWA model. Thus, it would be easy to monitor the performance and select the best SWA model.

Results and Models

For your convenience, we provide the following SWA models. These models are obtained by averaging checkpoints that are trained with cyclical learning rates for 12 epochs.

Model bbox AP (val) segm AP (val)     Download    
SWA-MaskRCNN-R50-1x-0.02-0.0002-38.2-34.7 39.1, +0.9 35.5, +0.8 model | config
SWA-MaskRCNN-R101-1x-0.02-0.0002-40.0-36.1 41.0, +1.0 37.0, +0.9 model | config
SWA-MaskRCNN-R101-2x-0.02-0.0002-40.8-36.6 41.7, +0.9 37.4, +0.8 model | config
SWA-FasterRCNN-R50-1x-0.02-0.0002-37.4 38.4, +1.0 - model | config
SWA-FasterRCNN-R101-1x-0.02-0.0002-39.4 40.3, +0.9 - model | config
SWA-FasterRCNN-R101-2x-0.02-0.0002-39.8 40.7, +0.9 - model | config
SWA-RetinaNet-R50-1x-0.01-0.0001-36.5 37.8, +1.3 - model | config
SWA-RetinaNet-R101-1x-0.01-0.0001-38.5 39.7, +1.2 - model | config
SWA-RetinaNet-R101-2x-0.01-0.0001-38.9 40.0, +1.1 - model | config
SWA-FCOS-R50-1x-0.01-0.0001-36.6 38.0, +1.4 - model | config
SWA-FCOS-R101-1x-0.01-0.0001-39.2 40.3, +1.1 - model | config
SWA-FCOS-R101-2x-0.01-0.0001-39.1 40.2, +1.1 - model | config
SWA-YOLOv3(320)-D53-273e-0.001-0.00001-27.9 28.7, +0.8 - model | config
SWA-YOLOv3(680)-D53-273e-0.001-0.00001-33.4 34.2, +0.8 - model | config
SWA-VFNet-R50-1x-0.01-0.0001-41.6 42.8, +1.2 - model | config
SWA-VFNet-R101-1x-0.01-0.0001-43.0 44.3, +1.3 - model | config
SWA-VFNet-R101-2x-0.01-0.0001-43.5 44.5, +1.0 - model | config

Notes:

  • SWA-MaskRCNN-R50-1x-0.02-0.0002-38.2-34.7 means this SWA model is produced based on the pre-trained Mask RCNN model that has a ResNet50 backbone, is trained under 1x schedule with the initial learning rate 0.02 and ending learning rate 0.0002, and achieves 38.2 bbox AP and 34.7 mask AP on the COCO val2017 respectively. This SWA model acheives 39.1 bbox AP and 35.5 mask AP, which are higher than the pre-trained model by 0.9 bbox AP and 0.8 mask AP respectively. This rule applies to other object detectors.

  • In addition to these baseline detectors, SWA can also improve more powerful detectors. One example is VFNetX whose performance on the COCO val2017 is improved from 52.2 AP to 53.4 AP (+1.2 AP).

  • More detailed results including AP50 and AP75 can be found here.

Contributing

Any pull requests or issues are welcome.

Citation

Please consider citing our paper in your publications if the project helps your research. BibTeX reference is as follows:

@article{zhang2020swa,
  title={SWA Object Detection},
  author={Zhang, Haoyang and Wang, Ying and Dayoub, Feras and S{\"u}nderhauf, Niko},
  journal={arXiv preprint arXiv:2012.12645},
  year={2020}
}

Acknowledgment

Many thanks to Dr Marlies Hankel and MASSIVE HPC for supporting precious GPU computation resources!

We also would like to thank MMDetection team for producing this great object detection toolbox.

License

This project is released under the Apache 2.0 license.

References

[1] Averaging Weights Leads to Wider Optima and Better Generalization; Pavel Izmailov, Dmitry Podoprikhin, Timur Garipov, Dmitry Vetrov, Andrew Gordon Wilson; Uncertainty in Artificial Intelligence (UAI), 2018

[NeurIPS 2021] Source code for the paper "Qu-ANTI-zation: Exploiting Neural Network Quantization for Achieving Adversarial Outcomes"

Qu-ANTI-zation This repository contains the code for reproducing the results of our paper: Qu-ANTI-zation: Exploiting Quantization Artifacts for Achie

Secure AI Systems Lab 8 Mar 26, 2022
InsCLR: Improving Instance Retrieval with Self-Supervision

InsCLR: Improving Instance Retrieval with Self-Supervision This is an official PyTorch implementation of the InsCLR paper. Download Dataset Dataset Im

Zelu Deng 25 Aug 30, 2022
Source code of CIKM2021 Long Paper "PSSL: Self-supervised Learning for Personalized Search with Contrastive Sampling".

PSSL Source code of CIKM2021 Long Paper "PSSL: Self-supervised Learning for Personalized Search with Contrastive Sampling". It consists of the pre-tra

2 Dec 21, 2021
Tf alloc - Simplication of GPU allocation for Tensorflow2

tf_alloc Simpliying GPU allocation for Tensorflow Developer: korkite (Junseo Ko)

Junseo Ko 3 Feb 10, 2022
Face recognize and crop them

Face Recognize Cropping Module Source 아이디어 Face Alignment with OpenCV and Python Requirement 필요 라이브러리 imutil dlib python-opence (cv2) Usage 사용 방법 open

Cho Moon Gi 1 Feb 15, 2022
Naszilla is a Python library for neural architecture search (NAS)

A repository to compare many popular NAS algorithms seamlessly across three popular benchmarks (NASBench 101, 201, and 301). You can implement your ow

270 Jan 03, 2023
Official source code of Fast Point Transformer, CVPR 2022

Fast Point Transformer Project Page | Paper This repository contains the official source code and data for our paper: Fast Point Transformer Chunghyun

182 Dec 23, 2022
Training Structured Neural Networks Through Manifold Identification and Variance Reduction

Training Structured Neural Networks Through Manifold Identification and Variance Reduction This repository is a pytorch implementation of the Regulari

0 Dec 23, 2021
UNAVOIDS: Unsupervised and Nonparametric Approach for Visualizing Outliers and Invariant Detection Scoring

UNAVOIDS: Unsupervised and Nonparametric Approach for Visualizing Outliers and Invariant Detection Scoring Code Summary aggregate.py: this script aggr

1 Dec 28, 2021
A PyTorch implementation of "Multi-Scale Contrastive Siamese Networks for Self-Supervised Graph Representation Learning", IJCAI-21

MERIT A PyTorch implementation of our IJCAI-21 paper Multi-Scale Contrastive Siamese Networks for Self-Supervised Graph Representation Learning. Depen

Graph Analysis & Deep Learning Laboratory, GRAND 32 Jan 02, 2023
Strongly local p-norm-cut algorithms for semi-supervised learning and local graph clustering

Strongly local p-norm-cut algorithms for semi-supervised learning and local graph clustering

Meng Liu 2 Jul 19, 2022
This is RFA-Toolbox, a simple and easy-to-use library that allows you to optimize your neural network architectures using receptive field analysis (RFA) and create graph visualizations of your architecture.

ReceptiveFieldAnalysisToolbox This is RFA-Toolbox, a simple and easy-to-use library that allows you to optimize your neural network architectures usin

84 Nov 23, 2022
Code for "Diversity can be Transferred: Output Diversification for White- and Black-box Attacks"

Output Diversified Sampling (ODS) This is the github repository for the NeurIPS 2020 paper "Diversity can be Transferred: Output Diversification for W

50 Dec 11, 2022
《Truly shift-invariant convolutional neural networks》(2021)

Truly shift-invariant convolutional neural networks [Paper] Authors: Anadi Chaman and Ivan Dokmanić Convolutional neural networks were always assumed

Anadi Chaman 46 Dec 19, 2022
Python package for visualizing the loss landscape of parameterized quantum algorithms.

orqviz A Python package for easily visualizing the loss landscape of Variational Quantum Algorithms by Zapata Computing Inc. orqviz provides a collect

Zapata Computing, Inc. 75 Dec 30, 2022
Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models.

WECHSEL Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models. arXiv: https://arx

Institute of Computational Perception 45 Dec 29, 2022
Malmo Collaborative AI Challenge - Team Pig Catcher

The Malmo Collaborative AI Challenge - Team Pig Catcher Approach The challenge involves 2 agents who can either cooperate or defect. The optimal polic

Kai Arulkumaran 66 Jun 29, 2022
This project is based on RIFE and aims to make RIFE more practical for users by adding various features and design new models

CPM 项目描述 CPM(Chinese Pretrained Models)模型是北京智源人工智能研究院和清华大学发布的中文大规模预训练模型。官方发布了三种规模的模型,参数量分别为109M、334M、2.6B,用户需申请与通过审核,方可下载。 由于原项目需要考虑大模型的训练和使用,需要安装较为复杂

hzwer 190 Jan 08, 2023
Code and Resources for the Transformer Encoder Reasoning Network (TERN)

Transformer Encoder Reasoning Network Code for the cross-modal visual-linguistic retrieval method from "Transformer Reasoning Network for Image-Text M

Nicola Messina 53 Dec 30, 2022
Google Brain - Ventilator Pressure Prediction

Google Brain - Ventilator Pressure Prediction https://www.kaggle.com/c/ventilator-pressure-prediction The ventilator data used in this competition was

Samuele Cucchi 1 Feb 11, 2022