Official implementation for CVPR 2021 paper: Adaptive Class Suppression Loss for Long-Tail Object Detection

Related tags

Deep LearningACSL
Overview

Adaptive Class Suppression Loss for Long-Tail Object Detection

This repo is the official implementation for CVPR 2021 paper: Adaptive Class Suppression Loss for Long-Tail Object Detection. [Paper]

Framework

Requirements

1. Environment:

The requirements are exactly the same as BalancedGroupSoftmax. We tested on the following settings:

  • python 3.7
  • cuda 10.0
  • pytorch 1.2.0
  • torchvision 0.4.0
  • mmcv 0.2.14
conda create -n mmdet python=3.7 -y
conda activate mmdet

pip install cython
pip install numpy
pip install torch
pip install torchvision
pip install pycocotools
pip install matplotlib
pip install terminaltables

# download the source code of mmcv 0.2.14 from https://github.com/open-mmlab/mmcv/tree/v0.2.14
cd mmcv-0.2.14
pip install -v -e .
cd ../

git clone https://github.com/CASIA-IVA-Lab/ACSL.git

cd ACSL/lvis-api/
python setup.py develop

cd ../
python setup.py develop

2. Data:

a. For dataset images:

# Make sure you are in dir ACSL

mkdir data
cd data
mkdir lvis
mkdir pretrained_models
mkdir download_models
  • If you already have COCO2017 dataset, it will be great. Link train2017 and val2017 folders under folder lvis.
  • If you do not have COCO2017 dataset, please download: COCO train set and COCO val set and unzip these files and mv them under folder lvis.

b. For dataset annotations:

c. For pretrained models:

Download the corresponding pre-trained models below.

  • To train baseline models, we need models trained on COCO to initialize. Please download the corresponding COCO models at mmdetection model zoo.

  • Move these model files to ./data/pretrained_models/

d. For download_models:

Download the trained baseline models and ACSL models from BaiduYun, code is 2jp3

  • To train ACSL models, we need corresponding baseline models trained on LVIS to initialize and fix all parameters except for the last FC layer.

  • Move these model files to ./data/download_models/

After all these operations, the folder data should be like this:

    data
    ├── lvis
    │   ├── lvis_v0.5_train.json
    │   ├── lvis_v0.5_val.json
    │   ├── train2017
    │   │   ├── 000000100582.jpg
    │   │   ├── 000000102411.jpg
    │   │   ├── ......
    │   └── val2017
    │       ├── 000000062808.jpg
    │       ├── 000000119038.jpg
    │       ├── ......
    └── pretrained_models
    │       ├── faster_rcnn_r50_fpn_2x_20181010-443129e1.pth
    │       ├── ......
    └── download_models
            ├── R50-baseline.pth
            ├── ......

Training

Note: Please make sure that you have prepared the pretrained_models and the download_models and they have been put to the path specified in ${CONIFG_FILE}.

Use the following commands to train a model.

# Single GPU
python tools/train.py ${CONFIG_FILE}

# Multi GPU distributed training
./tools/dist_train.sh ${CONFIG_FILE} ${GPU_NUM} [optional arguments]

All config files are under ./configs/.

  • ./configs/baselines: all baseline models.
  • ./configs/acsl: models for ACSL models.

For example, to train a ACSL model with Faster R-CNN R50-FPN:

# Single GPU
python tools/train.py configs/acsl/faster_rcnn_r50_fpn_1x_lvis_tunefc_acsl.py

# Multi GPU distributed training (for 8 gpus)
./tools/dist_train.sh configs/acsl/faster_rcnn_r50_fpn_1x_lvis_tunefc_acsl.py 8

Important: The default learning rate in config files is for 8 GPUs and 2 img/gpu (batch size = 8*2 = 16). According to the Linear Scaling Rule, you need to set the learning rate proportional to the batch size if you use different GPUs or images per GPU, e.g., lr=0.01 for 4 GPUs * 2 img/gpu and lr=0.08 for 16 GPUs * 4 img/gpu. (Cited from mmdetection.)

Testing

Use the following commands to test a trained model.

# single gpu test
python tools/test_lvis.py \
 ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}]

# multi-gpu testing
./tools/dist_test_lvis.sh \
 ${CONFIG_FILE} ${CHECKPOINT_FILE} ${GPU_NUM} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}]
  • $RESULT_FILE: Filename of the output results in pickle format. If not specified, the results will not be saved to a file.
  • $EVAL_METRICS: Items to be evaluated on the results. bbox for bounding box evaluation only. bbox segm for bounding box and mask evaluation.

For example (assume that you have finished the training of ACSL models.):

  • To evaluate the trained ACSL model with Faster R-CNN R50-FPN for object detection:
# single-gpu testing
python tools/test_lvis.py configs/acsl/faster_rcnn_r50_fpn_1x_lvis_tunefc_acsl.py \
 ./work_dirs/acsl/faster_rcnn_r50_fpn_1x_lvis_tunefc_acsl/epoch_12.pth \
  --out acsl_val_result.pkl --eval bbox

# multi-gpu testing (8 gpus)
./tools/dist_test_lvis.sh configs/acsl/faster_rcnn_r50_fpn_1x_lvis_tunefc_acsl.py \
./work_dirs/acsl/faster_rcnn_r50_fpn_1x_lvis_tunefc_acsl/epoch_12.pth 8 \
--out acsl_val_result.pkl --eval bbox

Results and models

Please refer to our paper for more details.

Method Models bbox mAP Config file Pretrained Model Model
baseline R50-FPN 21.18 file COCO-R50 R50-baseline
ACSL R50-FPN 26.36 file R50-baseline R50-acsl
baseline R101-FPN 22.36 file COCO-R101 R101-baseline
ACSL R101-FPN 27.49 file R101-baseline R101-acsl
baseline X101-FPN 24.70 file COCO-X101 X101-baseline
ACSL X101-FPN 28.93 file X101-baseline X101-acsl
baseline Cascade-R101 25.14 file COCO-Cas-R101 Cas-R101-baseline
ACSL Cascade-R101 29.71 file Cas-R101-baseline Cas-R101-acsl
baseline Cascade-X101 27.14 file COCO-Cas-X101 Cas-X101-baseline
ACSL Cascade-X101 31.47 file Cas-X101-baseline Cas-X101-acsl

Important: The code of BaiduYun is 2jp3

Citation

@inproceedings{wang2021adaptive,
  title={Adaptive Class Suppression Loss for Long-Tail Object Detection},
  author={Wang, Tong and Zhu, Yousong and Zhao, Chaoyang and Zeng, Wei and Wang, Jinqiao and Tang, Ming},
  journal={CVPR},
  year={2021}
}

Credit

This code is largely based on BalancedGroupSoftmax and mmdetection v1.0.rc0 and LVIS API.

Owner
CASIA-IVA-Lab
Image & Video Analysis Group, Institute of Automation, Chinese Academy of Sciences
CASIA-IVA-Lab
The all new way to turn your boring vector meshes into the new fad in town; Voxels!

Voxelator The all new way to turn your boring vector meshes into the new fad in town; Voxels! Notes: I have not tested this on a rotated mesh. With fu

6 Feb 03, 2022
robomimic: A Modular Framework for Robot Learning from Demonstration

robomimic [Homepage]   [Documentation]   [Study Paper]   [Study Website]   [ARISE Initiative] Latest Updates [08/09/2021] v0.1.0: Initial code and pap

ARISE Initiative 178 Jan 05, 2023
Image process framework based on plugin like imagej, it is esay to glue with scipy.ndimage, scikit-image, opencv, simpleitk, mayavi...and any libraries based on numpy

Introduction ImagePy is an open source image processing framework written in Python. Its UI interface, image data structure and table data structure a

ImagePy 1.2k Dec 29, 2022
基于YoloX目标检测+DeepSort算法实现多目标追踪Baseline

项目简介: 使用YOLOX+Deepsort实现车辆行人追踪和计数,代码封装成一个Detector类,更容易嵌入到自己的项目中。 代码地址(欢迎star): https://github.com/Sharpiless/yolox-deepsort/ 最终效果: 运行demo: python demo

114 Dec 30, 2022
Pacman-AI - AI project designed by UC Berkeley. Designed reflex and minimax agents for the game Pacman.

Pacman AI Jussi Doherty CAP 4601 - Introduction to Artificial Intelligence - Fall 2020 Python version 3.0+ Source of this project This repo contains a

Jussi Doherty 1 Jan 03, 2022
Code for our EMNLP 2021 paper “Heterogeneous Graph Neural Networks for Keyphrase Generation”

GATER This repository contains the code for our EMNLP 2021 paper “Heterogeneous Graph Neural Networks for Keyphrase Generation”. Our implementation is

Jiacheng Ye 12 Nov 24, 2022
PyTorch implementation of Trust Region Policy Optimization

PyTorch implementation of TRPO Try my implementation of PPO (aka newer better variant of TRPO), unless you need to you TRPO for some specific reasons.

Ilya Kostrikov 366 Nov 15, 2022
Invertible conditional GANs for image editing

Invertible Conditional GANs This is the implementation of the IcGAN model proposed in our paper: Invertible Conditional GANs for image editing. Novemb

Guim 278 Dec 12, 2022
[ICCV' 21] "Unsupervised Point Cloud Pre-training via Occlusion Completion"

OcCo: Unsupervised Point Cloud Pre-training via Occlusion Completion This repository is the official implementation of paper: "Unsupervised Point Clou

Hanchen 204 Dec 24, 2022
The code for our paper CrossFormer: A Versatile Vision Transformer Based on Cross-scale Attention.

CrossFormer This repository is the code for our paper CrossFormer: A Versatile Vision Transformer Based on Cross-scale Attention. Introduction Existin

cheerss 238 Jan 06, 2023
A Comprehensive Study on Learning-Based PE Malware Family Classification Methods

A Comprehensive Study on Learning-Based PE Malware Family Classification Methods Datasets Because of copyright issues, both the MalwareBazaar dataset

8 Oct 21, 2022
Python-based Informatics Kit for Analysing Chemical Units

INSTALLATION Python-based Informatics Kit for the Analysis of Chemical Units Step 1: Make a conda environment: conda create -n pikachu python=3.9 cond

47 Dec 23, 2022
[2021][ICCV][FSNet] Full-Duplex Strategy for Video Object Segmentation

Full-Duplex Strategy for Video Object Segmentation (ICCV, 2021) Authors: Ge-Peng Ji, Keren Fu, Zhe Wu, Deng-Ping Fan*, Jianbing Shen, & Ling Shao This

Daniel-Ji 55 Dec 22, 2022
Code and data (Incidents Dataset) for ECCV 2020 Paper "Detecting natural disasters, damage, and incidents in the wild".

Incidents Dataset See the following pages for more details: Project page: IncidentsDataset.csail.mit.edu. ECCV 2020 Paper "Detecting natural disasters

Ethan Weber 67 Dec 27, 2022
Stacked Hourglass Network with a Multi-level Attention Mechanism: Where to Look for Intervertebral Disc Labeling

⚠️ ‎‎‎ A more recent and actively-maintained version of this code is available in ivadomed Stacked Hourglass Network with a Multi-level Attention Mech

Reza Azad 14 Oct 24, 2022
NLP made easy

GluonNLP: Your Choice of Deep Learning for NLP GluonNLP is a toolkit that helps you solve NLP problems. It provides easy-to-use tools that helps you l

Distributed (Deep) Machine Learning Community 2.5k Jan 04, 2023
The codes and related files to reproduce the results for Image Similarity Challenge Track 2.

ISC-Track2-Submission The codes and related files to reproduce the results for Image Similarity Challenge Track 2. Required dependencies To begin with

Wenhao Wang 89 Jan 02, 2023
Official PyTorch Implementation of HELP: Hardware-adaptive Efficient Latency Prediction for NAS via Meta-Learning (NeurIPS 2021 Spotlight)

[NeurIPS 2021 Spotlight] HELP: Hardware-adaptive Efficient Latency Prediction for NAS via Meta-Learning [Paper] This is Official PyTorch implementatio

42 Nov 01, 2022
ONNX-GLPDepth - Python scripts for performing monocular depth estimation using the GLPDepth model in ONNX

ONNX-GLPDepth - Python scripts for performing monocular depth estimation using the GLPDepth model in ONNX

Ibai Gorordo 18 Nov 06, 2022
Pytorch implementation of various High Dynamic Range (HDR) Imaging algorithms

Deep High Dynamic Range Imaging Benchmark This repository is the pytorch impleme

Tianhong Dai 5 Nov 16, 2022