Official implementation for CVPR 2021 paper: Adaptive Class Suppression Loss for Long-Tail Object Detection

Adaptive Class Suppression Loss for Long-Tail Object Detection

This repo is the official implementation for CVPR 2021 paper: Adaptive Class Suppression Loss for Long-Tail Object Detection. [Paper]

Framework

Requirements

1. Environment:

The requirements are exactly the same as BalancedGroupSoftmax. We tested on the following settings:

  • python 3.7
  • cuda 10.0
  • pytorch 1.2.0
  • torchvision 0.4.0
  • mmcv 0.2.14

conda create -n mmdet python=3.7 -y
conda activate mmdet

pip install cython
pip install numpy
# pin the versions listed in the tested settings above
pip install torch==1.2.0
pip install torchvision==0.4.0
pip install pycocotools
pip install matplotlib
pip install terminaltables

# download the source code of mmcv 0.2.14 from https://github.com/open-mmlab/mmcv/tree/v0.2.14
cd mmcv-0.2.14
pip install -v -e .
cd ../

git clone https://github.com/CASIA-IVA-Lab/ACSL.git

cd ACSL/lvis-api/
python setup.py develop

cd ../
python setup.py develop
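
After installation, it may help to confirm that the installed packages match the tested settings listed above. The following is a minimal sketch, not part of the repository: the helper names `find_mismatches` and `collect_installed` are hypothetical, and the check assumes each package exposes a `__version__` attribute.

```python
# Sketch: compare installed package versions against the tested settings.
EXPECTED = {
    "torch": "1.2.0",
    "torchvision": "0.4.0",
    "mmcv": "0.2.14",
}


def find_mismatches(installed, expected=EXPECTED):
    """Return {name: (expected, found)} for packages whose version differs.

    `found` is None when the package is missing. Local build suffixes such
    as "+cu100" are ignored when comparing.
    """
    mismatches = {}
    for name, want in expected.items():
        found = installed.get(name)
        if found is None or found.split("+")[0] != want:
            mismatches[name] = (want, found)
    return mismatches


def collect_installed(names):
    """Import each package and read its __version__, skipping failed imports."""
    installed = {}
    for name in names:
        try:
            module = __import__(name)
        except Exception:
            continue
        installed[name] = getattr(module, "__version__", None)
    return installed


if __name__ == "__main__":
    problems = find_mismatches(collect_installed(EXPECTED))
    for name, (want, found) in problems.items():
        print(f"{name}: expected {want}, found {found}")
```

An empty result from `find_mismatches` means the three packages match the tested versions; other versions may still work but were not tested by the authors.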

2. Data:

a. For dataset images:

# Make sure you are in dir ACSL

mkdir data
cd data
mkdir lvis
mkdir pretrained_models
mkdir download_models

  • If you already have the COCO2017 dataset, link its train2017 and val2017 folders under the folder lvis.
  • Otherwise, download the COCO train set and COCO val set, unzip them, and move the resulting folders under the folder lvis.

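b. For dataset annotations:

Download the LVIS v0.5 training and validation annotations and place lvis_v0.5_train.json and lvis_v0.5_val.json under the folder lvis.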

c. For pretrained models:

Download the corresponding pre-trained models below.

  • Baseline models are initialized from models trained on COCO. Download the corresponding COCO models from the mmdetection model zoo.

  • Move these model files to ./data/pretrained_models/

d. For download_models:

Download the trained baseline models and ACSL models from BaiduYun (extraction code: 2jp3).

  • ACSL models are initialized from the corresponding baseline models trained on LVIS; all parameters except those of the last FC layer are then kept fixed.

  • Move these model files to ./data/download_models/
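
The idea of fixing every parameter except the last FC layer can be sketched as below. This is an illustration only, not the repository's code: the helper `freeze_except` is hypothetical, and the parameter-name prefix used in the usage note is an assumption about how the layer is named.

```python
def freeze_except(model, trainable_prefixes):
    """Disable gradients for every parameter whose name lacks a given prefix.

    `model` is any object exposing `named_parameters()` in the style of
    torch.nn.Module. Returns the names of the parameters left trainable.
    """
    prefixes = tuple(trainable_prefixes)
    trainable = []
    for name, param in model.named_parameters():
        # Only parameters under one of the prefixes keep receiving gradients.
        param.requires_grad = name.startswith(prefixes)
        if param.requires_grad:
            trainable.append(name)
    return trainable
```

With a PyTorch detector this might be called as `freeze_except(detector, ["bbox_head.fc_cls"])`, where `bbox_head.fc_cls` is an assumed name for the classification layer; the names reported by `named_parameters()` on the actual model are the reference.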

After all these operations, the folder data should be like this:

    data
    ├── lvis
    │   ├── lvis_v0.5_train.json
    │   ├── lvis_v0.5_val.json
    │   ├── train2017
    │   │   ├── 000000100582.jpg
    │   │   ├── 000000102411.jpg
    │   │   ├── ......
    │   └── val2017
    │       ├── 000000062808.jpg
    │       ├── 000000119038.jpg
    │       ├── ......
    ├── pretrained_models
    │   ├── faster_rcnn_r50_fpn_2x_20181010-443129e1.pth
    │   ├── ......
    └── download_models
        ├── R50-baseline.pth
        ├── ......
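
Before training, the layout above can be checked with a short script. This is a minimal sketch under the assumption that it is run from the ACSL directory; `missing_entries` is a hypothetical helper, and only the entries named in the tree are checked, not individual images or checkpoints.

```python
from pathlib import Path

# Entries taken from the directory tree above. Checkpoint file names vary by
# experiment, so only the two model folders are checked, not their contents.
EXPECTED_ENTRIES = [
    "lvis/lvis_v0.5_train.json",
    "lvis/lvis_v0.5_val.json",
    "lvis/train2017",
    "lvis/val2017",
    "pretrained_models",
    "download_models",
]


def missing_entries(data_root):
    """Return the expected entries that do not exist under `data_root`."""
    root = Path(data_root)
    return [entry for entry in EXPECTED_ENTRIES if not (root / entry).exists()]


if __name__ == "__main__":
    missing = missing_entries("data")
    print("data folder looks complete" if not missing else f"missing: {missing}")
```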

Training

Note: Please make sure that you have prepared the pretrained_models and the download_models and that they are placed at the paths specified in ${CONFIG_FILE}.

Use the following commands to train a model.

# Single GPU
python tools/train.py ${CONFIG_FILE}

# Multi GPU distributed training
./tools/dist_train.sh ${CONFIG_FILE} ${GPU_NUM} [optional arguments]

All config files are under ./configs/.

  • ./configs/baselines: all baseline models.
  • ./configs/acsl: all ACSL models.

For example, to train an ACSL model with Faster R-CNN R50-FPN:

# Single GPU
python tools/train.py configs/acsl/faster_rcnn_r50_fpn_1x_lvis_tunefc_acsl.py

# Multi GPU distributed training (for 8 gpus)
./tools/dist_train.sh configs/acsl/faster_rcnn_r50_fpn_1x_lvis_tunefc_acsl.py 8

Important: The default learning rate in the config files is for 8 GPUs and 2 img/gpu (batch size = 8*2 = 16). According to the Linear Scaling Rule, you need to set the learning rate proportional to the batch size if you use a different number of GPUs or images per GPU, e.g., lr=0.01 for 4 GPUs * 2 img/gpu and lr=0.08 for 16 GPUs * 4 img/gpu. (Cited from mmdetection.)
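
The Linear Scaling Rule amounts to one line of arithmetic, sketched below. The base learning rate of 0.02 for a batch size of 16 is an assumption inferred from the two examples above (0.01 at batch size 8, 0.08 at batch size 64); the value actually set in a given config file takes precedence. The function name `scaled_lr` is hypothetical.

```python
def scaled_lr(num_gpus, imgs_per_gpu, base_lr=0.02, base_batch_size=16):
    """Scale the learning rate linearly with the total batch size.

    base_lr and base_batch_size describe the reference setting
    (assumed here: lr=0.02 for 8 GPUs * 2 img/gpu).
    """
    batch_size = num_gpus * imgs_per_gpu
    return base_lr * batch_size / base_batch_size


if __name__ == "__main__":
    print(scaled_lr(4, 2))   # 4 GPUs * 2 img/gpu, batch size 8
    print(scaled_lr(16, 4))  # 16 GPUs * 4 img/gpu, batch size 64
```

Under these assumptions the two calls reproduce the values quoted above, 0.01 and 0.08.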

Testing

Use the following commands to test a trained model.

# single gpu test
python tools/test_lvis.py \
 ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}]

# multi-gpu testing
./tools/dist_test_lvis.sh \
 ${CONFIG_FILE} ${CHECKPOINT_FILE} ${GPU_NUM} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}]

  • ${RESULT_FILE}: Filename of the output results in pickle format. If not specified, the results are not saved to a file.
  • ${EVAL_METRICS}: Items to be evaluated on the results: bbox for bounding box evaluation only, or bbox segm for bounding box and mask evaluation.

For example, assuming that you have finished training the ACSL models, to evaluate the trained ACSL model with Faster R-CNN R50-FPN for object detection:

# single-gpu testing
python tools/test_lvis.py configs/acsl/faster_rcnn_r50_fpn_1x_lvis_tunefc_acsl.py \
 ./work_dirs/acsl/faster_rcnn_r50_fpn_1x_lvis_tunefc_acsl/epoch_12.pth \
  --out acsl_val_result.pkl --eval bbox

# multi-gpu testing (8 gpus)
./tools/dist_test_lvis.sh configs/acsl/faster_rcnn_r50_fpn_1x_lvis_tunefc_acsl.py \
./work_dirs/acsl/faster_rcnn_r50_fpn_1x_lvis_tunefc_acsl/epoch_12.pth 8 \
--out acsl_val_result.pkl --eval bbox
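
The file written by --out is a pickle and can be inspected from Python. The sketch below assumes a bbox-only result laid out as results[image][class] -> sequence of boxes, which is an assumption about the saved structure rather than something stated here; `count_detections` is a hypothetical helper. If the detections were saved as numpy arrays, numpy must be installed for the file to load.

```python
import pickle


def count_detections(result_file):
    """Return (number of images, total number of predicted boxes).

    Assumes the layout results[image][class] -> sequence of boxes,
    i.e. a bbox-only result file.
    """
    with open(result_file, "rb") as f:
        results = pickle.load(f)
    total = sum(len(boxes) for per_image in results for boxes in per_image)
    return len(results), total


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        images, boxes = count_detections(sys.argv[1])
        print(f"{images} images, {boxes} boxes")
```

Only load pickle files that you produced yourself or obtained from a trusted source, since unpickling can execute arbitrary code.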

Results and models

Please refer to our paper for more details.

    Method    Backbone      bbox mAP  Config file  Pretrained model   Trained model
    baseline  R50-FPN       21.18     file         COCO-R50           R50-baseline
    ACSL      R50-FPN       26.36     file         R50-baseline       R50-acsl
    baseline  R101-FPN      22.36     file         COCO-R101          R101-baseline
    ACSL      R101-FPN      27.49     file         R101-baseline      R101-acsl
    baseline  X101-FPN      24.70     file         COCO-X101          X101-baseline
    ACSL      X101-FPN      28.93     file         X101-baseline      X101-acsl
    baseline  Cascade-R101  25.14     file         COCO-Cas-R101      Cas-R101-baseline
    ACSL      Cascade-R101  29.71     file         Cas-R101-baseline  Cas-R101-acsl
    baseline  Cascade-X101  27.14     file         COCO-Cas-X101      Cas-X101-baseline
    ACSL      Cascade-X101  31.47     file         Cas-X101-baseline  Cas-X101-acsl

Important: The BaiduYun extraction code is 2jp3.

Citation

@inproceedings{wang2021adaptive,
  title={Adaptive Class Suppression Loss for Long-Tail Object Detection},
  author={Wang, Tong and Zhu, Yousong and Zhao, Chaoyang and Zeng, Wei and Wang, Jinqiao and Tang, Ming},
  booktitle={CVPR},
  year={2021}
}

Credit

This code is largely based on BalancedGroupSoftmax, mmdetection v1.0.rc0, and the LVIS API.

Owner
CASIA-IVA-Lab
Image & Video Analysis Group, Institute of Automation, Chinese Academy of Sciences