Exploring Classification Equilibrium in Long-Tailed Object Detection, ICCV2021

Overview

Exploring Classification Equilibrium in Long-Tailed Object Detection (LOCE, ICCV 2021)

Paper

Introduction

The conventional detectors tend to make imbalanced classification and suffer performance drop, when the distribution of the training data is severely skewed. In this paper, we propose to use the mean classification score to indicate the classification accuracy for each category during training. Based on this indicator, we balance the classification via an Equilibrium Loss (EBL) and a Memory-augmented Feature Sampling (MFS) method. Specifically, EBL increases the intensity of the adjustment of the decision boundary for the weak classes by a designed score-guided loss margin between any two classes. On the other hand, MFS improves the frequency and accuracy of the adjustments of the decision boundary for the weak classes through over-sampling the instance features of those classes. Therefore, EBL and MFS work collaboratively for finding the classification equilibrium in long-tailed detection, and dramatically improve the performance of tail classes while maintaining or even improving the performance of head classes. We conduct experiments on LVIS using Mask R-CNN with various backbones including ResNet-50-FPN and ResNet-101-FPN to show the superiority of the proposed method. It improves the detection performance of tail classes by 15.6 AP, and outperforms the most recent long-tailed object detectors by more than 1 AP.

Method overview

method overview

Memory-augmented Feature Sampling (MFS)

method overview

Prerequisites

  • MMDetection version 2.8.0.

  • Please see get_started.md for installation and the basic usage of MMDetection.

Train

# assume that you are under the root directory of this project,
# and you have activated your virtual environment if needed.
# and with LVIS v1.0 dataset in 'data/lvis_v1/'.
# use decoupled training pipeline:

# 1. train the model with Mask R-CNN
./tools/dist_train.sh configs/loce/mask_rcnn_r50_fpn_normed_mask_mstrain_2x_lvis_v1.py 8

# 2. fine-tune the model with LOCE
./tools/dist_train.sh configs/loce/loce_mask_rcnn_r50_fpn_normed_mask_mstrain_2x_lvis_v1.py 8

Inference

./tools/dist_test.sh configs/loce/loce_mask_rcnn_r50_fpn_normed_mask_mstrain_2x_lvis_v1.py work_dirs/loce_mask_rcnn_r50_fpn_normed_mask_mstrain_2x_lvis_v1/epoch_6.pth 8 --eval bbox segm

Models

For your convenience, we provide the following trained models (LOCE). All models are trained with 16 images in a mini-batch.

Model Dataset MS train box AP mask AP Pretrained Model LOCE
LOCE_R_50_FPN_2x LVIS v0.5 Yes 28.2 28.4 config / model config / model
LOCE_R_50_FPN_2x LVIS v1.0 Yes 27.4 26.6 config / model config / model
LOCE_R_101_FPN_2x LVIS v1.0 Yes 29.0 28.0 config / model config / model

[0] All results are obtained with a single model and without any test time data augmentation such as multi-scale, flipping and etc..
[1] Refer to more details in config files in config/loce/.

Acknowledgement

Thanks MMDetection team for the wonderful open source project!

Citation

If you find LOCE useful in your research, please consider citing:

@inproceedings{feng2021exploring,
    title={Exploring Classification Equilibrium in Long-Tailed Object Detection},
    author={Feng, Chengjian and Zhong, Yujie and Huang, Weilin},
    booktitle={ICCV},
    year={2021}
}
An Abstract Cyber Security Simulation and Markov Game for OpenAI Gym

gym-idsgame An Abstract Cyber Security Simulation and Markov Game for OpenAI Gym gym-idsgame is a reinforcement learning environment for simulating at

Kim Hammar 29 Dec 03, 2022
Differentiable Wavetable Synthesis

Differentiable Wavetable Synthesis

4 Feb 11, 2022
An unofficial implementation of "Unpaired Image Super-Resolution using Pseudo-Supervision." CVPR2020

UnpairedSR An unofficial implementation of "Unpaired Image Super-Resolution using Pseudo-Supervision." CVPR2020 turn RCAN(modified) -- xmodel(xilinx

JiaKui Hu 10 Oct 28, 2022
This repository contains a PyTorch implementation of "AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis".

AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis | Project Page | Paper | PyTorch implementation for the paper "AD-NeRF: Audio

551 Dec 29, 2022
Fluency ENhanced Sentence-bert Evaluation (FENSE), metric for audio caption evaluation. And Benchmark dataset AudioCaps-Eval, Clotho-Eval.

FENSE The metric, Fluency ENhanced Sentence-bert Evaluation (FENSE), for audio caption evaluation, proposed in the paper "Can Audio Captions Be Evalua

Zhiling Zhang 13 Dec 23, 2022
Run containerized, rootless applications with podman

Why? restrict scope of file system access run any application without root privileges creates usable "Desktop applications" to integrate into your nor

119 Dec 27, 2022
Nested Graph Neural Network (NGNN) is a general framework to improve a base GNN's expressive power and performance

Nested Graph Neural Networks About Nested Graph Neural Network (NGNN) is a general framework to improve a base GNN's expressive power and performance.

Muhan Zhang 38 Jan 05, 2023
This is the repository for The Machine Learning Workshops, published by AI DOJO

This is the repository for The Machine Learning Workshops, published by AI DOJO. It contains all the workshop's code with supporting project files necessary to work through the code.

AI Dojo 12 May 06, 2022
This is a Python wrapper for TA-LIB based on Cython instead of SWIG.

TA-Lib This is a Python wrapper for TA-LIB based on Cython instead of SWIG. From the homepage: TA-Lib is widely used by trading software developers re

John Benediktsson 7.3k Jan 03, 2023
Rendering color and depth images for ShapeNet models.

Color & Depth Renderer for ShapeNet This library includes the tools for rendering multi-view color and depth images of ShapeNet models. Physically bas

Yinyu Nie 41 Dec 19, 2022
Consensus score for tripadvisor

ContripScore ContripScore is essentially a score that combines an Internet platform rating and a consensus rating from sentiment analysis (For instanc

Pepe 1 Jan 13, 2022
SOTR: Segmenting Objects with Transformers [ICCV 2021]

SOTR: Segmenting Objects with Transformers [ICCV 2021] By Ruohao Guo, Dantong Niu, Liao Qu, Zhenbo Li Introduction This is the official implementation

186 Dec 20, 2022
Code for the prototype tool in our paper "CoProtector: Protect Open-Source Code against Unauthorized Training Usage with Data Poisoning".

CoProtector Code for the prototype tool in our paper "CoProtector: Protect Open-Source Code against Unauthorized Training Usage with Data Poisoning".

Zhensu Sun 1 Oct 26, 2021
Multiple custom object count and detection using YOLOv3-Tiny method

Electronic-Component-YOLOv3 Introduce This project created to detect, count, and recognize multiple custom object using YOLOv3-Tiny method. The target

Derwin Mahardika 2 Nov 14, 2022
TeachMyAgent is a testbed platform for Automatic Curriculum Learning methods in Deep RL.

TeachMyAgent: a Benchmark for Automatic Curriculum Learning in Deep RL Paper Website Documentation TeachMyAgent is a testbed platform for Automatic Cu

Flowers Team 51 Dec 25, 2022
Propose a principled and practically effective framework for unsupervised accuracy estimation and error detection tasks with theoretical analysis and state-of-the-art performance.

Detecting Errors and Estimating Accuracy on Unlabeled Data with Self-training Ensembles This project is for the paper: Detecting Errors and Estimating

Jiefeng Chen 13 Nov 21, 2022
Code for the paper "Graph Attention Tracking". (CVPR2021)

SiamGAT 1. Environment setup This code has been tested on Ubuntu 16.04, Python 3.5, Pytorch 1.2.0, CUDA 9.0. Please install related libraries before r

122 Dec 24, 2022
Official Implementation of "Designing an Encoder for StyleGAN Image Manipulation"

Designing an Encoder for StyleGAN Image Manipulation (SIGGRAPH 2021) Recently, there has been a surge of diverse methods for performing image editing

749 Jan 09, 2023
AirLoop: Lifelong Loop Closure Detection

AirLoop This repo contains the source code for paper: Dasong Gao, Chen Wang, Sebastian Scherer. "AirLoop: Lifelong Loop Closure Detection." arXiv prep

Chen Wang 53 Jan 03, 2023
Python implementation of Project Fluent

Project Fluent This is a collection of Python packages to use the Fluent localization system. python-fluent consists of these packages: fluent.syntax

Project Fluent 155 Dec 28, 2022