Official PyTorch implementation of RIO

Overview

NVIDIA Source Code License Python 3.6

Image-Level or Object-Level? A Tale of Two Resampling Strategies for Long-Tailed Detection

Figure 1: Our proposed Resampling at image-level and obect-level (RIO).

Project page | Paper

Image-Level or Object-Level? A Tale of Two Resampling Strategies for Long-Tailed Detection.
Nadine Chang, Zhiding Yu, Yu-Xiong Wang, Anima Anandkumar, Sanja Fidler, Jose M. Alvarez.
ICML 2021.

This repository contains the official Pytorch implementation of training & evaluation code and the pretrained models for RIO.

Abstract

Training on datasets with long-tailed distributions has been challenging for major recognition tasks such as classification and detection. To deal with this challenge, image resampling is typically introduced as a simple but effective approach. However, we observe that long-tailed detection differs from classification since multiple classes may be present in one image. As a result, image resampling alone is not enough to yield a sufficiently balanced distribution at the object level. We address object-level resampling by introducing an object-centric memory replay strategy based on dynamic, episodic memory banks. Our proposed strategy has two benefits: 1) convenient object-level resampling without significant extra computation, and 2) implicit feature-level augmentation from model updates. We show that image-level and object-level resamplings are both important, and thus unify them with a joint resampling strategy (RIO). Our method outperforms state-of-the-art long-tailed detection and segmentation methods on LVIS v0.5 across various backbones.

Requirements

  • Linux or maxOS with Python >= 3.6
  • PyTorch >= 1.5 and torchvision corresponding to PyTorch installation. Please refer to download guildlines at the PyTorch website
  • Detectron2
  • OpenCV is optional but required for visualizations

Installation

Detectron2

Please refer to the installation instructions in Detectron2.

We use Detectron2 v0.3 as the codebase. Thus, we advise installing Detectron2 from a clone of this repository.

LVIS Dataset

Dataset download is available at the official LVIS website. Please follow Detectron's guildlines on expected LVIS dataset structure.

Our Setup

  • Python 3.6.9
  • PyTorch 1.5.0 with CUDA 10.2
  • Detectron2 built from this repository.

Pretrained Models

Detection and Instance Segmentation on LVIS v0.5

Backbone Method AP.b AP.b.r AP.b.c AP.b.f AP.m AP.m.r AP.m.c AP.m.f download
R50-FPN MaskRCNN-RIO 25.7 17.2 25.1 29.8 26.0 18.9 26.2 28.5 model
R101-FPN MaskRCNN-RIO 27.3 19.1 26.8 31.2 27.7 20.1 28.3 30.0 model
X101-FPN MaskRCNN-RIO 28.6 19.0 28.0 33.0 28.9 19.5 29.7 31.6 model

Training & Evaluation

Our code is located under projects/RIO.

Our training and evaluation follows those of Detectron2's. We've provided config files for both LVISv0.5 and LVISv1.0.

Example: Training LVISv0.5 on Mask-RCNN ResNet-50

# We advise multi-gpu training
cd projects/RIO
python memory_train_net.py \
--num-gpus 4 \
--config-file=configs/LVISv0.5-InstanceSegmentation/memory_mask_rcnn_R_50_FPN_1x.yaml 

Example: Evaluating LVISv0.5 on Mask-RCNN ResNet-50

cd projects/RIO
python memory_train_net.py \
--eval-only MODEL.WEIGHTS /path/to/model_checkpoint \
--config-file configs/LVISv0.5-InstanceSegmentation/memory_mask_rcnn_R_50_FPN_1x.yaml  

By default, LVIS evaluation follows immediately after training.

Visualization

Detectron2 has built-in visualization tools. Under tools folder, visualize_json_results.py can be used to visualize the json instance detection/segmentation results given by LVISEvaluator.

python visualize_json_results.py --input x.json --output dir/ --dataset lvis

Further information can be found on Detectron2 tools' README.

License

Please check the LICENSE file. RIO may be used non-commercially, meaning for research or evaluation purposes only. For business inquiries, please contact [email protected].

Citation

@article{chang2021image,
  title={Image-Level or Object-Level? A Tale of Two Resampling Strategies for Long-Tailed Detection},
  author={Chang, Nadine and Yu, Zhiding and Wang, Yu-Xiong and Anandkumar, Anima and Fidler, Sanja and Alvarez, Jose M},
  journal={arXiv preprint arXiv:2104.05702},
  year={2021}
}
Random Forests for Regression with Missing Entries

Random Forests for Regression with Missing Entries These are specific codes used in the article: On the Consistency of a Random Forest Algorithm in th

Irving Gómez-Méndez 1 Nov 15, 2021
Simple Tensorflow implementation of "Adaptive Convolutions for Structure-Aware Style Transfer" (CVPR 2021)

AdaConv — Simple TensorFlow Implementation [Paper] : Adaptive Convolutions for Structure-Aware Style Transfer (CVPR 2021) Note This repository does no

Junho Kim 26 Nov 18, 2022
A tool for calculating distortion parameters in coordination complexes.

OctaDist Octahedral distortion calculator: A tool for calculating distortion parameters in coordination complexes. https://octadist.github.io/ Registe

OctaDist 12 Oct 04, 2022
An NVDA add-on to split screen reader and audio from other programs to different sound channels

An NVDA add-on to split screen reader and audio from other programs to different sound channels (add-on idea credit: Tony Malykh)

Joseph Lee 7 Dec 25, 2022
Implementation for Stankevičiūtė et al. "Conformal time-series forecasting", NeurIPS 2021.

Conformal time-series forecasting Implementation for Stankevičiūtė et al. "Conformal time-series forecasting", NeurIPS 2021. If you use our code in yo

Kamilė Stankevičiūtė 36 Nov 21, 2022
Cross Quality LFW: A database for Analyzing Cross-Resolution Image Face Recognition in Unconstrained Environments

Cross-Quality Labeled Faces in the Wild (XQLFW) Here, we release the database, evaluation protocol and code for the following paper: Cross Quality LFW

Martin Knoche 10 Dec 12, 2022
[SIGGRAPH 2020] Attribute2Font: Creating Fonts You Want From Attributes

Attr2Font Introduction This is the official PyTorch implementation of the Attribute2Font: Creating Fonts You Want From Attributes. Paper: arXiv | Rese

Yue Gao 200 Dec 15, 2022
v objective diffusion inference code for JAX.

v-diffusion-jax v objective diffusion inference code for JAX, by Katherine Crowson (@RiversHaveWings) and Chainbreakers AI (@jd_pressman). The models

Katherine Crowson 186 Dec 21, 2022
Official Implementation of DDOD (Disentangle your Dense Object Detector), ACM MM2021

Disentangle Your Dense Object Detector This repo contains the supported code and configuration files to reproduce object detection results of Disentan

loveSnowBest 51 Jan 07, 2023
CARMS: Categorical-Antithetic-REINFORCE Multi-Sample Gradient Estimator

CARMS: Categorical-Antithetic-REINFORCE Multi-Sample Gradient Estimator This is the official code repository for NeurIPS 2021 paper: CARMS: Categorica

Alek Dimitriev 1 Jul 09, 2022
[peer review] An Arbitrary Scale Super-Resolution Approach for 3D MR Images using Implicit Neural Representation

ArSSR This repository is the pytorch implementation of our manuscript "An Arbitrary Scale Super-Resolution Approach for 3-Dimensional Magnetic Resonan

Qing Wu 19 Dec 12, 2022
Face Mesh is a face geometry solution that estimates 468 3D face landmarks in real-time even on mobile devices

Face-Mesh Face Mesh is a face geometry solution that estimates 468 3D face landmarks in real-time even on mobile devices. It employs machine learning

Farnam Javadi 9 Dec 21, 2022
Repositório da disciplina de APC, no segundo semestre de 2021

NOTAS FINAIS: https://github.com/fabiommendes/apc2018/blob/master/nota-final.pdf Algoritmos e Programação de Computadores Este é o Git da disciplina A

16 Dec 16, 2022
CDTrans: Cross-domain Transformer for Unsupervised Domain Adaptation

[ICCV2021] TransReID: Transformer-based Object Re-Identification [pdf] The official repository for TransReID: Transformer-based Object Re-Identificati

DamoCV 569 Dec 30, 2022
Activity image-based video retrieval

Cross-modal-retrieval Our approach is focus on Activity Image-to-Video Retrieval (AIVR) task. The compared methods are state-of-the-art single modalit

BCMI 75 Oct 21, 2021
Code for DeepXML: A Deep Extreme Multi-Label Learning Framework Applied to Short Text Documents

DeepXML Code for DeepXML: A Deep Extreme Multi-Label Learning Framework Applied to Short Text Documents Architectures and algorithms DeepXML supports

Extreme Classification 49 Nov 06, 2022
Code for 2021 NeurIPS --- Towards Multi-Grained Explainability for Graph Neural Networks

ReFine: Multi-Grained Explainability for GNNs This is the official code for Towards Multi-Grained Explainability for Graph Neural Networks (NeurIPS 20

Shirley (Ying-Xin) Wu 47 Dec 16, 2022
Open Source Differentiable Computer Vision Library for PyTorch

Kornia is a differentiable computer vision library for PyTorch. It consists of a set of routines and differentiable modules to solve generic computer

kornia 7.6k Jan 04, 2023
Single-Stage 6D Object Pose Estimation, CVPR 2020

Overview This repository contains the code for the paper Single-Stage 6D Object Pose Estimation. Yinlin Hu, Pascal Fua, Wei Wang and Mathieu Salzmann.

CVLAB @ EPFL 89 Dec 26, 2022
Repository for the paper : Meta-FDMixup: Cross-Domain Few-Shot Learning Guided byLabeled Target Data

1 Meta-FDMIxup Repository for the paper : Meta-FDMixup: Cross-Domain Few-Shot Learning Guided byLabeled Target Data. (ACM MM 2021) paper News! the rep

Fu Yuqian 44 Nov 18, 2022