Sparse R-CNN: End-to-End Object Detection with Learnable Proposals, CVPR2021

Overview

Sparse R-CNN: End-to-End Object Detection with Learnable Proposals

License: MIT

Paper (CVPR 2021)

Sparse R-CNN: End-to-End Object Detection with Learnable Proposals

Updates

  • (02/03/2021) Higher performance is reported by using stronger backbone model PVT.
  • (23/02/2021) Higher performance is reported by using stronger pretrain model DetCo.
  • (02/12/2020) Models and logs(R101_100pro_3x and R101_300pro_3x) are available.
  • (26/11/2020) Models and logs(R50_100pro_3x and R50_300pro_3x) are available.
  • (26/11/2020) Higher performance for Sparse R-CNN is reported by setting the dropout rate as 0.0.

Models

Method inf_time train_time box AP download
R50_100pro_3x 23 FPS 19h 42.8 model | log
R50_300pro_3x 22 FPS 24h 45.0 model | log
R101_100pro_3x 19 FPS 25h 44.1 model | log
R101_300pro_3x 18 FPS 29h 46.4 model | log

Models and logs are available in Baidu Drive by code wt9n.

Notes

  • We observe about 0.3 AP noise.
  • The training time is on 8 GPUs with batchsize 16. The inference time is on single GPU. All GPUs are NVIDIA V100.
  • We use the models pre-trained on imagenet using torchvision. And we provide torchvision's ResNet-101.pkl model. More details can be found in the conversion script.
Method inf_time train_time box AP codebase
R50_300pro_3x 22 FPS 24h 45.0 detectron2
R50_300pro_3x.detco 22 FPS 28h 46.5 detectron2
PVTSmall_300pro_3x 13 FPS 50h 45.7 mmdetection
PVTv2-b2_300pro_3x 11 FPS 76h 50.1 mmdetection

Installation

The codebases are built on top of Detectron2 and DETR.

Requirements

  • Linux or macOS with Python ≥ 3.6
  • PyTorch ≥ 1.5 and torchvision that matches the PyTorch installation. You can install them together at pytorch.org to make sure of this
  • OpenCV is optional and needed by demo and visualization

Steps

  1. Install and build libs
git clone https://github.com/PeizeSun/SparseR-CNN.git
cd SparseR-CNN
python setup.py build develop
  1. Link coco dataset path to SparseR-CNN/datasets/coco
mkdir -p datasets/coco
ln -s /path_to_coco_dataset/annotations datasets/coco/annotations
ln -s /path_to_coco_dataset/train2017 datasets/coco/train2017
ln -s /path_to_coco_dataset/val2017 datasets/coco/val2017
  1. Train SparseR-CNN
python projects/SparseRCNN/train_net.py --num-gpus 8 \
    --config-file projects/SparseRCNN/configs/sparsercnn.res50.100pro.3x.yaml
  1. Evaluate SparseR-CNN
python projects/SparseRCNN/train_net.py --num-gpus 8 \
    --config-file projects/SparseRCNN/configs/sparsercnn.res50.100pro.3x.yaml \
    --eval-only MODEL.WEIGHTS path/to/model.pth
  1. Visualize SparseR-CNN
python demo/demo.py\
    --config-file projects/SparseRCNN/configs/sparsercnn.res50.100pro.3x.yaml \
    --input path/to/images --output path/to/save_images --confidence-threshold 0.4 \
    --opts MODEL.WEIGHTS path/to/model.pth

Third-party resources

License

SparseR-CNN is released under MIT License.

Citing

If you use SparseR-CNN in your research or wish to refer to the baseline results published here, please use the following BibTeX entries:

@article{peize2020sparse,
  title   =  {{SparseR-CNN}: End-to-End Object Detection with Learnable Proposals},
  author  =  {Peize Sun and Rufeng Zhang and Yi Jiang and Tao Kong and Chenfeng Xu and Wei Zhan and Masayoshi Tomizuka and Lei Li and Zehuan Yuan and Changhu Wang and Ping Luo},
  journal =  {arXiv preprint arXiv:2011.12450},
  year    =  {2020}
}
Owner
Peize Sun
PhD student, The University of Hong Kong, Computer Vision
Peize Sun
iris - Open Source Photos Platform Powered by PyTorch

Open Source Photos Platform Powered by PyTorch. Submission for PyTorch Annual Hackathon 2021.

Omkar Prabhu 137 Sep 10, 2022
Python implementation of O-OFDMNet, a deep learning-based optical OFDM system,

O-OFDMNet This includes Python implementation of O-OFDMNet, a deep learning-based optical OFDM system, which uses neural networks for signal processin

Thien Luong 4 Sep 09, 2022
CLIP (Contrastive Language–Image Pre-training) for Italian

Italian CLIP CLIP (Radford et al., 2021) is a multimodal model that can learn to represent images and text jointly in the same space. In this project,

Italian CLIP 114 Dec 29, 2022
Unity Propagation in Bayesian Networks Handling Inconsistency via Unity Smoothing

This repository contains the scripts needed to generate the results from the paper Unity Propagation in Bayesian Networks Handling Inconsistency via U

0 Jan 19, 2022
Binary Stochastic Neurons in PyTorch

Binary Stochastic Neurons in PyTorch http://r2rt.com/binary-stochastic-neurons-in-tensorflow.html https://github.com/pytorch/examples/tree/master/mnis

Onur Kaplan 54 Nov 21, 2022
Official code for article "Expression is enough: Improving traffic signal control with advanced traffic state representation"

1 Introduction Official code for article "Expression is enough: Improving traffic signal control with advanced traffic state representation". The code s

Liang Zhang 10 Dec 10, 2022
Fuzzy Overclustering (FOC)

Fuzzy Overclustering (FOC) In real-world datasets, we need consistent annotations between annotators to give a certain ground-truth label. However, in

2 Nov 08, 2022
Official implementation of DreamerPro: Reconstruction-Free Model-Based Reinforcement Learning with Prototypical Representations in TensorFlow 2

DreamerPro Official implementation of DreamerPro: Reconstruction-Free Model-Based Reinforcement Learning with Prototypical Representations in TensorFl

22 Nov 01, 2022
LUKE -- Language Understanding with Knowledge-based Embeddings

LUKE (Language Understanding with Knowledge-based Embeddings) is a new pre-trained contextualized representation of words and entities based on transf

Studio Ousia 587 Dec 30, 2022
This YoloV5 based model is fit to detect people and different types of land vehicles, and displaying their density on a fitted map, according to their coordinates and detected labels.

This YoloV5 based model is fit to detect people and different types of land vehicles, and displaying their density on a fitted map, according to their

Liron Bdolah 8 May 22, 2022
Unofficial pytorch implementation of 'Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization'

pytorch-AdaIN This is an unofficial pytorch implementation of a paper, Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization [Hua

Naoto Inoue 873 Jan 06, 2023
[CVPR 2021] Unsupervised Degradation Representation Learning for Blind Super-Resolution

DASR Pytorch implementation of "Unsupervised Degradation Representation Learning for Blind Super-Resolution", CVPR 2021 [arXiv] Overview Requirements

Longguang Wang 318 Dec 24, 2022
Semi-supevised Semantic Segmentation with High- and Low-level Consistency

Semi-supevised Semantic Segmentation with High- and Low-level Consistency This Pytorch repository contains the code for our work Semi-supervised Seman

123 Dec 30, 2022
Official PyTorch implementation of "IntegralAction: Pose-driven Feature Integration for Robust Human Action Recognition in Videos", CVPRW 2021

IntegralAction: Pose-driven Feature Integration for Robust Human Action Recognition in Videos Introduction This repo is official PyTorch implementatio

Gyeongsik Moon 29 Sep 24, 2022
Experimental code for paper: Generative Adversarial Networks as Variational Training of Energy Based Models

Experimental code for paper: Generative Adversarial Networks as Variational Training of Energy Based Models, under review at ICLR 2017 requirements: T

Shuangfei Zhai 18 Mar 05, 2022
Transfer SemanticKITTI labeles into other dataset/sensor formats.

LiDAR-Transfer Transfer SemanticKITTI labeles into other dataset/sensor formats. Content Convert datasets (NUSCENES, FORD, NCLT) to KITTI format Minim

Photogrammetry & Robotics Bonn 64 Nov 21, 2022
A playable implementation of Fully Convolutional Networks with Keras.

keras-fcn A re-implementation of Fully Convolutional Networks with Keras Installation Dependencies keras tensorflow Install with pip $ pip install git

JihongJu 202 Sep 07, 2022
PyTorch implementation of our ICCV 2019 paper: Liquid Warping GAN: A Unified Framework for Human Motion Imitation, Appearance Transfer and Novel View Synthesis

Impersonator PyTorch implementation of our ICCV 2019 paper: Liquid Warping GAN: A Unified Framework for Human Motion Imitation, Appearance Transfer an

SVIP Lab 1.7k Jan 06, 2023
EM-POSE 3D Human Pose Estimation from Sparse Electromagnetic Trackers.

EM-POSE: 3D Human Pose Estimation from Sparse Electromagnetic Trackers This repository contains the code to our paper published at ICCV 2021. For ques

Facebook Research 62 Dec 14, 2022
:hot_pepper: R²SQL: "Dynamic Hybrid Relation Network for Cross-Domain Context-Dependent Semantic Parsing." (AAAI 2021)

R²SQL The PyTorch implementation of paper Dynamic Hybrid Relation Network for Cross-Domain Context-Dependent Semantic Parsing. (AAAI 2021) Requirement

huybery 60 Dec 31, 2022