Pyramid R-CNN: Towards Better Performance and Adaptability for 3D Object Detection

Last update: Jan 07, 2023

Overview

Pyramid R-CNN

This repository is a reproduction of Pyramid R-CNN for 3D object detection.

The code is mainly based on OpenPCDet.

Introduction

We provide code and training configurations for Pyramid-V and Pyramid-PV on the KITTI and Waymo Open datasets. Checkpoints will not be released. The dataset organization is the same as in PCDet.

Requirements

The code has been tested in the following environment:

Ubuntu 18.04
Python 3.6
PyTorch 1.5
CUDA 10.1
OpenPCDet v0.3.0
spconv v1.2.1
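
Before compiling the CUDA operators, it can help to compare the installed packages against the tested versions listed above. The following is a minimal sketch, not part of this repository: it assumes the packages are importable as `torch`, `spconv`, and `pcdet`, and that each exposes a `__version__` attribute (packages that do not are reported as "unknown").

```python
import importlib
import platform

# Versions copied from the Requirements list above.
# The import names (torch, spconv, pcdet) are assumptions of this sketch.
TESTED = {"torch": "1.5", "spconv": "1.2.1", "pcdet": "0.3.0"}


def installed_version(module_name):
    """Return the module's __version__, "unknown" if it has none,
    or None if the module cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except Exception:
        # Compiled extensions can fail with errors other than ImportError
        # (for example, a missing CUDA library), so catch broadly here.
        return None
    return getattr(module, "__version__", "unknown")


def environment_report(tested=TESTED):
    """Map each component to a (found, tested_with) pair."""
    report = {"python": (platform.python_version(), "3.6")}
    for name, wanted in tested.items():
        report[name] = (installed_version(name), wanted)
    return report


if __name__ == "__main__":
    for name, (found, wanted) in environment_report().items():
        print(f"{name}: found {found}, tested with {wanted}")
```

The script only reports what it finds; a mismatch does not necessarily mean the code will fail, only that the combination is untested.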

Installation

a. Clone this repository.

git clone https://github.com/PointsCoder/Pyramid_R-CNN.git

b. Install the dependent libraries as follows:

Install the dependent Python libraries:

pip install -r requirements.txt

Install the SparseConv library. We use the implementation from [spconv].
- If you use PyTorch 1.1, make sure you install spconv v1.0 (commit 8da6f96) instead of the latest version.
- If you use PyTorch 1.3+, you need to install spconv v1.2. As mentioned by the author of spconv, you need to use their Docker image if you use PyTorch 1.4+.

c. Compile CUDA operators by running the following command:

python setup.py develop

Training

We train all the models on 8 Tesla V100 GPUs (32 GB each), and all the configs (epochs, learning rate, batch size) are set for 8-GPU Distributed Data Parallel (DDP) training. Users may need to change these training parameters when running with a different number of GPUs or a different amount of GPU memory.
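
When adapting the 8-GPU configs to a different GPU count, one common heuristic is to scale the learning rate in proportion to the total batch size (the "linear scaling rule"). This README does not prescribe that rule, so the sketch below is only a starting point under that assumption. The function name and the example numbers are hypothetical and are not taken from the config files.

```python
def scale_for_gpus(base_lr, base_batch_per_gpu, base_gpus, new_gpus,
                   new_batch_per_gpu=None):
    """Suggest a learning rate for a new GPU setup using linear scaling.

    Assumption: the learning rate is scaled by the ratio of the new total
    batch size to the original total batch size. This is a heuristic, not
    a rule stated by the authors.
    """
    if new_batch_per_gpu is None:
        new_batch_per_gpu = base_batch_per_gpu
    for value in (base_batch_per_gpu, base_gpus, new_gpus, new_batch_per_gpu):
        if value < 1:
            raise ValueError("GPU counts and batch sizes must be >= 1")

    base_total = base_batch_per_gpu * base_gpus
    new_total = new_batch_per_gpu * new_gpus
    return {
        "total_batch_size": new_total,
        "lr": base_lr * new_total / base_total,
    }


if __name__ == "__main__":
    # Hypothetical baseline: lr 0.01 with 2 samples per GPU on 8 GPUs.
    print(scale_for_gpus(base_lr=0.01, base_batch_per_gpu=2,
                         base_gpus=8, new_gpus=4))
```

Treat the suggested value as a first guess and validate it on a short run, since the best setting may differ per dataset and model.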

models

# pyramid_rcnn_pv.yaml: pyramid roi head on the point-voxel backbone
# pyramid_rcnn_v.yaml: pyramid roi head on the spconv u-net backbone

DDP training

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 sh scripts/dist_train.sh 8 --cfg_file cfgs/waymo_models/pyramid_rcnn_pv.yaml

DDP testing

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 sh scripts/dist_test.sh 8 --cfg_file cfgs/waymo_models/pyramid_rcnn_pv.yaml --eval_all
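
The training and testing commands above follow the same pattern: a device list, the launch script, the GPU count, and the config file. As a rough sketch, a small helper can assemble that command line for another GPU count. The helper `build_dist_command` is hypothetical and not part of the repository, and it assumes the GPUs to use are numbered 0 to N-1.

```python
import shlex


def build_dist_command(script, num_gpus, cfg_file, extra_args=()):
    """Build a launch command string in the style used by this README.

    Assumption: the devices to use are GPU 0 through num_gpus - 1.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be >= 1")
    devices = ",".join(str(i) for i in range(num_gpus))
    args = ["sh", script, str(num_gpus), "--cfg_file", cfg_file, *extra_args]
    # Quote each argument so paths with spaces remain a single shell word.
    quoted = " ".join(shlex.quote(arg) for arg in args)
    return f"CUDA_VISIBLE_DEVICES={devices} {quoted}"


if __name__ == "__main__":
    cfg = "cfgs/waymo_models/pyramid_rcnn_pv.yaml"
    print(build_dist_command("scripts/dist_train.sh", 4, cfg))
    print(build_dist_command("scripts/dist_test.sh", 4, cfg, ["--eval_all"]))
```

With `num_gpus=8`, the helper reproduces the two commands shown above. Changing the GPU count changes only the launch command, so the training parameters in the config may still need adjusting.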

Citation

If you find this project useful in your research, please consider citing:

@article{mao2021pyramid,
  title={Pyramid R-CNN: Towards Better Performance and Adaptability for 3D Object Detection},
  author={Mao, Jiageng and Niu, Minzhe and Bai, Haoyue and Liang, Xiaodan and Xu, Hang and Xu, Chunjing},
  journal={ICCV},
  year={2021}
}
