LiDAR R-CNN: An Efficient and Universal 3D Object Detector

Last update: Jan 05, 2023

Overview

LiDAR R-CNN: An Efficient and Universal 3D Object Detector

Introduction

This is the official code of LiDAR R-CNN: An Efficient and Universal 3D Object Detector. In this work, we present LiDAR R-CNN, a second stage detector that can generally improve any existing 3D detector. We find a common problem in Point-based RCNN, which is the learned features ignore the size of proposals, and propose several methods to remedy it. Evaluated on WOD benchmarks, our method significantly outperforms previous state-of-the-art.

中文介绍：https://zhuanlan.zhihu.com/p/359800738

Requirements

All the codes are tested in the following environment:

Linux (tested on Ubuntu 16.04)
Python 3.6+
PyTorch 1.5 or higher (tested on PyTorch 1.5, 6, 7)
CUDA 10.1

To install pybind11:

git clone [email protected]:pybind/pybind11.git
cd pybind11
mkdir build && cd build
cmake .. && make -j 
sudo make install

To install requirements:

pip install -r requirements.txt
apt-get install ninja-build libeigen3-dev

Install LiDAR_RCNN library:

python setup.py develop --user

Preparing Data

Please refer to data processer to generate the proposal data.

Training

After preparing WOD data, we can train the vehicle only model in the paper, run this command:

python -m torch.distributed.launch --nproc_per_node=4 tools/train.py --cfg config/lidar_rcnn.yaml --name lidar_rcnn

For 3 class in WOD:

python -m torch.distributed.launch --nproc_per_node=8 tools/train.py --cfg config/lidar_rcnn_all_cls.yaml --name lidar_rcnn_all

The models and logs will be saved to work_dirs/outputs.

Evaluation

To evaluate, run distributed testing with 4 gpus:

python -m torch.distributed.launch --nproc_per_node=4 tools/test.py --cfg config/lidar_rcnn.yaml --checkpoint outputs/lidar_rcnn/checkpoint_lidar_rcnn_59.pth.tar
python tools/create_results.py --cfg config/lidar_rcnn.yaml

Note that, you should keep the nGPUS in config equal to nproc_per_node .This will generate a val.bin file in the work_dir/results. You can create submission to Waymo server using waymo-open-dataset code by following the instructions here.

Results

Our model achieves the following performance on:

Waymo Open Dataset Challenges (3D Detection)

Proposals from	Class	Channel	3D AP L1 Vehicle	3D AP L1 Pedestrian	3D AP L1 Cyclist
PointPillars	Vehicle	1x	75.6	-	-
PointPillars	Vehicle	2x	75.6	-	-
PointPillars	3 Class	1x	73.4	70.7	67.4
PointPillars	3 Class	2x	73.8	71.9	69.4

Proposals from	Class	Channel	3D AP L2 Vehicle	3D AP L2 Pedestrian	3D AP L2 Cyclist
PointPillars	Vehicle	1x	66.8	-	-
PointPillars	Vehicle	2x	67.9	-	-
PointPillars	3 Class	1x	64.8	62.4	64.8
PointPillars	3 Class	2x	65.1	63.5	66.8

Citation

If you find our paper or repository useful, please consider citing

@article{li2021lidar,
  title={LiDAR R-CNN: An Efficient and Universal 3D Object Detector},
  author={Li, Zhichao and Wang, Feng and Wang, Naiyan},
  journal={CVPR},
  year={2021},
}

Acknowledgement

Comments

How is the PP model trained

This model file checkpoints/hv_pointpillars_secfpn_sbn_2x16_2x_waymo-3d-car-9fa20624.pth in the docs cannot be found in mmdet3d official repo (they only have the interval-5 pretrained models). Are the proposals extracted with interval-1 models: 3d-car and 3d-3class? If I want to reproduce your results, do I need to first train with these two configs? Thanks.

opened by haotian-liu 21
checkpoint shape error

hi~ Zhichao Li /Feng Wang/ Naiyan Wang~

I am very interested in your work LIDAR RCNN, but when I use the LIDAR RCNN pretrained model you gave me checkpoint_lidar_rcnn_59.pth.tar(MD5:6416c502af3cb73f0c39dd0cabdee2cb, I found that the weights of the pretrained model are 9 dimensions, but your input data is 12 dimensions.

Can you provide me a pretrained model whose dimensions are correctly matched.

I found that in one of your commits, the dimension was increased from 9 to 12 dimensions, but the latest pre-trained model is still 9 dimensions

opened by hutao568 11
Transfered To Nuscenes Dataset，Performance decline

When I transfered it to the CenterpointNet and nuscenes datasets, Then evaluated on nuscense, it didn’t seem to work. I don’t know what went wrong, Looking forward to your suggestions and comments.

opened by Suodislie 9
Run inference on single GPU
Hi, I am able to do all setup as per instructions given in README In the evaluation step,

python -m torch.distributed.launch --nproc_per_node=4 tools/test.py --cfg config/lidar_rcnn.yaml --checkpoint outputs/lidar_rcnn/checkpoint_lidar_rcnn_59.pth.tar python tools/create_results.py --cfg config/lidar_rcnn.yaml

I am facing the following questions while running the evaluation.

How to change the command to run a single GPU, nproc_per_node needs to be 1.

What should be MODEL.Frame number for checkpoint_lidar_rcnn_59.pth.tar? Since I am trying to understand the evaluation, kindly help me on this to fix.
opened by kamalasubha 7
The cls scores are useless on my own dataset

Thanks for your awesome works. When I use Lidar-RCNN on my own dataset, the refine score is useless, Most objects are classified as backgrounds. In addition, the average refined center error is only reduced by 1 cm. I don't know Is this normal?

opened by xiuzhizheng 6
What processes in LIDAR-RCNN are specific for waymo dataset?

Hello, just like the title saying, I wonder what are the specific processes for WOD, which means if I want to use LIDAR R-CNN on my own dataset, I have to do it differently. I already change the data_processor and everything I can think of in the loader and creat_results that are respect to waymo dataset, then I use the refined results to perform evaluation on my own dataset. However, I got NAN on rotation error, and the MAP is pretty low.

Therefore, I'm confused about some subtle processes that are performed just for waymo not for other datasets. For example, compute heading residual is necessary for using LiDAR R-CNN? Did you guys use rotation in some sublte ways? （In my dataset, the rotation is according to y axis, while in your code, it's x axis, but the way of computing rotZ is the same, I already changed it.）

This bug has been driving me crazy, that's why my issue description above is a bit messy, forgive me please. I would be grateful if you could provide me some hints. Thank you a lot. Save this almost desperate kid, please.🥺

opened by QingXIA233 6
The num of boxes of matching_gt_bbox is more than that of valid_gt?

Hello, sorry I come back with another question...... Recently, I've been working on using LiDAR R-CNN to refine the results of the CenterPoint-PP model with my own dataset. During data processing for my own dataset, I notice that the results of my CenterPoint-PP model has more bboxes detected than the ground truth ones (false detection case). When performing get_matching_by_iou function in LiDAR R-CNN, the obtained matching_gt_bbox has the same number of bboxes as the model predictions instead of the groundtruth data. I'm a bit confused about this process. Now that we are trying to do refinement, shouldn't we remove the falsely detected bboxes in the results and keep to the groundtruth? If so, why the matching bboxes is according to the predictions instead of groundtruth?

Maybe I have some misunderstandings here, it would be a great helper if you could give me some hints. Thanks in advance.

opened by QingXIA233 6
The pretrained model

Hi, I am very interested in your paper, and I am reproducing it. The pretrained model of pointpillar provided in mmdetection3d does not reach the performance shown in the Table 2 below, so could you please provide the pretrained model of pointpillar in Table 2? Thank you very much!

opened by SSY-1276 6
About train one iter data

Hi~Sorry to bother you again！ Is that right? The prediction frames of all frames are extracted at one time and then disrupted globally, which means that when lidar RCNN trains a batch, it contains different boxes of different frames. When the batchsize is 256, the extreme case may contain up to 256 frames, and each frame takes a box. Below is my idea！ If I train two frames at a time, extract proposals through the frozen one-stage network, and then use lidarcnn for end-to-end training, is it ok？Do u have an idea about how to design the ROI sampler ratio?

opened by DongfeiJi 6
Collaboration with MMDetection3D

Hi developers of LiDAR R-CNN,

Congrats on the acceptance of the paper!

LiDAR R-CNN achieves new state-of-the-art results through simple yet effective improvement, which is very insightful to the community. We also found that the baseline is based on the implementations in MMDetection3D.

Therefore, I am coming to ask, as we believe LiDAR R-CNN might have a great impact on the community, would you like to also contribute an implementation of LiDAR R-CNN to MMDetection3D? If so, maybe we could have a more detailed discussion about that? MMDetection3D welcomes any kind of contribution. Please feel free to ask if there is anything from the MMDet3D team that could help.

On behalf of the MMDet3D Development Team

BR,

Wenwei

opened by ZwwWayne 6
checkpoint shape error

hi~ Zhichao Li /Feng Wang/ Naiyan Wang~ 我对你们的工作LIDAR RCNN非常感兴趣，但是我在使用您给我的LIDAR RCNN预训练模型checkpoint_lidar_rcnn_59.pth.tar(MD5:6416c502af3cb73f0c39dd0cabdee2cb 时，发现预训练模型的权重是9维，但是你们的输入数据是12维12维您可以提供给我维度可以正确匹配的预训练模型吗

opened by hutao568 4

Releases(v0.1.1)

v0.1.1(Jun 11, 2021)
Add version for mmdetection3d.

Add eigen in dependency.

Source code(tar.gz)
Source code(zip)
v0.1(Jun 1, 2021)

Source code(tar.gz)
Source code(zip)

Owner

TuSimple

The Future of Trucking

GitHub Repository

GenGNN: A Generic FPGA Framework for Graph Neural Network Acceleration

GenGNN: A Generic FPGA Framework for Graph Neural Network Acceleration Stefan Abi-Karam*, Yuqi He*, Rishov Sarkar*, Lakshmi Sathidevi, Zihang Qiao, Co

19 Dec 15, 2022

Code + pre-trained models for the paper Keeping Your Eye on the Ball Trajectory Attention in Video Transformers

Motionformer This is an official pytorch implementation of paper Keeping Your Eye on the Ball: Trajectory Attention in Video Transformers. In this rep

192 Dec 23, 2022

Dialect classification

Dialect-Classification This repository presents the data that was used in a talk at ICKL-5 (5th International Conference on Kurdish Linguistics) at th

0 Nov 12, 2021

Jittor implementation of Recursive-NeRF: An Efficient and Dynamically Growing NeRF

Recursive-NeRF: An Efficient and Dynamically Growing NeRF This is a Jittor implementation of Recursive-NeRF: An Efficient and Dynamically Growing NeRF

33 Nov 30, 2022

Official implementation of the ICLR 2021 paper

You Only Need Adversarial Supervision for Semantic Image Synthesis Official PyTorch implementation of the ICLR 2021 paper "You Only Need Adversarial S

272 Dec 28, 2022

Pytorch Performace Tuning, WandB, AMP, Multi-GPU, TensorRT, Triton

Plant Pathology 2020 FGVC7 Introduction A deep learning model pipeline for training, experimentaiton and deployment for the Kaggle Competition, Plant

0 Feb 25, 2022

On the model-based stochastic value gradient for continuous reinforcement learning

On the model-based stochastic value gradient for continuous reinforcement learning This repository is by Brandon Amos, Samuel Stanton, Denis Yarats, a

46 Dec 15, 2022

Human head pose estimation using Keras over TensorFlow.

RealHePoNet: a robust single-stage ConvNet for head pose estimation in the wild.

71 Jan 05, 2023

Captcha-tensorflow - Image Captcha Solving Using TensorFlow and CNN Model. Accuracy 90%+

Captcha Solving Using TensorFlow Introduction Solve captcha using TensorFlow. Learn CNN and TensorFlow by a practical project. Follow the steps, run t

869 Jan 06, 2023

simple artificial intelligence utilities

Simple AI Project home: http://github.com/simpleai-team/simpleai This lib implements many of the artificial intelligence algorithms described on the b

921 Dec 08, 2022

Banglore House Prediction Using Flask Server (Python)

Banglore House Prediction Using Flask Server (Python) 🌐 Links 🌐 📂 Repo In this repository, I've implemented a Machine Learning-based Bangalore Hous

1 Jan 24, 2022

This repository is the official implementation of Using Time-Series Privileged Information for Provably Efficient Learning of Prediction Models

Using Time-Series Privileged Information for Provably Efficient Learning of Prediction Models Link to paper Abstract We study prediction of future out

2 Aug 19, 2022

Source code for EquiDock: Independent SE(3)-Equivariant Models for End-to-End Rigid Protein Docking (ICLR 2022)

Source code for EquiDock: Independent SE(3)-Equivariant Models for End-to-End Rigid Protein Docking (ICLR 2022) Please cite "Independent SE(3)-Equivar

154 Jan 02, 2023

Structure Information is the Key: Self-Attention RoI Feature Extractor in 3D Object Detection

Structure Information is the Key: Self-Attention RoI Feature Extractor in 3D Object Detection abstract:Unlike 2D object detection where all RoI featur

2 Oct 07, 2022

TorchMetrics is a collection of 25+ PyTorch metrics implementations and an easy-to-use API to create custom metrics.

Machine learning metrics for distributed, scalable PyTorch applications.

1.2k Jan 06, 2023

Supervised & unsupervised machine-learning techniques are applied to the database of weighted P4s which admit Calabi-Yau hypersurfaces.

Weighted Projective Spaces ML Description: The database of 5-vectors describing 4d weighted projective spaces which admit Calabi-Yau hypersurfaces are

3 Sep 08, 2022

Official implementation of Pixel-Level Bijective Matching for Video Object Segmentation

BMVOS This is the official implementation of Pixel-Level Bijective Matching for Video Object Segmentation, to appear in WACV 2022. @article{cho2021pix

13 Dec 14, 2022

Fully Adaptive Bayesian Algorithm for Data Analysis (FABADA) is a new approach of noise reduction methods. In this repository is shown the package developed for this new method based on \citepaper.

Fully Adaptive Bayesian Algorithm for Data Analysis FABADA FABADA is a novel non-parametric noise reduction technique which arise from the point of vi

18 Oct 20, 2022

Implementation of Deformable Attention in Pytorch from the paper "Vision Transformer with Deformable Attention"

Deformable Attention Implementation of Deformable Attention from this paper in Pytorch, which appears to be an improvement to what was proposed in DET

128 Dec 24, 2022

A curated list of Generative Deep Art projects, tools, artworks, and models

Generative Deep Art A curated list of Generative Deep Art projects, tools, artworks, and models Inbox Get started with making AI art in 2022 – deeplea

251 Jan 03, 2023

LiDAR R-CNN: An Efficient and Universal 3D Object Detector

Related tags

Overview

LiDAR R-CNN: An Efficient and Universal 3D Object Detector

Introduction

Requirements

Preparing Data

Training

Evaluation

Results

Citation

Acknowledgement

Comments

Releases(v0.1.1)

v0.1.1(Jun 11, 2021)

v0.1(Jun 1, 2021)

Owner

TuSimple

GenGNN: A Generic FPGA Framework for Graph Neural Network Acceleration

Code + pre-trained models for the paper Keeping Your Eye on the Ball Trajectory Attention in Video Transformers

Dialect classification

Jittor implementation of Recursive-NeRF: An Efficient and Dynamically Growing NeRF

Official implementation of the ICLR 2021 paper

Pytorch Performace Tuning, WandB, AMP, Multi-GPU, TensorRT, Triton

On the model-based stochastic value gradient for continuous reinforcement learning

Human head pose estimation using Keras over TensorFlow.

Captcha-tensorflow - Image Captcha Solving Using TensorFlow and CNN Model. Accuracy 90%+

simple artificial intelligence utilities

Banglore House Prediction Using Flask Server (Python)

This repository is the official implementation of Using Time-Series Privileged Information for Provably Efficient Learning of Prediction Models

Source code for EquiDock: Independent SE(3)-Equivariant Models for End-to-End Rigid Protein Docking (ICLR 2022)

Structure Information is the Key: Self-Attention RoI Feature Extractor in 3D Object Detection

TorchMetrics is a collection of 25+ PyTorch metrics implementations and an easy-to-use API to create custom metrics.

Supervised & unsupervised machine-learning techniques are applied to the database of weighted P4s which admit Calabi-Yau hypersurfaces.

Official implementation of Pixel-Level Bijective Matching for Video Object Segmentation

Fully Adaptive Bayesian Algorithm for Data Analysis (FABADA) is a new approach of noise reduction methods. In this repository is shown the package developed for this new method based on \citepaper.

Implementation of Deformable Attention in Pytorch from the paper "Vision Transformer with Deformable Attention"

A curated list of Generative Deep Art projects, tools, artworks, and models