[CVPR'21] MonoRUn: Monocular 3D Object Detection by Reconstruction and Uncertainty Propagation

Overview

MonoRUn

MonoRUn: Monocular 3D Object Detection by Reconstruction and Uncertainty Propagation. CVPR 2021. [paper] Hansheng Chen, Yuyao Huang, Wei Tian*, Zhong Gao, Lu Xiong. (*Corresponding author: Wei Tian.)

This repository is the PyTorch implementation for MonoRUn. The codes are based on MMDetection and MMDetection3D, although we use our own data formats. The PnP C++ codes are modified from PVNet.

demo

Installation

Please refer to INSTALL.md.

Data preparation

Download the official KITTI 3D object dataset, including left color images, calibration files and training labels.

Download the train/val/test image lists [Google Drive | Baidu Pan, password: cj4u]. For training with LiDAR supervision, download the preprocessed object coordinate maps [Google Drive | Baidu Pan, password: fp3h].

Extract the downloaded archives according to the following folder structure. It is recommended to symlink the dataset root to $MonoRUn_ROOT/data. If your folder structure is different, you may need to change the corresponding paths in config files.

$MonoRUn_ROOT
├── configs
├── monorun
├── tools
├── data
│   ├── kitti
│   │   ├── testing
│   │   │   ├── calib
│   │   │   ├── image_2
│   │   │   └── test_list.txt
│   │   └── training
│   │       ├── calib
│   │       ├── image_2
│   │       ├── label_2
│   │       ├── obj_crd
│   │       ├── mono3dsplit_train_list.txt
│   │       ├── mono3dsplit_val_list.txt
│   │       └── trainval_list.txt

Run the preparation script to generate image metas:

cd $MonoRUn_ROOT
python tools/prepare_kitti.py

Train

cd $MonoRUn_ROOT

To train without LiDAR supervision:

python train.py configs/kitti_multiclass.py --gpu-ids 0 1

where --gpu-ids 0 1 specifies the GPU IDs. In the paper we use two GPUs for distributed training. The number of GPUs affects the mini-batch size. You may change the samples_per_gpu option in the config file to vary the number of images per GPU. If you encounter out of memory issue, add the argument --seed 0 --deterministic to save GPU memory.

To train with LiDAR supervision:

python train.py configs/kitti_multiclass_lidar_supv.py --gpu-ids 0 1

To view other training options:

python train.py -h

By default, logs and checkpoints will be saved to $MonoRUn_ROOT/work_dirs. You can run TensorBoard to plot the logs:

tensorboard --logdir $MonoRUn_ROOT/work_dirs

The above configs use the 3712-image split for training and the other split for validating. If you want to train on the full training set (train-val), use the config files with _trainval postfix.

Test

You can download the pretrained models:

  • kitti_multiclass.pth [Google Drive | Baidu Pan, password: 6bih] trained on KITTI training split
  • kitti_multiclass_lidar_supv.pth [Google Drive | Baidu Pan, password: nmdb] trained on KITTI training split
  • kitti_multiclass_lidar_supv_trainval.pth [Google Drive | Baidu Pan, password: hg2r] trained on KITTI train-val

To test and evaluate on the validation set using config at $CONFIG_PATH and checkpoint at $CPT_PATH:

python test.py $CONFIG_PATH $CPT_PATH --val-set --gpu-ids 0

To test on the test set and save detection results to $RESULT_DIR:

python test.py $CONFIG_PATH $CPT_PATH --result-dir $RESULT_DIR --gpu-ids 0

You can append the argument --show-dir $SHOW_DIR to save visualized results.

To view other testing options:

python test.py -h

Note: the training and testing scripts in the root directory are wrappers for the original scripts taken from MMDetection, which can be found in $MonoRUn_ROOT/tools. For advanced usage, please refer to the official MMDetection docs.

Demo

We provide a demo script to perform inference on images in a directory and save the visualized results. Example:

python demo/infer_imgs.py $KITTI_RAW_DIR/2011_09_30/2011_09_30_drive_0027_sync/image_02/data configs/kitti_multiclass_lidar_supv_trainval.py checkpoints/kitti_multiclass_lidar_supv_trainval.pth --calib demo/calib.csv --show-dir show/2011_09_30_drive_0027

Citation

If you find this project useful in your research, please consider citing:

@inproceedings{monorun2021, 
  author = {Hansheng Chen and Yuyao Huang and Wei Tian and Zhong Gao and Lu Xiong}, 
  title = {MonoRUn: Monocular 3D Object Detection by Reconstruction and Uncertainty Propagation}, 
  booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)}, 
  year = {2021}
}
Owner
同济大学智能汽车研究所综合感知研究组 ( Comprehensive Perception Research Group under Institute of Intelligent Vehicles, School of Automotive Studies, Tongji University)
同济大学智能汽车研究所综合感知研究组 ( Comprehensive Perception Research Group under Institute of Intelligent Vehicles, School of Automotive Studies, Tongji University)
prior-based-losses-for-medical-image-segmentation

Repository for papers: Benchmark: Effect of Prior-based Losses on Segmentation Performance: A Benchmark Midl: A Surprisingly Effective Perimeter-based

Rosana EL JURDI 9 Sep 07, 2022
Miscellaneous and lightweight network tools

Network Tools Collection of miscellaneous and lightweight network tools to simplify daily operations, administration, and troubleshooting of networks.

Nicholas Russo 22 Mar 22, 2022
Transformer based SAR image despeckling

Transformer based SAR image despeckling Using the code: The code is stable while using Python 3.6.13, CUDA =10.1 Clone this repository: git clone htt

27 Nov 13, 2022
Generate high quality pictures. GAN. Generative Adversarial Networks

ESRGAN generate high quality pictures. GAN. Generative Adversarial Networks """ Super-resolution of CelebA using Generative Adversarial Networks. The

Lieon 1 Dec 14, 2021
Used to record WKU's utility bills on a regular basis.

WKU水电费小助手 一个用于定期记录WKU水电费的脚本 Looking for English Readme? 背景 由于WKU校园内的水电账单系统时常存在扣费延迟的现象,而补扣的费用缺乏令人信服的证明。不少学生为费用摸不着头脑,但也没有申诉的依据。为了更好地掌握水电费使用情况,留下一手证据,我开源

2 Jul 21, 2022
A large-scale video dataset for the training and evaluation of 3D human pose estimation models

ASPset-510 (Australian Sports Pose Dataset) is a large-scale video dataset for the training and evaluation of 3D human pose estimation models. It contains 17 different amateur subjects performing 30

Aiden Nibali 25 Jun 20, 2021
MRQy is a quality assurance and checking tool for quantitative assessment of magnetic resonance imaging (MRI) data.

Front-end View Backend View Table of Contents Description Prerequisites Running Basic Information Measurements User Interface Feedback and usage Descr

Center for Computational Imaging and Personalized Diagnostics 58 Dec 02, 2022
Official implementation of VQ-Diffusion

Vector Quantized Diffusion Model for Text-to-Image Synthesis Overview This is the official repo for the paper: [Vector Quantized Diffusion Model for T

Microsoft 592 Jan 03, 2023
The Environment I built to study Reinforcement Learning + Pokemon Showdown

pokemon-showdown-rl-environment The Environment I built to study Reinforcement Learning + Pokemon Showdown Been a while since I ran this. Think it is

3 Jan 16, 2022
Neural machine translation between the writings of Shakespeare and modern English using TensorFlow

Shakespeare translations using TensorFlow This is an example of using the new Google's TensorFlow library on monolingual translation going from modern

Motoki Wu 245 Dec 28, 2022
Improving Object Detection by Estimating Bounding Box Quality Accurately

Improving Object Detection by Estimating Bounding Box Quality Accurately Abstrac

2 Apr 14, 2022
Updated for TTS(CE) = Also Known as TTN V3. The code requires the first server to be 'ttn' protocol.

Updated Updated for TTS(CE) = Also Known as TTN V3. The code requires the first server to be 'ttn' protocol. Introduction This balenaCloud (previously

Remko 1 Oct 17, 2021
A configurable, tunable, and reproducible library for CTR prediction

FuxiCTR This repo is the community dev version of the official release at huawei-noah/benchmark/FuxiCTR. Click-through rate (CTR) prediction is an cri

XUEPAI 397 Dec 30, 2022
Rotation-Only Bundle Adjustment

ROBA: Rotation-Only Bundle Adjustment Paper, Video, Poster, Presentation, Supplementary Material In this repository, we provide the implementation of

Seong 51 Nov 29, 2022
Official implementation of YOGO for Point-Cloud Processing

You Only Group Once: Efficient Point-Cloud Processing with Token Representation and Relation Inference Module By Chenfeng Xu, Bohan Zhai, Bichen Wu, T

Chenfeng Xu 67 Dec 20, 2022
DiffQ performs differentiable quantization using pseudo quantization noise. It can automatically tune the number of bits used per weight or group of weights, in order to achieve a given trade-off between model size and accuracy.

Differentiable Model Compression via Pseudo Quantization Noise DiffQ performs differentiable quantization using pseudo quantization noise. It can auto

Facebook Research 145 Dec 30, 2022
Efficient and intelligent interactive segmentation annotation software

Efficient and intelligent interactive segmentation annotation software

294 Dec 30, 2022
Flower classification model that classifies flowers in 10 classes made using transfer learning (~85% accuracy).

flower-classification-inceptionV3 Flower classification model that classifies flowers in 10 classes. Training and validation are done using a pre-anot

Ivan R. Mršulja 1 Dec 12, 2021
Official PyTorch Implementation of Learning Self-Similarity in Space and Time as Generalized Motion for Video Action Recognition, ICCV 2021

Official PyTorch Implementation of Learning Self-Similarity in Space and Time as Generalized Motion for Video Action Recognition, ICCV 2021

26 Dec 07, 2022
Source code for TACL paper "KEPLER: A Unified Model for Knowledge Embedding and Pre-trained Language Representation".

KEPLER: A Unified Model for Knowledge Embedding and Pre-trained Language Representation Source code for TACL 2021 paper KEPLER: A Unified Model for Kn

THU-KEG 138 Dec 22, 2022