RTS3D: Real-time Stereo 3D Detection from 4D Feature-Consistency Embedding Space for Autonomous Driving

Last update: Nov 29, 2022

Overview

RTS3D: Real-time Stereo 3D Detection from 4D Feature-Consistency Embedding Space for Autonomous Driving (AAAI2021).

RTS3D is an efficient and accurate stereo 3D object detection method for autonomous driving.

RTS3D

Introduction

RTS3D is the first true real-time system (FPS > 24) for stereo image 3D detection, and it achieves a 10% improvement in average precision compared with the previous state-of-the-art method. RTS3D requires only RGB images, without synthetic data, instance segmentation, CAD models, or a depth generator.

Highlights

Fast: 33 FPS single-image test speed on the KITTI benchmark at 384*1280 resolution.
Accurate: SOTA on the KITTI benchmark.
Anchor free: No 2D or 3D anchors are required.
Easy to deploy: RTS3D uses conventional convolution operations and MLPs, so it is easy to deploy and accelerate.

RTS3D Baseline and Model Zoo

All experiments were run on Ubuntu 16.04 with PyTorch 1.0.0, CUDA 9.0, Python 3.6, and a single NVIDIA 2080Ti.

IoU Setting 1: Car IoU > 0.5, Pedestrian IoU > 0.25, Cyclist IoU > 0.25

IoU Setting 2: Car IoU > 0.7, Pedestrian IoU > 0.5, Cyclist IoU > 0.5
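The two settings above can be written as a small lookup table. This is a minimal sketch for illustration only: the names `IOU_THRESHOLDS` and `is_match` are hypothetical and are not part of the RTS3D code or the KITTI evaluation script.

```python
# Class-specific IoU thresholds, copied from the two settings listed above.
IOU_THRESHOLDS = {
    1: {"Car": 0.5, "Pedestrian": 0.25, "Cyclist": 0.25},
    2: {"Car": 0.7, "Pedestrian": 0.5, "Cyclist": 0.5},
}


def is_match(cls_name: str, iou: float, setting: int) -> bool:
    """Return True if a detection overlaps a ground-truth box enough to count as a match."""
    # The strict ">" follows the wording above; an actual evaluation script
    # may treat the boundary case differently (assumption).
    return iou > IOU_THRESHOLDS[setting][cls_name]
```

For example, a car detection with an IoU of 0.6 counts as a match under Setting 1 but not under Setting 2.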

Trained on the KITTI train split and evaluated on the val split.
- FCE Space Resolution: 10 * 10 * 10
- Model: (Google Drive), (Baidu Cloud, extraction code: k4uk)

Class	Iteration	FPS	AP BEV IoU Setting1	AP 3D IoU Setting1	AP BEV IoU Setting2	AP 3D IoU Setting2
-	-	-	Easy / Moderate / Hard	Easy / Moderate / Hard	Easy / Moderate / Hard	Easy / Moderate / Hard
Car (Recall-11)	1	90.9	89.83, 77.05, 68.28	89.27, 70.12, 61.17	73.20, 53.62, 46.44	60.87, 42.38, 36.44
Car (Recall-40)	1	90.9	92.92, 76.17, 66.62	90.35, 71.37, 63.52	78.12, 54.75, 47.09	60.34, 39.32, 32.97
Car (Recall-11)	2	45.5	90.41, 78.70, 70.03	90.26, 77.23, 68.28	76.56, 56.46, 48.20	63.65, 44.50, 37.48
Car (Recall-40)	2	45.5	95.75, 79.61, 69.69	93.57, 76.64, 66.72	78.12, 54.75, 47.09	63.99, 41.78, 34.96
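Recall-11 and Recall-40 in the table refer to the number of recall positions at which interpolated precision is averaged on KITTI. The sketch below shows the difference between the two, assuming a precision/recall curve is already available. The function name `interpolated_ap` and its arguments are hypothetical; this is not the kitti_eval implementation.

```python
import numpy as np


def interpolated_ap(recall, precision, num_points):
    """Average interpolated precision (in percent) over fixed recall positions."""
    recall = np.asarray(recall, dtype=float)
    precision = np.asarray(precision, dtype=float)
    if num_points == 11:
        # Recall-11: positions 0.0, 0.1, ..., 1.0
        positions = np.linspace(0.0, 1.0, 11)
    elif num_points == 40:
        # Recall-40: positions 1/40, 2/40, ..., 1.0 (recall 0 is excluded)
        positions = np.linspace(1.0 / 40, 1.0, 40)
    else:
        raise ValueError("num_points must be 11 or 40")

    total = 0.0
    for r in positions:
        # Small tolerance guards against floating-point error at the positions.
        mask = recall >= r - 1e-9
        # Interpolated precision: best precision at any recall >= r, else 0.
        total += precision[mask].max() if mask.any() else 0.0
    return 100.0 * total / len(positions)


# Toy curve, made up for illustration (not RTS3D results).
toy_recall = [0.2, 0.4, 0.6]
toy_precision = [1.0, 0.8, 0.6]
ap_r11 = interpolated_ap(toy_recall, toy_precision, 11)
ap_r40 = interpolated_ap(toy_recall, toy_precision, 40)
```

Because the two metrics sample the curve at different recall positions, they generally give different numbers for the same detections. Compare rows only within the same recall setting.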

Trained on the KITTI train split and evaluated on the val split.
- FCE Space Resolution: 10 * 10 * 10
- Recall split: 11
- Iteration: 2
- Model: (Google Drive), (Baidu Cloud, extraction code: 4t4u)

Class	AP BEV IoU Setting1	AP 3D IoU Setting1	AP BEV IoU Setting2	AP 3D IoU Setting2
-	Easy / Moderate / Hard	Easy / Moderate / Hard	Easy / Moderate / Hard	Easy / Moderate / Hard
Car	90.18, 78.46, 69.76	89.88, 76.64, 67.86	74.95, 54.07, 46.78	58.50, 39.74, 34.83
Pedestrian	57.12, 48.82, 40.88	56.36, 48.29, 40.22	32.16, 26.31, 21.28	26.95, 20.77, 19.74
Cyclist	54.48, 35.78, 30.80	53.86, 30.90, 30.52	33.59, 20.80, 20.14	31.05, 20.26, 18.93

Installation

Please refer to INSTALL.md

Dataset preparation

Please download the official KITTI 3D object detection dataset and organize the downloaded files as follows:

KM3DNet
├── kitti_format
│   ├── data
│   │   ├── kitti
│   │   │   ├── annotations
│   │   │   ├── calib /000000.txt .....
│   │   │   ├── image (left [0-7480], right [7481-14961], input augmentation)
│   │   │   ├── label /000000.txt .....
│   │   │   ├── train.txt val.txt trainval.txt
│   │   │   ├── mono_results /000000.txt .....
├── src
├── demo_kitti_format
├── readme
├── requirements.txt
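The layout above keeps left and right images in one `image` folder, with left images at indices 0 to 7480 and right images at 7481 to 14961. A minimal sketch of pairing them follows. It assumes that the right image for left index `i` is stored at index `i + 7481`, that file names are zero-padded to six digits (as in `calib/000000.txt`), and that images use a `.png` extension. None of these assumptions is confirmed by this README, and the helper `stereo_pair` is hypothetical.

```python
from pathlib import Path

# Left images occupy indices 0..7480 in the layout above.
NUM_LEFT_IMAGES = 7481


def stereo_pair(image_dir, left_index, ext=".png"):
    """Return (left_path, right_path) for one stereo pair.

    Assumes the right image is stored at left_index + 7481 and that
    file names are six-digit zero-padded (both are assumptions).
    """
    if not 0 <= left_index < NUM_LEFT_IMAGES:
        raise ValueError(f"left_index must be in [0, {NUM_LEFT_IMAGES - 1}]")
    image_dir = Path(image_dir)
    left = image_dir / f"{left_index:06d}{ext}"
    right = image_dir / f"{left_index + NUM_LEFT_IMAGES:06d}{ext}"
    return left, right


left_path, right_path = stereo_pair("kitti_format/data/kitti/image", 0)
```

Check the result against the data loader in `src` before relying on it, since the actual pairing rule is defined there.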

Getting Started

Please refer to GETTING_STARTED.md for more on how to use this project.

Acknowledgement

License

RTS3D is released under the MIT License (refer to the LICENSE file for details). Portions of the code are borrowed from CenterNet, iou3d, and kitti_eval (KITTI dataset evaluation). Please refer to the original licenses of these projects (see NOTICE).

Citation

If you find this project useful for your research, please use the following BibTeX entry.

@misc{2012.15072,
Author = {Peixuan Li and Shun Su and Huaici Zhao},
Title = {RTS3D: Real-time Stereo 3D Detection from 4D Feature-Consistency Embedding Space for Autonomous Driving},
Year = {2020},
Eprint = {arXiv:2012.15072},
}
