Structure Information is the Key: Self-Attention RoI Feature Extractor in 3D Object Detection

Last update: Oct 07, 2022

Abstract: Unlike 2D object detection, where all RoI features come from grid pixels, RoI feature extraction in 3D point cloud object detection is more diverse. In this paper, we first compare and analyze the differences in structure and performance between two state-of-the-art models, PV-RCNN and Voxel-RCNN. We find that the performance gap between the two models comes not from point information but from structural information. Voxel features contain more structural information because they quantize the point cloud instead of downsampling it, and so retain essentially the complete information of the whole point cloud. This stronger structural information gives the detector higher performance in our experiments, even though the voxel features lack accurate location information. We therefore propose that structural information is the key to 3D object detection. Based on this conclusion, we propose a Self-Attention RoI Feature Extractor (SARFE) that enhances the structural information of the features extracted from 3D proposals. SARFE is a plug-and-play module that can easily be added to existing 3D detectors. We evaluate SARFE on both the KITTI dataset and the Waymo Open Dataset. With SARFE, we improve the performance of state-of-the-art 3D detectors by a large margin on the cyclist class of the KITTI dataset while keeping real-time capability.
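
To make the idea of self-attention over RoI features concrete, here is a minimal sketch of generic single-head scaled dot-product self-attention applied to the grid-point features of one 3D proposal. It is not the SARFE implementation, which has not been released. The function names, the 8-point grid, the feature sizes, and the random projection matrices are all assumptions chosen for illustration, and the code uses only the Python standard library so it runs without OpenPCDet or PyTorch.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(features, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention (illustrative only).

    features: N x C list of per-grid-point features for one proposal.
    w_q, w_k, w_v: C x D projection matrices (hypothetical parameters).
    Returns (N x D attended features, N x N attention weights).
    """
    q = matmul(features, w_q)
    k = matmul(features, w_k)
    v = matmul(features, w_v)
    scale = math.sqrt(len(w_q[0]))          # sqrt of the projection width D
    k_t = [list(col) for col in zip(*k)]    # transpose K to D x N
    scores = matmul(q, k_t)                 # N x N pairwise similarities
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v), weights


def random_matrix(rows, cols, rng):
    """Build a rows x cols matrix of uniform values in [-1, 1]."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
num_grid_points, in_dim, out_dim = 8, 4, 4  # assumed sizes, e.g. a 2x2x2 RoI grid
roi_features = random_matrix(num_grid_points, in_dim, rng)
w_q = random_matrix(in_dim, out_dim, rng)
w_k = random_matrix(in_dim, out_dim, rng)
w_v = random_matrix(in_dim, out_dim, rng)

attended, attn = self_attention(roi_features, w_q, w_k, w_v)
print(len(attended), len(attended[0]))
```

Each output row mixes the features of every grid point in the proposal, weighted by similarity, which is one way an extractor could expose relationships between parts of an object. A real detector would learn the projection matrices and operate on batched tensors.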

The source code will be published after the paper has been accepted to a conference.

Full paper

AP on KITTI Dataset

Submission link

AP on Waymo Open Dataset

Submission link

License

This code is released under the Apache 2.0 license.

Acknowledgement

Our code is mainly based on OpenPCDet. Thanks to its authors for their contributions!

Owner

DK. Zhang
