Embracing Single Stride 3D Object Detector with Sparse Transformer

Related tags

Deep LearningSST
Overview

SST: Single-stride Sparse Transformer

This is the official implementation of paper:

Embracing Single Stride 3D Object Detector with Sparse Transformer

Authors: Lue Fan, Ziqi Pang, Tianyuan Zhang, Yu-Xiong Wang, Hang Zhao, Feng Wang, Naiyan Wang, Zhaoxiang Zhang

Paper Link (Check again on Monday)

Introduction and Highlights

  • SST is a single-stride network, which maintains original feature resolution from the beginning to the end of the network. Due to the characterisric of single stride, SST achieves exciting performances on small object detection (Pedestrian, Cyclist).
  • For simplicity, except for backbone, SST is almost the same with the basic PointPillars in MMDetection3D. With such a basic setting, SST achieves state-of-the-art performance in Pedestrian and Cyclist and outperforms PointPillars more than 10 AP only at a cost of 1.5x latency.
  • SST consists of 6 Regional Sparse Attention (SRA) blocks, which deal with the sparse voxel set. It's similar to Submanifold Sparse Convolution (SSC), but much more powerful than SSC. It's locality and sparsity guarantee the efficiency in the single stride setting.
  • The SRA can also be used in many other task to process sparse point clouds. Our implementation of SRA only relies on the pure Python APIs in PyTorch without engineering efforts as taken in the CUDA implementation of sparse convolution.
  • Large room for further improvements. For example, second stage, anchor-free head, IoU scores and advanced techniques from ViT, etc.

Usage

PyTorch >= 1.9 is highly recommended for a better support of the checkpoint technique.

Our immplementation is based on MMDetection3D, so just follow their getting_started and simply run the script: run.sh. Then you will get a basic results of SST after 5~7 hours (depends on your devices).

We only provide the single-stage model here, as for our two-stage models, please follow LiDAR-RCNN. It's also a good choice to apply other powerful second stage detectors to our single-stage SST.

Main results

Single-stage Model (based on PointPillars) on Waymo validation split

#Sweeps Veh_L1 Ped_L1 Cyc_L1
SST_1f 1 73.57 80.01 70.72
SST_3f 3 75.16 83.24 75.96

Note that we train the 3 classes together, so the performance above is a little bit lower than that reported in our paper.

TODO

  • Build SRA block with similar API as Sparse Convolution for more convenient usage.

Acknowlegement

This project is based on the following codebases.

Owner
TuSimple
The Future of Trucking
TuSimple
Bootstrapped Unsupervised Sentence Representation Learning (ACL 2021)

Install first pip3 install -e . Training python3 training/unsupervised_tuning.py python3 training/supervised_tuning.py python3 training/multilingual_

yanzhang_nlp 26 Jul 22, 2022
《LightXML: Transformer with dynamic negative sampling for High-Performance Extreme Multi-label Text Classification》(AAAI 2021) GitHub:

LightXML: Transformer with dynamic negative sampling for High-Performance Extreme Multi-label Text Classification

76 Dec 05, 2022
Single Image Random Dot Stereogram for Tensorflow

TensorFlow-SIRDS Single Image Random Dot Stereogram for Tensorflow SIRDS is a means to present 3D data in a 2D image. It allows for scientific data di

Greg Peatfield 5 Aug 10, 2022
Dense matching library based on PyTorch

Dense Matching A general dense matching library based on PyTorch. For any questions, issues or recommendations, please contact Prune at

Prune Truong 399 Dec 28, 2022
Implements a fake news detection program using classifiers.

Fake news detection Implements a fake news detection program using classifiers for Data Mining course at UoA. Description The project is the categoriz

Apostolos Karvelas 1 Jan 09, 2022
Libraries, tools and tasks created and used at DeepMind Robotics.

Libraries, tools and tasks created and used at DeepMind Robotics.

DeepMind 270 Nov 30, 2022
Practical Single-Image Super-Resolution Using Look-Up Table

Practical Single-Image Super-Resolution Using Look-Up Table [Paper] Dependency Python 3.6 PyTorch glob numpy pillow tqdm tensorboardx 1. Training deep

Younghyun Jo 116 Dec 23, 2022
Tech Resources for Academic Communities

Free tech resources for faculty, students, researchers, life-long learners, and academic community builders for use in tech based courses, workshops, and hackathons.

Microsoft 2.5k Jan 04, 2023
Experiments for distributed optimization algorithms

Network-Distributed Algorithm Experiments -- This repository contains a set of optimization algorithms and objective functions, and all code needed to

Boyue Li 40 Dec 04, 2022
Unsupervised Video Interpolation using Cycle Consistency

Unsupervised Video Interpolation using Cycle Consistency Project | Paper | YouTube Unsupervised Video Interpolation using Cycle Consistency Fitsum A.

NVIDIA Corporation 100 Nov 30, 2022
MDETR: Modulated Detection for End-to-End Multi-Modal Understanding

MDETR: Modulated Detection for End-to-End Multi-Modal Understanding Website • Colab • Paper This repository contains code and links to pre-trained mod

Aishwarya Kamath 770 Dec 28, 2022
GT China coal model

GT China coal model The full version of a China coal transport model with a very high spatial reslution. What it does The code works in a few steps: T

0 Dec 13, 2021
Multiple paper open-source codes of the Microsoft Research Asia DKI group

📫 Paper Code Collection (MSRA DKI Group) This repo hosts multiple open-source codes of the Microsoft Research Asia DKI Group. You could find the corr

Microsoft 249 Jan 08, 2023
Code for Neurips2021 Paper "Topology-Imbalance Learning for Semi-Supervised Node Classification".

Topology-Imbalance Learning for Semi-Supervised Node Classification Introduction Code for NeurIPS 2021 paper "Topology-Imbalance Learning for Semi-Sup

Victor Chen 40 Nov 23, 2022
NeRF visualization library under construction

NeRF visualization library using PlenOctrees, under construction pip install nerfvis Docs will be at: https://nerfvis.readthedocs.org import nerfvis s

Alex Yu 196 Jan 04, 2023
Source code for our EMNLP'21 paper 《Raise a Child in Large Language Model: Towards Effective and Generalizable Fine-tuning》

Child-Tuning Source code for EMNLP 2021 Long paper: Raise a Child in Large Language Model: Towards Effective and Generalizable Fine-tuning. 1. Environ

46 Dec 12, 2022
DeepHawkeye is a library to detect unusual patterns in images using features from pretrained neural networks

English | 简体中文 Introduction DeepHawkeye is a library to detect unusual patterns in images using features from pretrained neural networks Reference Pat

CV Newbie 28 Dec 13, 2022
Official repository for "Intriguing Properties of Vision Transformers" (2021)

Intriguing Properties of Vision Transformers Muzammal Naseer, Kanchana Ranasinghe, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, & Ming-Hsuan Yang P

Muzammal Naseer 155 Dec 27, 2022
Configure SRX interfaces with Scrapli

Configure SRX interfaces with Scrapli Overview This example will show how to configure interfaces on Juniper's SRX firewalls. In addition to the Pytho

Calvin Remsburg 1 Jan 07, 2022