Voxel Transformer for 3D object detection

Related tags

Deep LearningVOTR
Overview

Voxel Transformer

This is a reproduced repo of Voxel Transformer for 3D object detection.

The code is mainly based on OpenPCDet.

Introduction

We provide code and training configurations of VoTr-SSD/TSD on the KITTI and Waymo Open dataset. Checkpoints will not be released.

Important Notes: VoTr generally requires quite a long time (more than 60 epochs on Waymo) to converge, and a large GPU memory (32Gb) is needed for reproduction. Please strictly follow the instructions and train with sufficient number of epochs. If you don't have a 32G GPU, you can decrease the attention SIZE parameters in yaml files, but this may possibly harm the performance.

Requirements

The codes are tested in the following environment:

  • Ubuntu 18.04
  • Python 3.6
  • PyTorch 1.5
  • CUDA 10.1
  • OpenPCDet v0.3.0
  • spconv v1.2.1

Installation

a. Clone this repository.

git clone https://github.com/PointsCoder/VOTR.git

b. Install the dependent libraries as follows:

  • Install the dependent python libraries:
pip install -r requirements.txt 
  • Install the SparseConv library, we use the implementation from [spconv].
    • If you use PyTorch 1.1, then make sure you install the spconv v1.0 with (commit 8da6f96) instead of the latest one.
    • If you use PyTorch 1.3+, then you need to install the spconv v1.2. As mentioned by the author of spconv, you need to use their docker if you use PyTorch 1.4+.

c. Compile CUDA operators by running the following command:

python setup.py develop

Training

All the models are trained with Tesla V100 GPUs (32G). The KITTI config of votr_ssd is for training with a single GPU. Other configs are for training with 8 GPUs. If you use different number of GPUs for training, it's necessary to change the respective training epochs to attain a decent performance.

The performance of VoTr is quite unstable on KITTI. If you cannnot reproduce the results, remember to run it multiple times.

  • models
# votr_ssd.yaml: single-stage votr backbone replacing the spconv backbone
# votr_tsd.yaml: two-stage votr with pv-head
  • training votr_ssd on kitti
CUDA_VISIBLE_DEVICES=0 python train.py --cfg_file cfgs/kitti_models/votr_ssd.yaml
  • training other models
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 sh scripts/dist_train.sh 8 --cfg_file cfgs/waymo_models/votr_tsd.yaml
  • testing
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 sh scripts/dist_test.sh 8 --cfg_file cfgs/waymo_models/votr_tsd.yaml --eval_all

Citation

If you find this project useful in your research, please consider cite:

@article{mao2021voxel,
  title={Voxel Transformer for 3D Object Detection},
  author={Mao, Jiageng and Xue, Yujing and Niu, Minzhe and others},
  journal={ICCV},
  year={2021}
}
Owner
I may not respond to issues quickly. Send me an e-mail if necessary.
Code for paper Adaptively Aligned Image Captioning via Adaptive Attention Time

Adaptively Aligned Image Captioning via Adaptive Attention Time This repository includes the implementation for Adaptively Aligned Image Captioning vi

Lun Huang 45 Aug 27, 2022
Official implementation of Deep Reparametrization of Multi-Frame Super-Resolution and Denoising

Deep-Rep-MFIR Official implementation of Deep Reparametrization of Multi-Frame Super-Resolution and Denoising Publication: Deep Reparametrization of M

Goutam Bhat 39 Jan 04, 2023
Robustness between the worst and average case

Robustness between the worst and average case A repository that implements intermediate robustness training and evaluation from the NeurIPS 2021 paper

CMU Locus Lab 16 Dec 02, 2022
[3DV 2021] Channel-Wise Attention-Based Network for Self-Supervised Monocular Depth Estimation

Channel-Wise Attention-Based Network for Self-Supervised Monocular Depth Estimation This is the official implementation for the method described in Ch

Jiaxing Yan 27 Dec 30, 2022
A tool for making map images from OpenTTD save games

OpenTTD Surveyor A tool for making map images from OpenTTD save games. This is not part of the main OpenTTD codebase, nor is it ever intended to be pa

Aidan Randle-Conde 9 Feb 15, 2022
Self-Supervised Image Denoising via Iterative Data Refinement

Self-Supervised Image Denoising via Iterative Data Refinement Yi Zhang1, Dasong Li1, Ka Lung Law2, Xiaogang Wang1, Hongwei Qin2, Hongsheng Li1 1CUHK-S

Zhang Yi 72 Jan 01, 2023
Adaptive FNO transformer - official Pytorch implementation

Adaptive Fourier Neural Operators: Efficient Token Mixers for Transformers This repository contains PyTorch implementation of the Adaptive Fourier Neu

NVIDIA Research Projects 77 Dec 29, 2022
Out-of-Distribution Generalization of Chest X-ray Using Risk Extrapolation

OoD_Gen-Chest_Xray Out-of-Distribution Generalization of Chest X-ray Using Risk Extrapolation Requirements (Installations) Install the following libra

Enoch Tetteh 2 Oct 01, 2022
Tutorial for the PERFECTING FACTORY 5.0 WITH EDGE-POWERED AI workshop

Workshop Advantech Jetson Nano This tutorial has been designed for the PERFECTING FACTORY 5.0 WITH EDGE-POWERED AI workshop in collaboration with Adva

Edge Impulse 18 Nov 22, 2022
FedScale: Benchmarking Model and System Performance of Federated Learning

FedScale: Benchmarking Model and System Performance of Federated Learning (Paper) This repository contains scripts and instructions of building FedSca

268 Jan 01, 2023
Python package to add text to images, textures and different backgrounds

nider Python package for text images generation and watermarking Free software: MIT license Documentation: https://nider.readthedocs.io. nider is an a

Vladyslav Ovchynnykov 131 Dec 30, 2022
A state of the art of new lightweight YOLO model implemented by TensorFlow 2.

CSL-YOLO: A New Lightweight Object Detection System for Edge Computing This project provides a SOTA level lightweight YOLO called "Cross-Stage Lightwe

Miles Zhang 54 Dec 21, 2022
Tensorflow 2 Object Detection API kurulumu, GPU desteği, custom model hazırlama

Tensorflow 2 Object Detection API Bu tutorial, TensorFlow 2.x'in kararlı sürümü olan TensorFlow 2.3'ye yöneliktir. Bu, görüntülerde / videoda nesne a

46 Nov 20, 2022
PyTorch implementation of our paper How robust are discriminatively trained zero-shot learning models?

How robust are discriminatively trained zero-shot learning models? This repository contains the PyTorch implementation of our paper How robust are dis

Mehmet Kerim Yucel 5 Feb 04, 2022
Mip-NeRF: A Multiscale Representation for Anti-Aliasing Neural Radiance Fields.

This repository contains the code release for Mip-NeRF: A Multiscale Representation for Anti-Aliasing Neural Radiance Fields. This implementation is written in JAX, and is a fork of Google's JaxNeRF

Google 625 Dec 30, 2022
Codebase for arXiv preprint "NeRF++: Analyzing and Improving Neural Radiance Fields"

NeRF++ Codebase for arXiv preprint "NeRF++: Analyzing and Improving Neural Radiance Fields" Work with 360 capture of large-scale unbounded scenes. Sup

Kai Zhang 722 Dec 28, 2022
Implementation of ETSformer, state of the art time-series Transformer, in Pytorch

ETSformer - Pytorch Implementation of ETSformer, state of the art time-series Transformer, in Pytorch Install $ pip install etsformer-pytorch Usage im

Phil Wang 121 Dec 30, 2022
BaseCls BaseCls 是一个基于 MegEngine 的预训练模型库,帮助大家挑选或训练出更适合自己科研或者业务的模型结构

BaseCls BaseCls 是一个基于 MegEngine 的预训练模型库,帮助大家挑选或训练出更适合自己科研或者业务的模型结构。 文档地址:https://basecls.readthedocs.io 安装 安装环境 BaseCls 需要 Python = 3.6。 BaseCls 依赖 M

MEGVII Research 28 Dec 23, 2022
PyTorch implementation of "Learn to Dance with AIST++: Music Conditioned 3D Dance Generation."

Learn to Dance with AIST++: Music Conditioned 3D Dance Generation. Installation pip install -r requirements.txt Prepare Dataset bash data/scripts/pre

Zj Li 8 Sep 07, 2021
Neural network chess engine trained on Gary Kasparov's games.

Neural Chess It's not the best chess engine, but it is a chess engine. Proof of concept neural network chess engine (feed-forward multi-layer perceptr

3 Jun 22, 2022