PointPillars inference with TensorRT

Last update: Dec 31, 2022

Overview

This repository contains the sources and model for PointPillars inference using TensorRT. The model is created with OpenPCDet and modified with onnx_graphsurgeon.

Inference has four parts:

generateVoxels: converts the point cloud into voxels with 4 channels
generateFeatures: converts the voxels into feature maps with 10 channels
Inference: converts the feature maps into raw bounding box, class score, and direction data
Postprocessing: parses the bounding boxes, class scores, and directions
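
The 10-channel layout is not spelled out here. As a rough sketch, assuming it follows the OpenPCDet-style pillar encoding (x, y, z, intensity, offset from the mean of the points in the pillar, offset from the pillar's geometric center), a CPU equivalent of generateFeatures for a single pillar might look like the code below. The names `Point`, `Feature`, and `decoratePillar`, and the way the pillar center is passed in, are hypothetical and not taken from the repository.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// One lidar point: the 4 input channels.
struct Point { float x, y, z, intensity; };

// 10 output channels per point (assumed layout, see the text above):
// x, y, z, intensity, offset to the pillar mean (3), offset to the pillar center (3).
using Feature = std::array<float, 10>;

// pts: the points that fell into one pillar.
// cx, cy, cz: geometric center of that pillar.
std::vector<Feature> decoratePillar(const std::vector<Point>& pts,
                                    float cx, float cy, float cz) {
    std::vector<Feature> out;
    if (pts.empty()) return out;

    // Arithmetic mean of the points in the pillar.
    float mx = 0.0f, my = 0.0f, mz = 0.0f;
    for (const Point& p : pts) { mx += p.x; my += p.y; mz += p.z; }
    const float n = static_cast<float>(pts.size());
    mx /= n; my /= n; mz /= n;

    out.reserve(pts.size());
    for (const Point& p : pts) {
        Feature f = {p.x, p.y, p.z, p.intensity,
                     p.x - mx, p.y - my, p.z - mz,
                     p.x - cx, p.y - cy, p.z - cz};
        out.push_back(f);
    }
    return out;
}
```

Each input point yields one 10-element feature vector, so a pillar holding N points produces an N x 10 block.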

Data

The demo uses data from the KITTI Dataset; more data can be downloaded by following the GETTING_STARTED link.
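
KITTI velodyne scans are stored as flat binary files of 32-bit floats, four per point (x, y, z, reflectance). A minimal loader sketch, assuming a little-endian host; `loadKittiBin` is a hypothetical helper written for this example, not a function from the demo:

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Reads a KITTI velodyne .bin file into a flat float array laid out as
// [x0, y0, z0, r0, x1, y1, z1, r1, ...].
// Returns an empty vector if the file cannot be opened or read.
std::vector<float> loadKittiBin(const std::string& path) {
    // Open at the end so tellg() reports the file size.
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in) return {};

    const std::streamsize bytes = in.tellg();
    in.seekg(0, std::ios::beg);

    // Ignore any trailing bytes that do not form a complete 4-float point.
    const std::size_t pointBytes = 4 * sizeof(float);
    const std::size_t numPoints = static_cast<std::size_t>(bytes) / pointBytes;

    std::vector<float> data(numPoints * 4);
    in.read(reinterpret_cast<char*>(data.data()),
            static_cast<std::streamsize>(data.size() * sizeof(float)));
    if (!in) return {};
    return data;
}
```

The number of points in the scan is `data.size() / 4`.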

Model

The onnx file can be generated from a model trained with OpenPCDet, using the conversion tool included in the demo.

Build

Prerequisites

Building the PointPillars inference demo requires CUDA and TensorRT with a PillarScatter layer. The PillarScatter layer is already implemented as a TensorRT plugin in the demo.

Jetpack 4.5
TensorRT v7.1.3
CUDA-10.2 + cuDNN-8.0.0
PCL is optional; it is only needed to store point clouds as pcd files

Compile

$ cd test
$ mkdir build
$ cd build
$ cmake ..
$ make -j$(nproc)

Run

$ ./demo

Environments

Jetpack 4.5
CUDA 10.2 + cuDNN 8.0.0 + TensorRT 7.1.3
NVIDIA Jetson AGX Xavier

Performance

FP16

| Stage             | GPU time (ms) |
| ----------------- | ------ |
| generateVoxels    | 0.22   |
| generateFeatures  | 0.21   |
| Inference         | 30.75  |
| Postprocessing    | 3.19   |
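
Summing the four stages gives a rough end-to-end figure, assuming they run back to back with no overlap (the table does not say whether they do):

$$
t_{\text{total}} = 0.22 + 0.21 + 30.75 + 3.19 = 34.37\ \text{ms}
\quad\Rightarrow\quad
\frac{1000}{34.37} \approx 29.1\ \text{frames/s}
$$

Under that assumption the network inference stage accounts for about 89% of the total ($30.75 / 34.37 \approx 0.89$).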

Note

The GPU processes all points in parallel, so the points kept for each voxel are picked from the point cloud in no fixed order and the output of generateVoxels can differ between runs. The CPU version always keeps the first 32 points of each voxel, so its output is fixed.
The demo caches the onnx file to improve performance. To use a new onnx file, first remove the cache file in "./model".
MAX_VOXELS in params.h sets the size of the cache allocated during inference. Decrease the value to save memory.
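
The CPU behaviour described in the first note can be sketched as follows. This is an illustration only: the grid description and the names `GridParams`, `Pillar`, and `voxelizeCpu` are assumptions made for the example, and `maxVoxels` merely plays the role of MAX_VOXELS. Points are visited in input order and each pillar keeps the first `maxPoints` it receives, so the result is identical on every run.

```cpp
#include <cmath>
#include <cstddef>
#include <unordered_map>
#include <vector>

struct Point { float x, y, z, intensity; };

// Hypothetical grid description; the demo keeps its own values in params.h.
struct GridParams {
    float minX, minY;        // lower corner of the detection range (m)
    float maxX, maxY;        // upper corner of the detection range (m)
    float pillarSize;        // edge length of a pillar in x and y (m)
    std::size_t maxPoints;   // points kept per pillar (32 in the note above)
    std::size_t maxVoxels;   // cap on the number of pillars
};

struct Pillar {
    int ix, iy;                 // grid coordinates of the pillar
    std::vector<Point> points;  // at most maxPoints entries, in input order
};

std::vector<Pillar> voxelizeCpu(const std::vector<Point>& cloud, const GridParams& g) {
    const int nx = static_cast<int>(std::ceil((g.maxX - g.minX) / g.pillarSize));
    std::vector<Pillar> pillars;
    std::unordered_map<long long, std::size_t> index;  // grid cell -> slot in pillars

    for (const Point& p : cloud) {
        // Skip points outside the range (this also rejects NaN coordinates).
        if (!(p.x >= g.minX && p.x < g.maxX && p.y >= g.minY && p.y < g.maxY)) continue;

        const int ix = static_cast<int>((p.x - g.minX) / g.pillarSize);
        const int iy = static_cast<int>((p.y - g.minY) / g.pillarSize);
        const long long key = static_cast<long long>(iy) * nx + ix;

        auto it = index.find(key);
        if (it == index.end()) {
            if (pillars.size() >= g.maxVoxels) continue;  // no pillar slots left
            it = index.emplace(key, pillars.size()).first;
            pillars.push_back(Pillar{ix, iy, {}});
        }
        // Keep only the first maxPoints points of each pillar.
        Pillar& pillar = pillars[it->second];
        if (pillar.points.size() < g.maxPoints) pillar.points.push_back(p);
    }
    return pillars;
}
```

In this sketch the worst-case output size is `maxVoxels * maxPoints` points, which is why lowering the voxel cap reduces memory use at the cost of dropping points once the cap is reached.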

Owner

NVIDIA AI IOT
