Improving 3D Object Detection with Channel-wise Transformer

Last update: Dec 20, 2022

Overview

Thanks to OpenPCDet: this implementation of CT3D is mainly based on pcdet v0.3. Our paper appeared at ICCV 2021.

Overview of CT3D. The raw points are first fed into the RPN to generate 3D proposals. The raw points and their corresponding proposals are then processed by the channel-wise Transformer, which consists of a proposal-to-point encoding module and a channel-wise decoding module. The proposal-to-point encoding module modulates each point feature with global proposal-aware context information. The channel-wise decoding module then transforms the encoded point features into an effective proposal feature representation for confidence prediction and box regression.
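To make "proposal-aware" point features more concrete, here is a minimal sketch of one way to relate a raw point to a proposal: take the offsets from the point to the proposal's centre and its eight corners. This is an illustration under assumptions, not the repository's code. The function names and the `(cx, cy, cz, l, w, h, yaw)` box layout are hypothetical, and the actual model passes its features through learned layers that are not shown here.

```python
import math

def box_keypoints(box):
    """Return the centre and the 8 corners of a box (cx, cy, cz, l, w, h, yaw)."""
    cx, cy, cz, l, w, h, yaw = box
    cos_y, sin_y = math.cos(yaw), math.sin(yaw)
    keypoints = [(cx, cy, cz)]
    for sx in (-0.5, 0.5):
        for sy in (-0.5, 0.5):
            for sz in (-0.5, 0.5):
                # Corner in the box frame, rotated about the z axis, then translated.
                dx, dy, dz = sx * l, sy * w, sz * h
                keypoints.append((
                    cx + dx * cos_y - dy * sin_y,
                    cy + dx * sin_y + dy * cos_y,
                    cz + dz,
                ))
    return keypoints

def proposal_to_point_features(point, box):
    """Offsets from one point to the 9 box keypoints, flattened to 27 values."""
    feats = []
    for kx, ky, kz in box_keypoints(box):
        feats.extend((point[0] - kx, point[1] - ky, point[2] - kz))
    return feats
```

Each point inside a proposal then carries a 27-value description of where it sits relative to that proposal, which is the kind of input an encoder can use to weigh points against the box geometry.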

	AP@R11	AP@R40	Download
Only Car	86.06	85.79	model-car
3-Category (Car)	85.04	84.97	model-3cat
3-Category (Pedestrian)	56.28	55.58	-
3-Category (Cyclist)	71.71	71.88	-

1. Recommended Environment

Linux (tested on Ubuntu 16.04)
Python 3.6+
PyTorch 1.1 or higher (tested on PyTorch 1.6)
CUDA 9.0 or higher (PyTorch 1.3+ needs CUDA 9.2+)
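The version constraints above can be checked with a small helper before installing anything. The sketch below is illustrative only; the helper names are hypothetical, and it compares plain version strings rather than querying PyTorch or CUDA itself.

```python
import re
import sys

def parse_version(text):
    """Turn '1.6.0+cu101' into (1, 6, 0), ignoring any non-numeric suffix."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"no version number in {text!r}")
    parts = [int(part) for part in match.group(0).split(".")]
    # Pad to three components so '1.3' and '1.3.0' compare as equal.
    parts += [0] * (3 - len(parts))
    return tuple(parts)

def meets_minimum(found, minimum):
    """True if version string `found` is at least `minimum`."""
    return parse_version(found) >= parse_version(minimum)

def minimum_cuda_for_torch(torch_version):
    """Minimum CUDA version from the list above for a given PyTorch version."""
    return "9.2" if meets_minimum(torch_version, "1.3") else "9.0"

# Check the running interpreter against the Python 3.6+ requirement.
python_ok = meets_minimum("{}.{}".format(*sys.version_info[:2]), "3.6")
```

With PyTorch installed, `torch.__version__` and `torch.version.cuda` can be passed to `meets_minimum` in the same way.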

2. Set the Environment

pip install -r requirement.txt
python setup.py develop

3. Data Preparation

Prepare KITTI dataset and road planes

# Download KITTI and organize it into the following form:
├── data
│   ├── kitti
│   │   │── ImageSets
│   │   │── training
│   │   │   ├──calib & velodyne & label_2 & image_2 & (optional: planes)
│   │   │── testing
│   │   │   ├──calib & velodyne & image_2

# Generate data infos:
python -m pcdet.datasets.kitti.kitti_dataset create_kitti_infos tools/cfgs/dataset_configs/kitti_dataset.yaml
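Before generating the data infos, it can help to confirm that the KITTI folders match the layout above. The following is a minimal sketch; `missing_kitti_dirs` is a hypothetical helper that is not part of the repository, and it leaves out the optional `planes` folder.

```python
from pathlib import Path

# Required sub-directories, taken from the layout shown above.
REQUIRED_KITTI_DIRS = [
    "ImageSets",
    "training/calib",
    "training/velodyne",
    "training/label_2",
    "training/image_2",
    "testing/calib",
    "testing/velodyne",
    "testing/image_2",
]

def missing_kitti_dirs(kitti_root):
    """Return the required sub-directories that do not exist under kitti_root."""
    root = Path(kitti_root)
    return [rel for rel in REQUIRED_KITTI_DIRS if not (root / rel).is_dir()]
```

Calling `missing_kitti_dirs("data/kitti")` returns an empty list when the layout is complete.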

Prepare Waymo dataset

# Download Waymo and organize it into the following form:
├── data
│   ├── waymo
│   │   │── ImageSets
│   │   │── raw_data
│   │   │   │── segment-xxxxxxxx.tfrecord
│   │   │   │── ...
│   │   │── waymo_processed_data
│   │   │   │── segment-xxxxxxxx/
│   │   │   │── ...
│   │   │── pcdet_gt_database_train_sampled_xx/
│   │   │── pcdet_waymo_dbinfos_train_sampled_xx.pkl

# Install tf 2.1.0
# Install the official waymo-open-dataset by running the following command:
pip3 install --upgrade pip
pip3 install waymo-open-dataset-tf-2-1-0 --user

# Extract point cloud data from tfrecord and generate data infos:
python -m pcdet.datasets.waymo.waymo_dataset --func create_waymo_infos --cfg_file tools/cfgs/dataset_configs/waymo_dataset.yaml
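Extraction over many tfrecord files can be interrupted, so a quick way to see which segments still lack a processed folder is useful. This sketch assumes, based on the layout above, that each processed folder is named after its tfrecord file without the extension; `unprocessed_segments` is a hypothetical helper.

```python
from pathlib import Path

def unprocessed_segments(waymo_root):
    """Return names of raw segments that have no folder in waymo_processed_data."""
    root = Path(waymo_root)
    # Segment names from raw_data, without the .tfrecord extension.
    raw = {p.stem for p in (root / "raw_data").glob("segment-*.tfrecord")}
    # Segment folders already written by the extraction step.
    done = {
        p.name
        for p in (root / "waymo_processed_data").glob("segment-*")
        if p.is_dir()
    }
    return sorted(raw - done)
```

An empty result from `unprocessed_segments("data/waymo")` suggests that every raw segment has a processed counterpart.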

4. Train

Train with a single GPU

python train.py --cfg_file ${CONFIG_FILE}

# e.g.,
python train.py --cfg_file tools/cfgs/kitti_models/second_ct3d.yaml

Train with multiple GPUs or multiple machines

bash scripts/dist_train.sh ${NUM_GPUS} --cfg_file ${CONFIG_FILE}
# or 
bash scripts/slurm_train.sh ${PARTITION} ${JOB_NAME} ${NUM_GPUS} --cfg_file ${CONFIG_FILE}

# e.g.,
bash scripts/dist_train.sh 8 --cfg_file tools/cfgs/kitti_models/second_ct3d.yaml
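When launching runs from a script, the single-GPU and multi-GPU commands above can be assembled by one small function. This is only a sketch that reproduces the commands shown; `build_train_command` and `as_shell` are hypothetical names, and the slurm variant is left out.

```python
import shlex

def build_train_command(cfg_file, num_gpus=1):
    """Build the training command from above as an argument list."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    if num_gpus == 1:
        return ["python", "train.py", "--cfg_file", cfg_file]
    return ["bash", "scripts/dist_train.sh", str(num_gpus), "--cfg_file", cfg_file]

def as_shell(command):
    """Render an argument list as a single, safely quoted shell line."""
    return " ".join(shlex.quote(part) for part in command)
```

The returned list can be passed directly to `subprocess.run`, or printed with `as_shell` for copy and paste.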

5. Test

Test with a pretrained model:

python test.py --cfg_file ${CONFIG_FILE} --ckpt ${CKPT}

# e.g., 
python test.py --cfg_file tools/cfgs/kitti_models/second_ct3d.yaml --ckpt output/kitti_models/second_ct3d/default/kitti_val.pth

Owner

Hualian Sheng