A general python framework for single object tracking in LiDAR point clouds, based on PyTorch Lightning.

Overview

Open3DSOT

A general python framework for single object tracking in LiDAR point clouds, based on PyTorch Lightning.

The official code release of BAT and MM Track.

Features

  • Modular design. It is easy to config the model and training/testing behaviors through just a .yaml file.
  • DDP support for both training and testing.
  • Support all common tracking datasets (KITTI, NuScenes, Waymo Open Dataset).

📣 One tracking paper is accepted by CVPR2022 (Oral)! 👇

Trackers

This repository includes the implementation of the following models:

MM-Track (CVPR2022 Oral)

[Paper] [Project Page]

MM-Track is the first motion-centric tracker in LiDAR SOT, which robustly handles distractors and drastic appearance changes in complex driving scenes. Unlike previous methods, MM-Track is a matching-free two-stage tracker which localizes the targets by explicitly modeling the "relative target motion" among frames.

BAT (ICCV2021)

[Paper] [Results]

Official implementation of BAT. BAT uses the BBox information to compensate the information loss of incomplete scans. It augments the target template with box-aware features that efficiently and effectively improve appearance matching.

P2B (CVPR2020)

[Paper] [Official implementation]

Third party implementation of P2B. Our implementation achieves better results than the official code release. P2B adapts SiamRPN to 3D point clouds by integrating a pointwise correlation operator with a point-based RPN (VoteNet).

Setup

Installation

  • Create the environment

    git clone https://github.com/Ghostish/Open3DSOT.git
    cd Open3DSOT
    conda create -n Open3DSOT  python=3.6
    conda activate Open3DSOT
    
  • Install pytorch

    conda install pytorch==1.4.0 torchvision==0.5.0 cudatoolkit=10.1 -c pytorch
    

    Our code is well tested with pytorch 1.4.0 and CUDA 10.1. But other platforms may also work. Follow this to install another version of pytorch. Note: In order to reproduce the reported results with the provided checkpoints, please use CUDA 10.x.

  • Install other dependencies:

    pip install -r requirement.txt
    

    Install the nuscenes-devkit if you use want to use NuScenes dataset:

    pip install nuscenes-devkit
    

KITTI dataset

  • Download the data for velodyne, calib and label_02 from KITTI Tracking.
  • Unzip the downloaded files.
  • Put the unzipped files under the same folder as following.
    [Parent Folder]
    --> [calib]
        --> {0000-0020}.txt
    --> [label_02]
        --> {0000-0020}.txt
    --> [velodyne]
        --> [0000-0020] folders with velodynes .bin files
    

NuScenes dataset

  • Download the dataset from the download page
  • Extract the downloaded files and make sure you have the following structure:
    [Parent Folder]
      samples	-	Sensor data for keyframes.
      sweeps	-	Sensor data for intermediate frames.
      maps	        -	Folder for all map files: rasterized .png images and vectorized .json files.
      v1.0-*	-	JSON tables that include all the meta data and annotations. Each split (trainval, test, mini) is provided in a separate folder.
    

Note: We use the train_track split to train our model and test it with the val split. Both splits are officially provided by NuScenes. During testing, we ignore the sequences where there is no point in the first given bbox.

Waymo dataset

  • Download and prepare dataset by the instruction of CenterPoint.
    [Parent Folder]
      tfrecord_training	                    
      tfrecord_validation	                 
      train 	                                    -	all training frames and annotations 
      val   	                                    -	all validation frames and annotations 
      infos_train_01sweeps_filter_zero_gt.pkl
      infos_val_01sweeps_filter_zero_gt.pkl
    
  • Prepare SOT dataset. Data from specific category and split will be merged (e.g., sot_infos_vehicle_train.pkl).
  python datasets/generate_waymo_sot.py

Quick Start

Training

To train a model, you must specify the .yaml file with --cfg argument. The .yaml file contains all the configurations of the dataset and the model. Currently, we provide four .yaml files under the cfgs directory. Note: Before running the code, you will need to edit the .yaml file by setting the path argument as the correct root of the dataset.

python main.py --gpu 0 1 --cfg cfgs/BAT_Car.yaml  --batch_size 50 --epoch 60 --preloading

After you start training, you can start Tensorboard to monitor the training process:

tensorboard --logdir=./ --port=6006

By default, the trainer runs a full evaluation on the full test split after training every epoch. You can set --check_val_every_n_epoch to a larger number to speed up the training. The --preloading flag is used to preload the training samples into the memory to save traning time. Remove this flag if you don't have enough memory.

Testing

To test a trained model, specify the checkpoint location with --checkpoint argument and send the --test flag to the command.

python main.py --gpu 0 1 --cfg cfgs/BAT_Car.yaml  --checkpoint /path/to/checkpoint/xxx.ckpt --test

Reproduction

Model Category Success Precision Checkpoint
BAT-KITTI Car 65.37 78.88 pretrained_models/bat_kitti_car.ckpt
BAT-NuScenes Car 40.73 43.29 pretrained_models/bat_nuscenes_car.ckpt
BAT-KITTI Pedestrian 45.74 74.53 pretrained_models/bat_kitti_pedestrian.ckpt

Three trained BAT models for KITTI and NuScenes datasets are provided in the pretrained_models directory. To reproduce the results, simply run the code with the corresponding .yaml file and checkpoint. For example, to reproduce the tracking results on KITTI Car, just run:

python main.py --gpu 0 1 --cfg cfgs/BAT_Car.yaml  --checkpoint ./pretrained_models/bat_kitti_car.ckpt --test

Acknowledgment

  • This repo is built upon P2B and SC3D.
  • Thank Erik Wijmans for his pytorch implementation of PointNet++

License

This repository is released under MIT License (see LICENSE file for details).

Owner
Kangel Zenn
Ph.D. Student in CUHKSZ.
Kangel Zenn
Vision Transformer for 3D medical image registration (Pytorch).

ViT-V-Net: Vision Transformer for Volumetric Medical Image Registration keywords: vision transformer, convolutional neural networks, image registratio

Junyu Chen 192 Dec 20, 2022
An API-first distributed deployment system of deep learning models using timeseries data to analyze and predict systems behaviour

Gordo Building thousands of models with timeseries data to monitor systems. Table of content About Examples Install Uninstall Developer manual How to

Equinor 26 Dec 27, 2022
This repository contains PyTorch code for Robust Vision Transformers.

This repository contains PyTorch code for Robust Vision Transformers.

117 Dec 07, 2022
Generate indoor scenes with Transformers

SceneFormer: Indoor Scene Generation with Transformers Initial code release for the Sceneformer paper, contains models, train and test scripts for the

Chandan Yeshwanth 110 Dec 06, 2022
Creating multimodal multitask models

Fusion Brain Challenge The English version of the document can be found here. Обновления 01.11 Мы выкладываем пример данных, аналогичных private test

Sber AI 43 Nov 28, 2022
public repo for ESTER dataset and modeling (EMNLP'21)

Project / Paper Introduction This is the project repo for our EMNLP'21 paper: https://arxiv.org/abs/2104.08350 Here, we provide brief descriptions of

PlusLab 19 Oct 27, 2022
[WACV 2020] Reducing Footskate in Human Motion Reconstruction with Ground Contact Constraints

Reducing Footskate in Human Motion Reconstruction with Ground Contact Constraints Official implementation for Reducing Footskate in Human Motion Recon

Virginia Tech Vision and Learning Lab 38 Nov 01, 2022
Densely Connected Search Space for More Flexible Neural Architecture Search (CVPR2020)

DenseNAS The code of the CVPR2020 paper Densely Connected Search Space for More Flexible Neural Architecture Search. Neural architecture search (NAS)

Jamin Fong 291 Nov 18, 2022
Baseline model for "GraspNet-1Billion: A Large-Scale Benchmark for General Object Grasping" (CVPR 2020)

GraspNet Baseline Baseline model for "GraspNet-1Billion: A Large-Scale Benchmark for General Object Grasping" (CVPR 2020). [paper] [dataset] [API] [do

GraspNet 209 Dec 29, 2022
EPSANet:An Efficient Pyramid Split Attention Block on Convolutional Neural Network

EPSANet:An Efficient Pyramid Split Attention Block on Convolutional Neural Network This repo contains the official Pytorch implementaion code and conf

Hu Zhang 175 Jan 07, 2023
Reinforcement learning library(framework) designed for PyTorch, implements DQN, DDPG, A2C, PPO, SAC, MADDPG, A3C, APEX, IMPALA ...

Automatic, Readable, Reusable, Extendable Machin is a reinforcement library designed for pytorch. Build status Platform Status Linux Windows Supported

Iffi 348 Dec 24, 2022
The implementation of FOLD-R++ algorithm

FOLD-R-PP The implementation of FOLD-R++ algorithm. The target of FOLD-R++ algorithm is to learn an answer set program for a classification task. Inst

13 Dec 23, 2022
Implementation for Shape from Polarization for Complex Scenes in the Wild

sfp-wild Implementation for Shape from Polarization for Complex Scenes in the Wild project website | paper Code and dataset will be released soon. Int

Chenyang LEI 41 Dec 23, 2022
DC3: A Learning Method for Optimization with Hard Constraints

DC3: A learning method for optimization with hard constraints This repository is by Priya L. Donti, David Rolnick, and J. Zico Kolter and contains the

CMU Locus Lab 57 Dec 26, 2022
A High-Quality Real Time Upscaler for Anime Video

Anime4K Anime4K is a set of open-source, high-quality real-time anime upscaling/denoising algorithms that can be implemented in any programming langua

15.7k Jan 06, 2023
TorchMultimodal is a PyTorch library for training state-of-the-art multimodal multi-task models at scale.

TorchMultimodal (Alpha Release) Introduction TorchMultimodal is a PyTorch library for training state-of-the-art multimodal multi-task models at scale.

Meta Research 663 Jan 06, 2023
Deep Learning applied to Integral data analysis

DeepIntegralCompton Deep Learning applied to Integral data analysis Module installation Move to the root directory of the project and execute : pip in

Thomas Vuillaume 1 Dec 10, 2021
The code for paper "Contrastive Spatio-Temporal Pretext Learning for Self-supervised Video Representation" which is accepted by AAAI 2022

Contrastive Spatio Temporal Pretext Learning for Self-supervised Video Representation (AAAI 2022) The code for paper "Contrastive Spatio-Temporal Pret

8 Jun 30, 2022
Pytorch implementation of face attention network

Face Attention Network Pytorch implementation of face attention network as described in Face Attention Network: An Effective Face Detector for the Occ

Hooks 312 Dec 09, 2022
[ICLR 2022] Contact Points Discovery for Soft-Body Manipulations with Differentiable Physics

CPDeform Code and data for paper Contact Points Discovery for Soft-Body Manipulations with Differentiable Physics at ICLR 2022 (Spotlight). @InProceed

(Lester) Sizhe Li 29 Nov 29, 2022