A general python framework for single object tracking in LiDAR point clouds, based on PyTorch Lightning.

Overview

Open3DSOT

A general python framework for single object tracking in LiDAR point clouds, based on PyTorch Lightning.

The official code release of BAT and MM Track.

Features

  • Modular design. It is easy to config the model and training/testing behaviors through just a .yaml file.
  • DDP support for both training and testing.
  • Support all common tracking datasets (KITTI, NuScenes, Waymo Open Dataset).

📣 One tracking paper is accepted by CVPR2022 (Oral)! 👇

Trackers

This repository includes the implementation of the following models:

MM-Track (CVPR2022 Oral)

[Paper] [Project Page]

MM-Track is the first motion-centric tracker in LiDAR SOT, which robustly handles distractors and drastic appearance changes in complex driving scenes. Unlike previous methods, MM-Track is a matching-free two-stage tracker which localizes the targets by explicitly modeling the "relative target motion" among frames.

BAT (ICCV2021)

[Paper] [Results]

Official implementation of BAT. BAT uses the BBox information to compensate the information loss of incomplete scans. It augments the target template with box-aware features that efficiently and effectively improve appearance matching.

P2B (CVPR2020)

[Paper] [Official implementation]

Third party implementation of P2B. Our implementation achieves better results than the official code release. P2B adapts SiamRPN to 3D point clouds by integrating a pointwise correlation operator with a point-based RPN (VoteNet).

Setup

Installation

  • Create the environment

    git clone https://github.com/Ghostish/Open3DSOT.git
    cd Open3DSOT
    conda create -n Open3DSOT  python=3.6
    conda activate Open3DSOT
    
  • Install pytorch

    conda install pytorch==1.4.0 torchvision==0.5.0 cudatoolkit=10.1 -c pytorch
    

    Our code is well tested with pytorch 1.4.0 and CUDA 10.1. But other platforms may also work. Follow this to install another version of pytorch. Note: In order to reproduce the reported results with the provided checkpoints, please use CUDA 10.x.

  • Install other dependencies:

    pip install -r requirement.txt
    

    Install the nuscenes-devkit if you use want to use NuScenes dataset:

    pip install nuscenes-devkit
    

KITTI dataset

  • Download the data for velodyne, calib and label_02 from KITTI Tracking.
  • Unzip the downloaded files.
  • Put the unzipped files under the same folder as following.
    [Parent Folder]
    --> [calib]
        --> {0000-0020}.txt
    --> [label_02]
        --> {0000-0020}.txt
    --> [velodyne]
        --> [0000-0020] folders with velodynes .bin files
    

NuScenes dataset

  • Download the dataset from the download page
  • Extract the downloaded files and make sure you have the following structure:
    [Parent Folder]
      samples	-	Sensor data for keyframes.
      sweeps	-	Sensor data for intermediate frames.
      maps	        -	Folder for all map files: rasterized .png images and vectorized .json files.
      v1.0-*	-	JSON tables that include all the meta data and annotations. Each split (trainval, test, mini) is provided in a separate folder.
    

Note: We use the train_track split to train our model and test it with the val split. Both splits are officially provided by NuScenes. During testing, we ignore the sequences where there is no point in the first given bbox.

Waymo dataset

  • Download and prepare dataset by the instruction of CenterPoint.
    [Parent Folder]
      tfrecord_training	                    
      tfrecord_validation	                 
      train 	                                    -	all training frames and annotations 
      val   	                                    -	all validation frames and annotations 
      infos_train_01sweeps_filter_zero_gt.pkl
      infos_val_01sweeps_filter_zero_gt.pkl
    
  • Prepare SOT dataset. Data from specific category and split will be merged (e.g., sot_infos_vehicle_train.pkl).
  python datasets/generate_waymo_sot.py

Quick Start

Training

To train a model, you must specify the .yaml file with --cfg argument. The .yaml file contains all the configurations of the dataset and the model. Currently, we provide four .yaml files under the cfgs directory. Note: Before running the code, you will need to edit the .yaml file by setting the path argument as the correct root of the dataset.

python main.py --gpu 0 1 --cfg cfgs/BAT_Car.yaml  --batch_size 50 --epoch 60 --preloading

After you start training, you can start Tensorboard to monitor the training process:

tensorboard --logdir=./ --port=6006

By default, the trainer runs a full evaluation on the full test split after training every epoch. You can set --check_val_every_n_epoch to a larger number to speed up the training. The --preloading flag is used to preload the training samples into the memory to save traning time. Remove this flag if you don't have enough memory.

Testing

To test a trained model, specify the checkpoint location with --checkpoint argument and send the --test flag to the command.

python main.py --gpu 0 1 --cfg cfgs/BAT_Car.yaml  --checkpoint /path/to/checkpoint/xxx.ckpt --test

Reproduction

Model Category Success Precision Checkpoint
BAT-KITTI Car 65.37 78.88 pretrained_models/bat_kitti_car.ckpt
BAT-NuScenes Car 40.73 43.29 pretrained_models/bat_nuscenes_car.ckpt
BAT-KITTI Pedestrian 45.74 74.53 pretrained_models/bat_kitti_pedestrian.ckpt

Three trained BAT models for KITTI and NuScenes datasets are provided in the pretrained_models directory. To reproduce the results, simply run the code with the corresponding .yaml file and checkpoint. For example, to reproduce the tracking results on KITTI Car, just run:

python main.py --gpu 0 1 --cfg cfgs/BAT_Car.yaml  --checkpoint ./pretrained_models/bat_kitti_car.ckpt --test

Acknowledgment

  • This repo is built upon P2B and SC3D.
  • Thank Erik Wijmans for his pytorch implementation of PointNet++

License

This repository is released under MIT License (see LICENSE file for details).

Owner
Kangel Zenn
Ph.D. Student in CUHKSZ.
Kangel Zenn
Neuron class provides LNU (Linear Neural Unit), QNU (Quadratic Neural Unit), RBF (Radial Basis Function), MLP (Multi Layer Perceptron), MLP-ELM (Multi Layer Perceptron - Extreme Learning Machine) neurons learned with Gradient descent or LeLevenberg–Marquardt algorithm

Neuron class provides LNU (Linear Neural Unit), QNU (Quadratic Neural Unit), RBF (Radial Basis Function), MLP (Multi Layer Perceptron), MLP-ELM (Multi Layer Perceptron - Extreme Learning Machine) neu

Filip Molcik 38 Dec 17, 2022
A denoising diffusion probabilistic model synthesises galaxies that are qualitatively and physically indistinguishable from the real thing.

Realistic galaxy simulation via score-based generative models Official code for 'Realistic galaxy simulation via score-based generative models'. We us

Michael Smith 32 Dec 20, 2022
Official implementation of UTNet: A Hybrid Transformer Architecture for Medical Image Segmentation

UTNet (Accepted at MICCAI 2021) Official implementation of UTNet: A Hybrid Transformer Architecture for Medical Image Segmentation Introduction Transf

110 Jan 01, 2023
Monocular 3D pose estimation. OpenVINO. CPU inference or iGPU (OpenCL) inference.

human-pose-estimation-3d-python-cpp RealSenseD435 (RGB) 480x640 + CPU Corei9 45 FPS (Depth is not used) 1. Run 1-1. RealSenseD435 (RGB) 480x640 + CPU

Katsuya Hyodo 8 Oct 03, 2022
Bayesian optimisation library developped by Huawei Noah's Ark Library

Bayesian Optimisation Research This directory contains official implementations for Bayesian optimisation works developped by Huawei R&D, Noah's Ark L

HUAWEI Noah's Ark Lab 395 Dec 30, 2022
A trashy useless Latin programming language written in python.

Codigum! The first programming langage in latin! (please keep your eyes closed when if you read the source code) It is pretty useless though. Document

Bic 2 Oct 25, 2021
This is the official code for the paper "Ad2Attack: Adaptive Adversarial Attack for Real-Time UAV Tracking".

Ad^2Attack:Adaptive Adversarial Attack on Real-Time UAV Tracking Demo video 📹 Our video on bilibili demonstrates the test results of Ad^2Attack on se

Intelligent Vision for Robotics in Complex Environment 10 Nov 07, 2022
The code written during my Bachelor Thesis "Classification of Human Whole-Body Motion using Hidden Markov Models".

This code was written during the course of my Bachelor thesis Classification of Human Whole-Body Motion using Hidden Markov Models. Some things might

Matthias Plappert 14 Dec 06, 2022
Code for the paper A Theoretical Analysis of the Repetition Problem in Text Generation

A Theoretical Analysis of the Repetition Problem in Text Generation This repository share the code for the paper "A Theoretical Analysis of the Repeti

Zihao Fu 37 Nov 21, 2022
Deduplicating Training Data Makes Language Models Better

Deduplicating Training Data Makes Language Models Better This repository contains code to deduplicate language model datasets as descrbed in the paper

Google Research 431 Dec 27, 2022
Theano is a Python library that allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. It can use GPUs and perform efficient symbolic differentiation.

============================================================================================================ `MILA will stop developing Theano https:

9.6k Dec 31, 2022
Use evolutionary algorithms instead of gridsearch in scikit-learn

sklearn-deap Use evolutionary algorithms instead of gridsearch in scikit-learn. This allows you to reduce the time required to find the best parameter

rsteca 709 Jan 03, 2023
Fast image augmentation library and an easy-to-use wrapper around other libraries

Albumentations Albumentations is a Python library for image augmentation. Image augmentation is used in deep learning and computer vision tasks to inc

11.4k Jan 09, 2023
Code for unmixing audio signals in four different stems "drums, bass, vocals, others". The code is adapted from "Jukebox: A Generative Model for Music"

Status: Archive (code is provided as-is, no updates expected) Disclaimer This code is a based on "Jukebox: A Generative Model for Music" Paper We adju

Wadhah Zai El Amri 24 Dec 29, 2022
Official PyTorch implementation of SyntaSpeech (IJCAI 2022)

SyntaSpeech: Syntax-Aware Generative Adversarial Text-to-Speech | | | | 中文文档 This repository is the official PyTorch implementation of our IJCAI-2022

Zhenhui YE 116 Nov 24, 2022
A tensorflow/keras implementation of StyleGAN to generate images of new Pokemon.

PokeGAN A tensorflow/keras implementation of StyleGAN to generate images of new Pokemon. Dataset The model has been trained on dataset that includes 8

19 Jul 26, 2022
Simple Pixelbot for Diablo 2 Resurrected written in python and opencv.

Simple Pixelbot for Diablo 2 Resurrected written in python and opencv. Obviously only use it in offline mode as it is against the TOS of Blizzard to use it in online mode!

468 Jan 03, 2023
Easy genetic ancestry predictions in Python

ezancestry Easily visualize your direct-to-consumer genetics next to 2500+ samples from the 1000 genomes project. Evaluate the performance of a custom

Kevin Arvai 38 Jan 02, 2023
Face Mask Detection is a project to determine whether someone is wearing mask or not, using deep neural network.

face-mask-detection Face Mask Detection is a project to determine whether someone is wearing mask or not, using deep neural network. It contains 3 scr

amirsalar 13 Jan 18, 2022
Dataset para entrenamiento de yoloV3 para 4 clases

Deteccion de objetos en video Este repo basado en el proyecto PyTorch YOLOv3 para correr detección de objetos sobre video. Construí sobre este proyect

1 Nov 01, 2021