Implementation of the "Point 4D Transformer Networks for Spatio-Temporal Modeling in Point Cloud Videos" paper.

Last update: Dec 29, 2022

Point 4D Transformer Networks for Spatio-Temporal Modeling in Point Cloud Videos

Introduction

Point cloud videos are irregular and unordered along the spatial dimension: points emerge inconsistently across frames. To capture the dynamics in point cloud videos, point tracking is usually employed. However, because points may flow in and out across frames, computing accurate point trajectories is extremely difficult. Moreover, tracking usually relies on point colors and thus may fail on colorless point clouds. In this paper, to avoid point tracking, we propose a novel Point 4D Transformer (P4Transformer) network to model raw point cloud videos. Specifically, P4Transformer consists of (i) a point 4D convolution that embeds the spatio-temporal local structures present in a point cloud video and (ii) a transformer that captures appearance and motion information across the entire video by performing self-attention on the embedded local features. In this way, related or similar local areas are merged by attention weights rather than by explicit tracking.
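To illustrate how self-attention can merge similar local areas without tracking, here is a minimal pure-Python sketch of single-head scaled dot-product self-attention. It is not the code from this repository: it assumes identity query/key/value projections (a real transformer learns them and uses multiple heads), and the names `self_attention` and `features` are hypothetical. Each feature vector stands in for one embedded spatio-temporal local region.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(features):
    """Single-head scaled dot-product self-attention.

    Assumption: identity projections for queries, keys and values,
    so this only shows how attention weights mix the features.
    """
    dim = len(features[0])
    scale = 1.0 / math.sqrt(dim)
    outputs, weights = [], []
    for query in features:
        # Similarity of this region to every region in the video.
        scores = [scale * sum(q * k for q, k in zip(query, key)) for key in features]
        attn = softmax(scores)
        weights.append(attn)
        # Weighted sum of all region features.
        outputs.append(
            [sum(a * value[d] for a, value in zip(attn, features)) for d in range(dim)]
        )
    return outputs, weights


# Three toy region features: the first two are similar, the third is not.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
outputs, weights = self_attention(features)
```

In this toy input, the first region attends more strongly to the second (similar) region than to the third, which is the sense in which similar areas are merged by attention weights instead of by following points from frame to frame.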

Installation

The code is tested with Red Hat Enterprise Linux Workstation release 7.7 (Maipo), g++ (GCC) 8.3.1, PyTorch (both v1.4.0 and v1.8.1 are supported), CUDA 10.2 and cuDNN v7.6.

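Compile the CUDA layers for PointNet++, which are used for furthest point sampling (FPS) and radius neighbor search. Two versions of the layers are provided, modules-pytorch-1.4.0 and modules-pytorch-1.8.1; rename the one that matches your PyTorch version to modules. For example, for PyTorch v1.8.1:

mv modules-pytorch-1.8.1 modules
cd modules
python setup.py install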
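For readers unfamiliar with the two operations the compiled layers provide, the following is a slow, pure-Python reference sketch of furthest point sampling and radius neighbor search. It is an illustration only, not the CUDA implementation; the function names, the `start_index` argument and the `max_neighbors` cap are assumptions chosen for this example.

```python
import math


def farthest_point_sampling(points, num_samples, start_index=0):
    """Greedy FPS: repeatedly pick the point farthest from those already chosen."""
    n = len(points)
    if not 0 < num_samples <= n:
        raise ValueError("num_samples must be between 1 and len(points)")
    selected = [start_index]
    # min_dist[i] is the distance from point i to its nearest selected point.
    min_dist = [math.dist(p, points[start_index]) for p in points]
    while len(selected) < num_samples:
        next_index = max(range(n), key=min_dist.__getitem__)
        selected.append(next_index)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[next_index]))
    return selected


def radius_neighbors(points, center, radius, max_neighbors):
    """Indices of at most max_neighbors points within radius of center."""
    hits = [i for i, p in enumerate(points) if math.dist(p, center) <= radius]
    return hits[:max_neighbors]


points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.1, 0.0, 0.0),
          (5.0, 0.0, 0.0), (5.2, 0.0, 0.0)]
anchors = farthest_point_sampling(points, num_samples=3)   # [0, 4, 1]
group = radius_neighbors(points, points[4], radius=0.5, max_neighbors=8)  # [3, 4]
```

FPS spreads the sampled anchor points over the cloud, and the radius search then gathers the local neighborhood around each anchor. This sketch runs in O(N * num_samples) time on the CPU and is only practical for small inputs.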

Citation

If you find our work useful in your research, please consider citing:

@inproceedings{fan21p4transformer,
  author    = {Hehe Fan and
               Yi Yang and
               Mohan Kankanhalli},
  title     = {Point 4D Transformer Networks for Spatio-Temporal Modeling in Point Cloud Videos},
  booktitle = {{IEEE/CVF} Conference on Computer Vision and Pattern Recognition, {CVPR}},
  year      = {2021}
}

Related Repos

PointNet++ PyTorch implementation: https://github.com/facebookresearch/votenet/tree/master/pointnet2
MeteorNet: https://github.com/xingyul/meteornet
3DV: https://github.com/3huo/3DV-Action
PSTNet: https://github.com/hehefan/Point-Spatio-Temporal-Convolution
Transformer: https://github.com/lucidrains/vit-pytorch
PointRNN (TensorFlow implementation): https://github.com/hehefan/PointRNN
PointRNN (PyTorch implementation): https://github.com/hehefan/PointRNN-PyTorch

Owner

Hehe Fan
