Attention-based Transformation from Latent Features to Point Clouds (AAAI 2022)

Last update: Nov 11, 2022

This repository contains a PyTorch implementation of the paper:

Attention-based Transformation from Latent Features to Point Clouds
Kaiyi Zhang, Ximing Yang, Yuan Wu, Cheng Jin
AAAI 2022

Introduction

In point cloud generation and completion, previous methods for transforming latent features to point clouds are generally based on fully connected layers (FC-based) or folding operations (folding-based). However, point clouds generated by FC-based methods usually suffer from outliers and rough surfaces. Folding-based methods have a large data flow, converge slowly, and struggle to generate non-smooth surfaces. In this work, we propose AXform, an attention-based method to transform latent features to point clouds. AXform first generates points in an interim space using a fully connected layer. These interim points are then aggregated to generate the target point cloud. AXform takes both parameter sharing and data flow into account, which gives it fewer outliers, fewer network parameters, and faster convergence. The points generated by AXform are not subject to a strong 2-manifold constraint, which improves the generation of non-smooth surfaces. When AXform is expanded to multiple branches for local generation, the centripetal constraint gives it the properties of self-clustering and space consistency, which further enables unsupervised semantic segmentation. We also adopt this scheme and design AXformNet for point cloud completion. Extensive experiments on different datasets show that our methods achieve state-of-the-art results.
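The two-step idea described above (a fully connected layer producing interim points, then attention-weighted aggregation into the target points) can be illustrated with a minimal PyTorch sketch. This is not the repository's implementation: the class name AXformSketch, the layer sizes, and the exact scoring network are assumptions chosen for readability.

```python
import torch
import torch.nn as nn


class AXformSketch(nn.Module):
    """Illustrative latent-to-point-cloud mapping (hypothetical, not the repo's code)."""

    def __init__(self, latent_dim=128, interim_points=128, interim_dim=32, out_points=2048):
        super().__init__()
        self.interim_points = interim_points
        self.interim_dim = interim_dim
        # Step 1: a fully connected layer maps the latent vector to N points
        # in a K-dimensional interim space.
        self.to_interim = nn.Linear(latent_dim, interim_points * interim_dim)
        # Step 2: a shared per-point network (1x1 convolutions) scores every
        # interim point against each of the M output points.
        self.score = nn.Sequential(
            nn.Conv1d(interim_dim, 2 * interim_dim, kernel_size=1),
            nn.ReLU(),
            nn.Conv1d(2 * interim_dim, out_points, kernel_size=1),
        )
        # Step 3: a shared linear map projects aggregated points to xyz.
        self.to_xyz = nn.Linear(interim_dim, 3)

    def forward(self, z):
        batch = z.shape[0]
        # (B, N, K) interim points
        interim = self.to_interim(z).view(batch, self.interim_points, self.interim_dim)
        # Conv1d expects (B, K, N); the result is (B, M, N)
        scores = self.score(interim.transpose(1, 2))
        # For each output point, the weights over the N interim points sum to 1
        weights = torch.softmax(scores, dim=2)
        # Weighted sum of interim points: (B, M, N) x (B, N, K) -> (B, M, K)
        aggregated = torch.bmm(weights, interim)
        return self.to_xyz(aggregated)  # (B, M, 3)
```

Because every output point is a convex combination of the same interim points, the scoring and projection layers are shared across all points, which is one way to read the parameter-sharing claim above. The network actually used is the one in axform.py.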

Dependencies

Python 3.6
CUDA 10.0
G++ or GCC 7.5
PyTorch (the code is tested with version 1.6.0)
(Optional) Visdom for visualization of the training process

Build and install the following CUDA-based tools. Each cd path below is relative to the repository root.

cd utils/furthestPointSampling
python3 setup.py install

# https://github.com/stevenygd/PointFlow/tree/master/metrics
cd utils/metrics/pytorch_structural_losses
make

# https://github.com/sshaoshuai/Pointnet2.PyTorch
cd utils/Pointnet2.PyTorch/pointnet2
python3 setup.py install

# https://github.com/daerduoCarey/PyTorchEMD
cd utils/PyTorchEMD
python3 setup.py install

# not used
cd utils/randPartial
python3 setup.py install

Datasets

The PCN dataset (Google Drive) is used for point cloud completion.

ShapeNetCore.v2.PC2048 (Google Drive) is used for the other tasks. The point clouds are uniformly sampled from the meshes in the ShapeNetCore dataset (version 2). All the point clouds are centered and scaled to [-0.5, 0.5]. We follow the official split. The sample code based on PyTorch3D can be found in utils/sample_pytorch3d.py.
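As a rough illustration of what "centered and scaled to [-0.5, 0.5]" can mean, the sketch below centers a cloud on its bounding box and divides by the largest extent. This convention is an assumption; the exact procedure used for the dataset is the one in utils/sample_pytorch3d.py. The function name normalize_point_cloud is hypothetical, and plain Python lists are used so the snippet runs without dependencies.

```python
def normalize_point_cloud(points):
    """Center a point cloud on its bounding box and scale it into [-0.5, 0.5]^3."""
    mins = [min(p[axis] for p in points) for axis in range(3)]
    maxs = [max(p[axis] for p in points) for axis in range(3)]
    center = [(lo + hi) / 2.0 for lo, hi in zip(mins, maxs)]
    # A single uniform scale preserves the aspect ratio of the shape.
    extent = max(hi - lo for lo, hi in zip(mins, maxs))
    if extent == 0.0:
        extent = 1.0  # degenerate cloud (all points identical): only center it
    return [
        tuple((p[axis] - center[axis]) / extent for axis in range(3))
        for p in points
    ]


cloud = [(0.0, 0.0, 0.0), (2.0, 1.0, 0.0), (4.0, 2.0, 1.0)]
normalized = normalize_point_cloud(cloud)
print(normalized[0])  # (-0.5, -0.25, -0.125)
```

With this convention only the longest axis spans the full [-0.5, 0.5] range; the shorter axes stay proportionally smaller.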

Please download them to the data directory.

Training

All the arguments, e.g. gpu_ids, mode, method, hparas, num_branch, class_choice, visual, can be adjusted before training. For example:

# axform, airplane category, 16 branches
python3 axform.py --mode train --num_branch 16 --class_choice ['airplane']

# fc-based, car category
python3 models/fc_folding.py --mode train --method fc-based --class_choice ['car']

# l-gan, airplane category, without axform
python3 models/latent_3d_points/l-gan.py --mode train --method original --class_choice ['airplane'] --ae_ckpt_path path_to_ckpt_autoencoder.pth

# axformnet, all categories, integrated
python3 axformnet.py --mode train --method integrated --class_choice None
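How the scripts parse these flags is not shown here. As a sketch only, assuming a list-valued flag such as --class_choice is interpreted as a Python literal, the parsing could look like the following. The helper literal_or_string, the function build_parser, and the defaults are hypothetical; only the flag names come from the commands above.

```python
import argparse
import ast


def literal_or_string(text):
    """Parse a flag value as a Python literal, falling back to the raw string."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return text


def build_parser():
    # Hypothetical parser; the real scripts may define these flags differently.
    parser = argparse.ArgumentParser(description="Hypothetical training flags")
    parser.add_argument("--mode", choices=["train", "test"], default="train")
    parser.add_argument("--num_branch", type=int, default=16)
    parser.add_argument("--class_choice", type=literal_or_string, default=None)
    return parser


args = build_parser().parse_args(
    ["--mode", "train", "--num_branch", "16", "--class_choice", "['airplane']"]
)
print(args.class_choice)  # ['airplane']
```

Under this assumption, shell quoting matters: a POSIX shell strips the inner quotes from an unquoted ['airplane'], so the script receives the text [airplane]. Wrapping the whole value in double quotes, as in --class_choice "['airplane']", passes it through unchanged.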

Pre-trained models

Here we provide pre-trained models (Google Drive) for point cloud completion. The following is the suggested way to evaluate the performance of the pre-trained models.

# vanilla
python3 axformnet.py --mode test --method vanilla --ckpt_path path_to_ckpt_vanilla.pth

# integrated
python3 axformnet.py --mode test --method integrated --ckpt_path path_to_ckpt_integrated.pth

Visualization

Matplotlib is used to visualize the results in the paper. Reference code is available in utils/draw.py.

We recommend using Mitsuba 2 for visualization. Example code can be found in Point Cloud Renderer.

Citation

Please cite our work if you find it useful:

@article{zhang2021axform,
 title={Attention-based Transformation from Latent Features to Point Clouds},
 author={Zhang, Kaiyi and Yang, Ximing and Wu, Yuan and Jin, Cheng},
 journal={arXiv preprint arXiv:2112.05324},
 year={2021}
}

License

This project's code is released under the MIT License (refer to the LICENSE file for details).
