PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation

Last update: Dec 30, 2022

Created by Charles R. Qi, Hao Su, Kaichun Mo, Leonidas J. Guibas from Stanford University.

Introduction

This work is based on our arXiv tech report, which will appear in CVPR 2017. We propose a novel deep network architecture for point clouds, treated as unordered point sets. You can also check our project webpage for a more detailed introduction.

A point cloud is an important type of geometric data structure. Because of its irregular format, most researchers transform such data into regular 3D voxel grids or collections of images. This, however, makes the data unnecessarily voluminous and introduces quantization artifacts. In this paper, we design a novel type of neural network that directly consumes point clouds and respects the permutation invariance of the points in the input. Our network, named PointNet, provides a unified architecture for applications ranging from object classification and part segmentation to scene semantic parsing. Though simple, PointNet is highly efficient and effective.
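To make the permutation-invariance idea concrete, here is a minimal NumPy sketch, not the released TensorFlow model. It applies the same small MLP to every point and then takes a max over the point axis, so the resulting global feature does not depend on point order. The layer sizes, random weights, and function names are assumptions chosen only for illustration.

```python
import numpy as np

def shared_mlp(points, weights, biases):
    """Apply the same MLP (with ReLU) to every point independently."""
    features = points
    for w, b in zip(weights, biases):
        features = np.maximum(features @ w + b, 0.0)
    return features

def global_feature(points, weights, biases):
    """Per-point features followed by a max over points (a symmetric function)."""
    return shared_mlp(points, weights, biases).max(axis=0)

rng = np.random.default_rng(0)
sizes = [3, 16, 32]  # hypothetical layer widths, for illustration only
weights = [rng.normal(size=(a, b)) for a, b in zip(sizes[:-1], sizes[1:])]
biases = [rng.normal(size=b) for b in sizes[1:]]

cloud = rng.normal(size=(128, 3))               # 128 points with xyz coordinates
shuffled = cloud[rng.permutation(len(cloud))]   # same points, different order

feat_a = global_feature(cloud, weights, biases)
feat_b = global_feature(shuffled, weights, biases)
order_invariant = np.allclose(feat_a, feat_b)
print(order_invariant)
```

Because the max is taken independently per feature channel, reordering the rows of the input changes nothing in the output, which is the property the paragraph above describes.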

In this repository, we release code and data for training a PointNet classification network on point clouds sampled from 3D shapes, as well as for training a part segmentation network on the ShapeNet Part dataset.

Citation

If you find our work useful in your research, please consider citing:

@article{qi2016pointnet,
  title={PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation},
  author={Qi, Charles R and Su, Hao and Mo, Kaichun and Guibas, Leonidas J},
  journal={arXiv preprint arXiv:1612.00593},
  year={2016}
}

Installation

Install TensorFlow. You may also need to install h5py. The code has been tested with Python 2.7, TensorFlow 1.0.1, CUDA 8.0 and cuDNN 5.1 on Ubuntu 14.04.

If you are using PyTorch, you can find a third-party PyTorch implementation here.

To install h5py for Python:

sudo apt-get install libhdf5-dev
sudo pip install h5py

Usage

To train a model to classify point clouds sampled from 3D shapes:

python train.py

Log files and network parameters will be saved to the log folder by default. Point clouds of ModelNet40 models in HDF5 files will be automatically downloaded (416MB) to the data folder. Each point cloud contains 2048 points uniformly sampled from a shape surface. Each cloud is zero-mean and normalized into a unit sphere. There are also text files in data/modelnet40_ply_hdf5_2048 specifying the ids of shapes in the h5 files.
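If you want to bring your own point clouds into the same form, the normalization described above can be sketched as follows. This is an illustrative NumPy helper, not code from this repository; the function name is hypothetical, and it assumes "normalized into a unit sphere" means scaling so the farthest point lies at distance 1 from the centroid.

```python
import numpy as np

def normalize_to_unit_sphere(points):
    """Center an (N, 3) cloud at the origin and scale it to fit the unit sphere."""
    centered = points - points.mean(axis=0)
    radius = np.linalg.norm(centered, axis=1).max()
    # Guard against a degenerate cloud where all points coincide.
    return centered / radius if radius > 0 else centered

rng = np.random.default_rng(0)
raw = rng.uniform(-5.0, 20.0, size=(2048, 3))   # stand-in for sampled surface points
cloud = normalize_to_unit_sphere(raw)
```

After this step the centroid is at the origin and the largest point norm is 1, matching the description of the provided data.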

To see HELP for the training script:

python train.py -h

We can use TensorBoard to view the network architecture and monitor the training progress.

tensorboard --logdir log

After the above training, we can evaluate the model and output some visualizations of the error cases.

python evaluate.py --visu

Point clouds that are wrongly classified will be saved to the dump folder by default. We visualize each point cloud by rendering it into three-view images.

If you'd like to prepare your own data, you can refer to some helper functions in utils/data_prep_util.py for saving and loading HDF5 files.
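As a rough idea of what saving and loading such files involves, here is a small sketch that uses h5py directly rather than the helpers in utils/data_prep_util.py. The function names, the dataset keys "data" and "label", and the dtypes are assumptions for illustration; check the helper functions and the downloaded files for the layout the training scripts actually expect.

```python
import os
import tempfile

import h5py
import numpy as np

def save_point_clouds(path, data, label):
    """Write point clouds (B, N, 3) and class labels (B,) to an HDF5 file."""
    with h5py.File(path, "w") as f:
        f.create_dataset("data", data=data.astype(np.float32), compression="gzip")
        f.create_dataset("label", data=label.astype(np.uint8), compression="gzip")

def load_point_clouds(path):
    """Read the point clouds and labels back as NumPy arrays."""
    with h5py.File(path, "r") as f:
        return f["data"][:], f["label"][:]

rng = np.random.default_rng(0)
clouds = rng.normal(size=(4, 2048, 3))   # 4 hypothetical shapes, 2048 points each
labels = np.array([0, 3, 3, 7])          # hypothetical class ids

with tempfile.TemporaryDirectory() as tmp:
    h5_path = os.path.join(tmp, "my_shapes.h5")
    save_point_clouds(h5_path, clouds, labels)
    loaded_clouds, loaded_labels = load_point_clouds(h5_path)
```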

Part Segmentation

To train a model for object part segmentation, first download the data:

cd part_seg
sh download_data.sh

The download script will fetch the ShapeNetPart dataset (around 1.08GB) and our prepared HDF5 files (around 346MB).

Then you can run train.py and test.py in the part_seg folder for training and testing (computing mIoU for evaluation).
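For readers unfamiliar with the metric, the sketch below shows one common way to compute mIoU for part segmentation: an IoU per part within each shape, averaged over parts and then over shapes. It is an illustration, not an excerpt from test.py. The function names are hypothetical, and counting a part that is absent from both prediction and ground truth as IoU 1 is an assumed convention.

```python
import numpy as np

def shape_iou(pred, target, part_ids):
    """Average IoU over the candidate parts of a single shape."""
    ious = []
    for part in part_ids:
        pred_mask = pred == part
        target_mask = target == part
        union = np.logical_or(pred_mask, target_mask).sum()
        if union == 0:
            # Part absent in both prediction and ground truth (assumed convention).
            ious.append(1.0)
        else:
            ious.append(np.logical_and(pred_mask, target_mask).sum() / union)
    return float(np.mean(ious))

def mean_iou(preds, targets, part_ids):
    """Mean of the per-shape IoUs over a collection of shapes."""
    return float(np.mean([shape_iou(p, t, part_ids) for p, t in zip(preds, targets)]))

# Tiny example: one shape with four points and three candidate part labels.
target = np.array([0, 0, 1, 1])
pred = np.array([0, 1, 1, 1])
example_iou = shape_iou(pred, target, part_ids=[0, 1, 2])
```

In this example part 0 has IoU 1/2, part 1 has IoU 2/3, and part 2 is absent from both arrays, so the shape IoU is the mean of those three values.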

License

Our code is released under the MIT License (see LICENSE file for details).

Selected Projects that Use PointNet

PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space by Qi et al. (NIPS 2017). A hierarchical feature learning framework on point clouds. The PointNet++ architecture applies PointNet recursively on a nested partitioning of the input point set. It also proposes novel layers for point clouds with non-uniform densities.
Exploring Spatial Context for 3D Semantic Segmentation of Point Clouds by Engelmann et al. (ICCV 2017 workshop). This work extends PointNet for large-scale scene segmentation.
PCPNET: Learning Local Shape Properties from Raw Point Clouds by Guerrero et al. (arXiv). This work adapts PointNet to estimate local geometric properties (e.g. normal and curvature) in noisy point clouds.
VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection by Zhou et al. from Apple (arXiv). This work studies 3D object detection using LiDAR point clouds. It splits space into voxels, uses PointNet to learn local voxel features, and then uses a 3D CNN for region proposal, object classification and 3D bounding box estimation.
Frustum PointNets for 3D Object Detection from RGB-D Data by Qi et al. (arXiv). A novel framework for 3D object detection with RGB-D data. The proposed method achieved first place on the KITTI 3D object detection benchmark on all categories (last checked on 11/30/2017).
