PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation

Overview

PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation

Created by Charles R. Qi, Hao Su, Kaichun Mo, Leonidas J. Guibas from Stanford University.

prediction example

Introduction

This work is based on our arXiv tech report, which is going to appear in CVPR 2017. We proposed a novel deep net architecture for point clouds (as unordered point sets). You can also check our project webpage for a deeper introduction.

Point cloud is an important type of geometric data structure. Due to its irregular format, most researchers transform such data to regular 3D voxel grids or collections of images. This, however, renders data unnecessarily voluminous and causes issues. In this paper, we design a novel type of neural network that directly consumes point clouds, which well respects the permutation invariance of points in the input. Our network, named PointNet, provides a unified architecture for applications ranging from object classification, part segmentation, to scene semantic parsing. Though simple, PointNet is highly efficient and effective.

In this repository, we release code and data for training a PointNet classification network on point clouds sampled from 3D shapes, as well as for training a part segmentation network on ShapeNet Part dataset.

Citation

If you find our work useful in your research, please consider citing:

@article{qi2016pointnet,
  title={PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation},
  author={Qi, Charles R and Su, Hao and Mo, Kaichun and Guibas, Leonidas J},
  journal={arXiv preprint arXiv:1612.00593},
  year={2016}
}

Installation

Install TensorFlow. You may also need to install h5py. The code has been tested with Python 2.7, TensorFlow 1.0.1, CUDA 8.0 and cuDNN 5.1 on Ubuntu 14.04.

If you are using PyTorch, you can find a third-party pytorch implementation here.

To install h5py for Python:

sudo apt-get install libhdf5-dev
sudo pip install h5py

Usage

To train a model to classify point clouds sampled from 3D shapes:

python train.py

Log files and network parameters will be saved to log folder in default. Point clouds of ModelNet40 models in HDF5 files will be automatically downloaded (416MB) to the data folder. Each point cloud contains 2048 points uniformly sampled from a shape surface. Each cloud is zero-mean and normalized into an unit sphere. There are also text files in data/modelnet40_ply_hdf5_2048 specifying the ids of shapes in h5 files.

To see HELP for the training script:

python train.py -h

We can use TensorBoard to view the network architecture and monitor the training progress.

tensorboard --logdir log

After the above training, we can evaluate the model and output some visualizations of the error cases.

python evaluate.py --visu

Point clouds that are wrongly classified will be saved to dump folder in default. We visualize the point cloud by rendering it into three-view images.

If you'd like to prepare your own data, you can refer to some helper functions in utils/data_prep_util.py for saving and loading HDF5 files.

Part Segmentation

To train a model for object part segmentation, firstly download the data:

cd part_seg
sh download_data.sh

The downloading script will download ShapeNetPart dataset (around 1.08GB) and our prepared HDF5 files (around 346MB).

Then you can run train.py and test.py in the part_seg folder for training and testing (computing mIoU for evaluation).

License

Our code is released under MIT License (see LICENSE file for details).

Selected Projects that Use PointNet

Owner
Charles R. Qi
AI Researcher. PhD from Stanford University. Focus: deep learning, computer vision and 3D.
Charles R. Qi
Hierarchical User Intent Graph Network for Multimedia Recommendation

Hierarchical User Intent Graph Network for Multimedia Recommendation This is our Pytorch implementation for the paper: Hierarchical User Intent Graph

6 Jan 05, 2023
PyTorch implementation for paper StARformer: Transformer with State-Action-Reward Representations.

StARformer This repository contains the PyTorch implementation for our paper titled StARformer: Transformer with State-Action-Reward Representations.

Jinghuan Shang 14 Dec 09, 2022
Bootstrapped Representation Learning on Graphs

Bootstrapped Representation Learning on Graphs This is the PyTorch implementation of BGRL Bootstrapped Representation Learning on Graphs The main scri

NerDS Lab :: Neural Data Science Lab 55 Jan 07, 2023
Spontaneous Facial Micro Expression Recognition using 3D Spatio-Temporal Convolutional Neural Networks

Spontaneous Facial Micro Expression Recognition using 3D Spatio-Temporal Convolutional Neural Networks Abstract Facial expression recognition in video

Bogireddy Sai Prasanna Teja Reddy 103 Dec 29, 2022
Explainability for Vision Transformers (in PyTorch)

Explainability for Vision Transformers (in PyTorch) This repository implements methods for explainability in Vision Transformers

Jacob Gildenblat 442 Jan 04, 2023
ICCV2021 - Mining Contextual Information Beyond Image for Semantic Segmentation

Introduction The official repository for "Mining Contextual Information Beyond Image for Semantic Segmentation". Our full code has been merged into ss

55 Nov 09, 2022
Code for a real-time distributed cooperative slam(RDC-SLAM) system for ROS compatible platforms.

RDC-SLAM This repository contains code for a real-time distributed cooperative slam(RDC-SLAM) system for ROS compatible platforms. The system takes in

40 Nov 19, 2022
Pyramid Scene Parsing Network, CVPR2017.

Pyramid Scene Parsing Network by Hengshuang Zhao, Jianping Shi, Xiaojuan Qi, Xiaogang Wang, Jiaya Jia, details are in project page. Introduction This

Hengshuang Zhao 1.5k Jan 05, 2023
🌊 Online machine learning in Python

In a nutshell River is a Python library for online machine learning. It is the result of a merger between creme and scikit-multiflow. River's ambition

OnlineML 4k Jan 02, 2023
SwinTrack: A Simple and Strong Baseline for Transformer Tracking

SwinTrack This is the official repo for SwinTrack. A Simple and Strong Baseline Prerequisites Environment conda (recommended) conda create -y -n SwinT

LitingLin 196 Jan 04, 2023
Official implementation of VQ-Diffusion

Official implementation of VQ-Diffusion: Vector Quantized Diffusion Model for Text-to-Image Synthesis

Microsoft 592 Jan 03, 2023
True Few-Shot Learning with Language Models

This codebase supports using language models (LMs) for true few-shot learning: learning to perform a task using a limited number of examples from a single task distribution.

Ethan Perez 124 Jan 04, 2023
All public open-source implementations of convnets benchmarks

convnet-benchmarks Easy benchmarking of all public open-source implementations of convnets. A summary is provided in the section below. Machine: 6-cor

Soumith Chintala 2.7k Dec 30, 2022
ONNX-GLPDepth - Python scripts for performing monocular depth estimation using the GLPDepth model in ONNX

ONNX-GLPDepth - Python scripts for performing monocular depth estimation using the GLPDepth model in ONNX

Ibai Gorordo 18 Nov 06, 2022
Detecting and Tracking Small and Dense Moving Objects in Satellite Videos: A Benchmark

This dataset is a large-scale dataset for moving object detection and tracking in satellite videos, which consists of 40 satellite videos captured by Jilin-1 satellite platforms.

Qingyong 87 Dec 22, 2022
Data and code for ICCV 2021 paper Distant Supervision for Scene Graph Generation.

Distant Supervision for Scene Graph Generation Data and code for ICCV 2021 paper Distant Supervision for Scene Graph Generation. Introduction The pape

THUNLP 23 Dec 31, 2022
ESL: Event-based Structured Light

ESL: Event-based Structured Light Video (click on the image) This is the code for the 2021 3DV paper ESL: Event-based Structured Light by Manasi Mugli

Robotics and Perception Group 29 Oct 24, 2022
Run PowerShell command without invoking powershell.exe

PowerLessShell PowerLessShell rely on MSBuild.exe to remotely execute PowerShell scripts and commands without spawning powershell.exe. You can also ex

Mr.Un1k0d3r 1.2k Jan 03, 2023
Random Erasing Data Augmentation. Experiments on CIFAR10, CIFAR100 and Fashion-MNIST

Random Erasing Data Augmentation =============================================================== black white random This code has the source code for

Zhun Zhong 654 Dec 26, 2022
Generalized Proximal Policy Optimization with Sample Reuse (GePPO)

Generalized Proximal Policy Optimization with Sample Reuse This repository is the official implementation of the reinforcement learning algorithm Gene

Jimmy Queeney 9 Nov 28, 2022