Kernel Point Convolutions

Related tags

Deep LearningKPConv
Overview

Intro figure

Created by Hugues THOMAS

Introduction

Update 27/04/2020: New PyTorch implementation available. With SemanticKitti, and Windows supported.

This repository contains the implementation of Kernel Point Convolution (KPConv), a point convolution operator presented in our ICCV2019 paper (arXiv). If you find our work useful in your research, please consider citing:

@article{thomas2019KPConv,
    Author = {Thomas, Hugues and Qi, Charles R. and Deschaud, Jean-Emmanuel and Marcotegui, Beatriz and Goulette, Fran{\c{c}}ois and Guibas, Leonidas J.},
    Title = {KPConv: Flexible and Deformable Convolution for Point Clouds},
    Journal = {Proceedings of the IEEE International Conference on Computer Vision},
    Year = {2019}
}

Update 03/05/2019, bug found with TF 1.13 and CUDA 10. We found an internal bug inside tf.matmul operation. It returns absurd values like 1e12, leading to the apparition of NaNs in our network. We advise to use the code with CUDA 9.0 and TF 1.12. More info in issue #15

SemanticKitti Code: You can download the code used for SemanticKitti submission here. It is not clean, has very few explanations, and and could be buggy. Use it only if you are familiar with KPConv implementation.

Installation

A step-by-step installation guide for Ubuntu 16.04 is provided in INSTALL.md. Windows is currently not supported as the code uses tensorflow custom operations.

Experiments

We provide scripts for many experiments. The instructions to run these experiments are in the doc folder.

  • Object Classification: Instructions to train KP-CNN on an object classification task (Modelnet40).

  • Object Segmentation: Instructions to train KP-FCNN on an object segmentation task (ShapeNetPart)

  • Scene Segmentation: Instructions to train KP-FCNN on several scene segmentation tasks (S3DIS, Scannet, Semantic3D, NPM3D).

  • New Dataset: Instructions to train KPConv networks on your own data.

  • Pretrained models: We provide pretrained weights and instructions to load them.

  • Visualization scripts: Instructions to use the three scripts allowing to visualize: the learned features, the kernel deformations and the Effective Receptive Fields.

Performances

The following tables report the current performances on different tasks and datasets. Some scores have been improved since the article submission.

Classification and segmentation of 3D shapes

Method ModelNet40 OA ShapeNetPart classes mIoU ShapeNetPart instances mIoU
KPConv rigid 92.9% 85.0% 86.2%
KPConv deform 92.7% 85.1% 86.4%

Segmentation of 3D scenes

Method Scannet mIoU Sem3D mIoU S3DIS mIoU NPM3D mIoU
KPConv rigid 68.6% 74.6% 65.4% 72.3%
KPConv deform 68.4% 73.1% 67.1% 82.0%

Acknowledgment

Our code uses the nanoflann library.

License

Our code is released under MIT License (see LICENSE file for details).

Updates

  • 17/02/2020: Added a link to SemanticKitti code
  • 24/01/2020: Bug fixes
  • 01/10/2019: Adding visualization scripts.
  • 23/09/2019: Adding pretrained models for NPM3D and S3DIS datasets.
  • 03/05/2019: Bug found with TF 1.13 and CUDA 10.
  • 19/04/2019: Initial release.
Owner
Hugues THOMAS
AI/robotics Researcher. Postdoc at University of Toronto. Focus: Deep Learning and 3D Point clouds. Indoor navigation
Hugues THOMAS
An example of Scatterbrain implementation (combining local attention and Performer)

An example of Scatterbrain implementation (combining local attention and Performer)

HazyResearch 97 Jan 02, 2023
BasicRL: easy and fundamental codes for deep reinforcement learning。It is an improvement on rainbow-is-all-you-need and OpenAI Spinning Up.

BasicRL: easy and fundamental codes for deep reinforcement learning BasicRL is an improvement on rainbow-is-all-you-need and OpenAI Spinning Up. It is

RayYoh 12 Apr 28, 2022
We simulate traveling back in time with a modern camera to rephotograph famous historical subjects.

[SIGGRAPH Asia 2021] Time-Travel Rephotography [Project Website] Many historical people were only ever captured by old, faded, black and white photos,

298 Jan 02, 2023
Language Models Can See: Plugging Visual Controls in Text Generation

Language Models Can See: Plugging Visual Controls in Text Generation Authors: Yixuan Su, Tian Lan, Yahui Liu, Fangyu Liu, Dani Yogatama, Yan Wang, Lin

Yixuan Su 195 Dec 22, 2022
A Gura parser implementation for Python

Gura Python parser This repository contains the implementation of a Gura (compliant with version 1.0.0) format parser in Python. Installation pip inst

Gura Config Lang 19 Jan 25, 2022
Implementation of the HMAX model of vision in PyTorch

PyTorch implementation of HMAX PyTorch implementation of the HMAX model that closely follows that of the MATLAB implementation of The Laboratory for C

Marijn van Vliet 52 Oct 13, 2022
The implementation of "Optimizing Shoulder to Shoulder: A Coordinated Sub-Band Fusion Model for Real-Time Full-Band Speech Enhancement"

SF-Net for fullband SE This is the repo of the manuscript "Optimizing Shoulder to Shoulder: A Coordinated Sub-Band Fusion Model for Real-Time Full-Ban

Guochen Yu 36 Dec 02, 2022
Learning from Guided Play: A Scheduled Hierarchical Approach for Improving Exploration in Adversarial Imitation Learning Source Code

Learning from Guided Play: A Scheduled Hierarchical Approach for Improving Exploration in Adversarial Imitation Learning Trevor Ablett*, Bryan Chan*,

STARS Laboratory 8 Sep 14, 2022
PyTorch Implementation of Spatially Consistent Representation Learning(SCRL)

Spatially Consistent Representation Learning (CVPR'21) Official PyTorch implementation of Spatially Consistent Representation Learning (SCRL). This re

Kakao Brain 102 Nov 03, 2022
Embeddinghub is a database built for machine learning embeddings.

Embeddinghub is a database built for machine learning embeddings.

Featureform 1.2k Jan 01, 2023
code for paper "Does Unsupervised Architecture Representation Learning Help Neural Architecture Search?"

Does Unsupervised Architecture Representation Learning Help Neural Architecture Search? Code for paper: Does Unsupervised Architecture Representation

39 Dec 17, 2022
Unsupervised captioning - Code for Unsupervised Image Captioning

Unsupervised Image Captioning by Yang Feng, Lin Ma, Wei Liu, and Jiebo Luo Introduction Most image captioning models are trained using paired image-se

Yang Feng 207 Dec 24, 2022
Code for the paper "Learning-Augmented Algorithms for Online Steiner Tree"

Learning-Augmented Algorithms for Online Steiner Tree This is the code for the paper "Learning-Augmented Algorithms for Online Steiner Tree". Requirem

0 Dec 09, 2021
HW3 ― GAN, ACGAN and UDA

HW3 ― GAN, ACGAN and UDA In this assignment, you are given datasets of human face and digit images. You will need to implement the models of both GAN

grassking100 1 Dec 13, 2021
PyTorch Implementation of DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs

DiffGAN-TTS - PyTorch Implementation PyTorch implementation of DiffGAN-TTS: High

Keon Lee 157 Jan 01, 2023
simple demo codes for Learning to Teach with Dynamic Loss Functions

Learning to Teach with Dynamic Loss Functions This repo contains the simple demo for the NeurIPS-18 paper: Learning to Teach with Dynamic Loss Functio

Lijun Wu 15 Dec 30, 2021
Heart Arrhythmia Classification

This program takes and input of an ECG in European Data Format (EDF) and outputs the classification for heartbeats into normal vs different types of arrhythmia . It uses a deep learning model for cla

4 Nov 02, 2022
Train Yolov4 using NBX-Jobs

yolov4-trainer-nbox Train Yolov4 using NBX-Jobs. Use the powerfull functionality available in nbox-SDK repo to train a tiny-Yolo v4 model on Pascal VO

Yash Bonde 1 Jan 12, 2022
SPT_LSA_ViT - Implementation for Visual Transformer for Small-size Datasets

Vision Transformer for Small-Size Datasets Seung Hoon Lee and Seunghyun Lee and Byung Cheol Song | Paper Inha University Abstract Recently, the Vision

Lee SeungHoon 87 Jan 01, 2023
A study project using the AA-RMVSNet to reconstruct buildings from multiple images

3d-building-reconstruction This is part of a study project using the AA-RMVSNet to reconstruct buildings from multiple images. Introduction It is exci

17 Oct 17, 2022