Public repository of the 3DV 2021 paper "Generative Zero-Shot Learning for Semantic Segmentation of 3D Point Clouds"

Related tags

Deep Learning3DGenZ
Overview

Generative Zero-Shot Learning for Semantic Segmentation of 3D Point Clouds

Björn Michele1), Alexandre Boulch1), Gilles Puy1), Maxime Bucher1) and Renaud Marlet1)2)

1) Valeo.ai 2)LIGM, Ecole des Ponts, Univ Gustave Eiffel, CNRS, Marne-la-Vallée, Franc

Accepted at 3DV 2021
Arxiv: Paper and Supp.
Poster or Presentation

Abstract: While there has been a number of studies on Zero-Shot Learning (ZSL) for 2D images, its application to 3D data is still recent and scarce, with just a few methods limited to classification. We present the first generative approach for both ZSL and Generalized ZSL (GZSL) on 3D data, that can handle both classification and, for the first time, semantic segmentation. We show that it reaches or outperforms the state of the art on ModelNet40 classification for both inductive ZSL and inductive GZSL. For semantic segmentation, we created three benchmarks for evaluating this new ZSL task, using S3DIS, ScanNet and SemanticKITTI. Our experiments show that our method outperforms strong baselines, which we additionally propose for this task.

If you want to cite this work:

@inproceedings{michele2021generative,
  title={Generative Zero-Shot Learning for Semantic Segmentation of {3D} Point Cloud},
  author={Michele, Bj{\"o}rn and Boulch, Alexandre and Puy, Gilles and Bucher, Maxime and Marlet, Renaud},
  booktitle={International Conference on 3D Vision (3DV)},
  year={2021}

Code

We provide in this repository the code and the pretrained models for the semantic segmentation tasks on SemanticKITTI and ScanNet.

To-Do:

  • We will add more experiments in the future (You could "watch" the repo to stay updated).

Code Semantic Segmentation

Installation

Dependencies: Please see requirements.txt for all needed code libraries. Tested with: Pytorch 1.6.0 and 1.7.1 (both Cuda 10.1). As torch-geometric is needed Pytoch >= 1.4.0 is required.

  1. Clone this repository.

  2. Download and/or install the backbones (ConvPoint is also necessary for our adaption of FKAConv. More information: ConvPoint, FKAConv, KP-Conv).

    • For ConvPoint:
    cd 3DGenZ/genz3d/convpoint/convpoint/knn
    python3 setup.py install --home="."
    
    • For FKAConv:
    cd 3DGenZ/genz3d/fkaconv
    pip install -ve . 
    
  3. Download the datasets.

    • For an out of the box start we recommend the following folder structure.
    ~/3DGenZ
    ~/data/scannet/
    ~/data/semantic_kitti/
    
  4. Download the semantic word embeddings and the pretrained backbones.

    • Place the semantic word embeddings in
    3DGenZ/genz3d/word_representations/
    
    • For SN, the pre-trained backbone model and the config file, are placed in
    3DGenZ/genz3d/fkaconv/examples/scannet/FKAConv_scannet_ZSL4
    

    The complete ZSL-trained model cpkt is placed in (create the folder if necessary)

    3DGenZ/genz3d/seg/run/scannet/
    
    • For SK, the pre-trained backbone-model, the "Log-..." folder is placed in
    3DGenZ/genz3d/kpconv/results
    

    And the complete ZSL-trained model ckpt is placed in

    3DGenZ/genz3d/seg/run/sk
    

Run training and evalutation

  1. Training (Classifier layer): In 3DGenZ/genz3d/seg/ you find for each of the datasets a folder with scripts to run the generator and classificator training.(see: SN,SK)
    • Alternatively, you can use the pretrained models from us.
  2. Evalutation: Is done with the evaluation functions of the backbones. (see: SN_eval, KP-Conv_eval)

Backbones

For the datasets we used different backbones, for which we highly rely on their code basis. In order to adapt them to the ZSL setting we made the change that during the backbone training no crops of point clouds with unseen classes are shown (if there is a single unseen class

  • ConvPoint [1] for the S3DIS dataset (and also partly used for the ScanNet dataset).
  • FKAConv [2] for the ScanNet dataset.
  • KPConv [3] for the SemanticKITTI dataset.

Datasets

For semantic segmentation we did experiments on 3 datasets.

  • SemanticKITTI [4][5].
  • S3DIS [6].
  • ScanNet[7].

Acknowledgements

For the Generator Training we use parts of the code basis of ZS3.
For the backbones we use the code of ConvPoint, FKAConv and KPConv.

References

[1] Boulch, A. (2020). ConvPoint: Continuous convolutions for point cloud processing. Computers & Graphics, 88, 24-34.
[2] Boulch, A., Puy, G., & Marlet, R. (2020). FKAConv: Feature-kernel alignment for point cloud convolution. In Proceedings of the Asian Conference on Computer Vision.
[3] Thomas, H., Qi, C. R., Deschaud, J. E., Marcotegui, B., Goulette, F., & Guibas, L. J. (2019). Kpconv: Flexible and deformable convolution for point clouds. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 6411-6420).
[4] Behley, J., Garbade, M., Milioto, A., Quenzel, J., Behnke, S., Stachniss, C., & Gall, J. (2019). Semantickitti: A dataset for semantic scene understanding of lidar sequences. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 9297-9307).
[5] Geiger, A., Lenz, P., & Urtasun, R. (2012, June). Are we ready for autonomous driving? the kitti vision benchmark suite. In 2012 IEEE conference on computer vision and pattern recognition (pp. 3354-3361). IEEE.
[6] Armeni, I., Sener, O., Zamir, A. R., Jiang, H., Brilakis, I., Fischer, M., & Savarese, S. (2016). 3d semantic parsing of large-scale indoor spaces. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1534-1543).
[7] Dai, A., Chang, A. X., Savva, M., Halber, M., Funkhouser, T., & Nießner, M. (2017). Scannet: Richly-annotated 3d reconstructions of indoor scenes. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 5828-5839).

Updates

9.12.2021 Initial Code release

Licence

3DGenZ is released under the Apache 2.0 license.

The folder 3DGenZ/genz3d/kpconv includes large parts of code taken from KP-Conv and is therefore distributed under the MIT Licence. See the LICENSE for this folder.

The folder 3DGenZ/genz3d/seg/utils also includes files taken from https://github.com/jfzhang95/pytorch-deeplab-xception and is therefore also distributed under the MIT License. See the LICENSE for these files.

Owner
valeo.ai
We are an international team based in Paris, conducting AI research for Valeo automotive applications, in collaboration with world-class academics.
valeo.ai
Official Implementation for "StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery" (ICCV 2021 Oral)

StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery (ICCV 2021 Oral) Run this model on Replicate Optimization: Global directions: Mapper: Check ou

3.3k Jan 05, 2023
Python PID Tuner - Makes a model of the System from a Process Reaction Curve and calculates PID Gains

PythonPID_Tuner_SOPDT Step 1: Takes a Process Reaction Curve in csv format - assumes data at 100ms interval (column names CV and PV) Step 2: Makes a r

1 Jan 18, 2022
Using machine learning to predict and analyze high and low reader engagement for New York Times articles posted to Facebook.

How The New York Times can increase Engagement on Facebook Using machine learning to understand characteristics of news content that garners "high" Fa

Jessica Miles 0 Sep 16, 2021
Simple image captioning model - CLIP prefix captioning.

CLIP prefix captioning. Inference Notebook: 🥳 New: 🥳 Our technical papar is finally out! Official implementation for the paper "ClipCap: CLIP Prefix

688 Jan 04, 2023
Implementation of Barlow Twins paper

barlowtwins PyTorch Implementation of Barlow Twins paper: Barlow Twins: Self-Supervised Learning via Redundancy Reduction This is currently a work in

IgorSusmelj 86 Dec 20, 2022
A deep-learning pipeline for segmentation of ambiguous microscopic images.

Welcome to Official repository of deepflash2 - a deep-learning pipeline for segmentation of ambiguous microscopic images. Quick Start in 30 seconds se

Matthias Griebel 39 Dec 19, 2022
DGCNN - Dynamic Graph CNN for Learning on Point Clouds

DGCNN is the author's re-implementation of Dynamic Graph CNN, which achieves state-of-the-art performance on point-cloud-related high-level tasks including category classification, semantic segmentat

Wang, Yue 1.3k Dec 26, 2022
An official repository for Paper "Uformer: A General U-Shaped Transformer for Image Restoration".

Uformer: A General U-Shaped Transformer for Image Restoration Zhendong Wang, Xiaodong Cun, Jianmin Bao and Jianzhuang Liu Paper: https://arxiv.org/abs

Zhendong Wang 497 Dec 22, 2022
Dynamic View Synthesis from Dynamic Monocular Video

Dynamic View Synthesis from Dynamic Monocular Video Project Website | Video | Paper Dynamic View Synthesis from Dynamic Monocular Video Chen Gao, Ayus

Chen Gao 139 Dec 28, 2022
Repository for MeshTalk supplemental material and code once the (already approved) 16 GHS captures our lab will make publicly available are released.

meshtalk This repository contains code to run MeshTalk for face animation from audio. If you use MeshTalk, please cite @inproceedings{richard2021mesht

Meta Research 221 Jan 06, 2023
The spiritual successor to knockknock for PyTorch Lightning, get notified when your training ends

Who's there? The spiritual successor to knockknock for PyTorch Lightning, to get a notification when your training is complete or when it crashes duri

twsl 70 Oct 06, 2022
Analysis of Smiles through reservoir sampling & RDkit

Analysis of Smiles through reservoir sampling and machine learning (under development). This is a simple project that includes two Jupyter files for t

Aurimas A. Nausėdas 6 Aug 30, 2022
Pytorch implementation of MixNMatch

MixNMatch: Multifactor Disentanglement and Encoding for Conditional Image Generation [Paper] Yuheng Li, Krishna Kumar Singh, Utkarsh Ojha, Yong Jae Le

910 Dec 30, 2022
An official TensorFlow implementation of “CLCC: Contrastive Learning for Color Constancy” accepted at CVPR 2021.

CLCC: Contrastive Learning for Color Constancy (CVPR 2021) Yi-Chen Lo*, Chia-Che Chang*, Hsuan-Chao Chiu, Yu-Hao Huang, Chia-Ping Chen, Yu-Lin Chang,

Yi-Chen (Howard) Lo 58 Dec 17, 2022
Official Pytorch Implementation of Unsupervised Image Denoising with Frequency Domain Knowledge

Unsupervised Image Denoising with Frequency Domain Knowledge (BMVC 2021 Oral) : Official Project Page This repository provides the official PyTorch im

Donggon Jang 12 Sep 26, 2022
PyTorch implementation of the paper: "Preference-Adaptive Meta-Learning for Cold-Start Recommendation", IJCAI, 2021.

PAML PyTorch implementation of the paper: "Preference-Adaptive Meta-Learning for Cold-Start Recommendation", IJCAI, 2021. (Continuously updating ) Int

15 Nov 18, 2022
PyTorch code for the paper "FIERY: Future Instance Segmentation in Bird's-Eye view from Surround Monocular Cameras"

FIERY This is the PyTorch implementation for inference and training of the future prediction bird's-eye view network as described in: FIERY: Future In

Wayve 406 Dec 24, 2022
Deep Image Matting implementation in PyTorch

Deep Image Matting Deep Image Matting paper implementation in PyTorch. Differences "fc6" is dropped. Indices pooling. "fc6" is clumpy, over 100 millio

Yang Liu 724 Dec 27, 2022
PyTorch implementation of MuseMorphose, a Transformer-based model for music style transfer.

MuseMorphose This repository contains the official implementation of the following paper: Shih-Lun Wu, Yi-Hsuan Yang MuseMorphose: Full-Song and Fine-

Yating Music, Taiwan AI Labs 142 Jan 08, 2023
PyTorch implementation of InstaGAN: Instance-aware Image-to-Image Translation

InstaGAN: Instance-aware Image-to-Image Translation Warning: This repo contains a model which has potential ethical concerns. Remark that the task of

Sangwoo Mo 827 Dec 29, 2022