Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks

Related tags

Deep LearningSSTNet
Overview

SSTNet

PWC PWC

overview Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks(ICCV2021) by Zhihao Liang, Zhihao Li, Songcen Xu, Mingkui Tan, Kui Jia*. (*) Corresponding author. [arxiv]

Introduction

Instance segmentation in 3D scenes is fundamental in many applications of scene understanding. It is yet challenging due to the compound factors of data irregularity and uncertainty in the numbers of instances. State-of-the-art methods largely rely on a general pipeline that first learns point-wise features discriminative at semantic and instance levels, followed by a separate step of point grouping for proposing object instances. While promising, they have the shortcomings that (1) the second step is not supervised by the main objective of instance segmentation, and (2) their point-wise feature learning and grouping are less effective to deal with data irregularities, possibly resulting in fragmented segmentations. To address these issues, we propose in this work an end-to-end solution of Semantic Superpoint Tree Network (SSTNet) for proposing object instances from scene points. Key in SSTNet is an intermediate, semantic superpoint tree (SST), which is constructed based on the learned semantic features of superpoints, and which will be traversed and split at intermediate tree nodes for proposals of object instances. We also design in SSTNet a refinement module, termed CliqueNet, to prune superpoints that may be wrongly grouped into instance proposals.

Installation

Requirements

  • Python 3.8.5
  • Pytorch 1.7.1
  • torchvision 0.8.2
  • CUDA 11.1

then install the requirements:

pip install -r requirements.txt

SparseConv

For the SparseConv, please refer PointGroup's spconv to install.

Extension

This project is based on our Gorilla-Lab deep learning toolkit - gorilla-core and 3D toolkit gorilla-3d.

For gorilla-core, you can install it by running:

pip install gorilla-core==0.2.7.6

or building from source(recommend)

git clone https://github.com/Gorilla-Lab-SCUT/gorilla-core
cd gorilla-core
python setup.py install(develop)

For gorilla-3d, you should install it by building from source:

git clone https://github.com/Gorilla-Lab-SCUT/gorilla-3d
cd gorilla-3d
python setup.py develop

Tip: for high-version torch, the BuildExtension may fail by using ninja to build the compile system. If you meet this problem, you can change the BuildExtension in cmdclass={"build_ext": BuildExtension} as cmdclass={"build_ext": BuildExtension}.with_options(use_ninja=False)

Otherwise, this project also need other extension, we use the pointgroup_ops to realize voxelization and use the segmentator to generate superpoints for scannet scene. we use the htree to construct the Semantic Superpoint Tree and the hierarchical node-inheriting relations is realized based on the modified cluster.hierarchy.linkage function from scipy.

  • For pointgroup_ops, we modified the package from PointGroup to let its function calls get rid of the dependence on absolute paths. You can install it by running:
    conda install -c bioconda google-sparsehash 
    cd $PROJECT_ROOT$
    cd sstnet/lib/pointgroup_ops
    python setup.py develop
    Then, you can call the function like:
    import pointgroup_ops
    pointgroup_ops.voxelization
    >>> <function Voxelization.apply>
  • For htree, it can be seen as a supplement to the treelib python package, and I abstract the SST through both of them. You can install it by running:
    cd $PROJECT_ROOT$
    cd sstnet/lib/htree
    python setup.py install

    Tip: The interaction between this piece of code and treelib is a bit messy. I lack time to organize it, which may cause some difficulties for someone in understanding. I am sorry for this. At the same time, I also welcome people to improve it.

  • For cluster, it is originally a sub-module in scipy, the SST construction requires the cluster.hierarchy.linkage to be implemented. However, the origin implementation do not consider the sizes of clustering nodes (each superpoint contains different number of points). To this end, we modify this function and let it support the property mentioned above. So, for used, you can install it by running:
    cd $PROJECT_ROOT$
    cd sstnet/lib/cluster
    python setup.py install
  • For segmentator, please refer here to install. (We wrap the segmentator in ScanNet)

Data Preparation

Please refer to the README.md in data/scannetv2 to realize data preparation.

Training

CUDA_VISIBLE_DEVICES=0 python train.py --config config/default.yaml

You can start a tensorboard session by

tensorboard --logdir=./log --port=6666

Tip: For the directory of logging, please refer the implementation of function gorilla.collect_logger.

Inference and Evaluation

CUDA_VISIBLE_DEVICES=0 python test.py --config config/default.yaml --pretrain pretrain.pth --eval
  • --split is the evaluation split of dataset.
  • --save is the action to save instance segmentation results.
  • --eval is the action to evaluate the segmentation results.
  • --semantic is the action to evaluate semantic segmentation only (work on the --eval mode).
  • --log-file is to define the logging file to save evaluation result (default please to refer the gorilla.collect_logger).
  • --visual is the action to save visualization of instance segmentation. (It will be mentioned in the next partion.)

Results on ScanNet Benchmark

Rank 1st on the ScanNet benchmark benchmark

Pretrained

We provide a pretrained model trained on ScanNet(v2) dataset. [Google Drive] [Baidu Cloud] (提取码:f3az) Its performance on ScanNet(v2) validation set is 49.4/64.9/74.4 in terms of mAP/mAP50/mAP25.

Acknowledgement

This repo is built upon several repos, e.g., PointGroup, spconv and ScanNet.

Contact

If you have any questions or suggestions about this repo or paper, please feel free to contact me in issue or email ([email protected]).

TODO

  • Distributed training(not verification)
  • Batch inference
  • Multi-processing for getting superpoints

Citation

If you find this work useful in your research, please cite:

@misc{liang2021instance,
      title={Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks}, 
      author={Zhihao Liang and Zhihao Li and Songcen Xu and Mingkui Tan and Kui Jia},
      year={2021},
      eprint={2108.07478},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
Owner
Research lab focusing on CV, ML, and AI
Patch-Based Deep Autoencoder for Point Cloud Geometry Compression

Patch-Based Deep Autoencoder for Point Cloud Geometry Compression Overview The ever-increasing 3D application makes the point cloud compression unprec

17 Dec 05, 2022
Deep Learning for Natural Language Processing SS 2021 (TU Darmstadt)

Deep Learning for Natural Language Processing SS 2021 (TU Darmstadt) Task Training huge unsupervised deep neural networks yields to strong progress in

Oliver Hahn 1 Jan 26, 2022
Air Quality Prediction Using LSTM

AirQualityPredictionUsingLSTM In this Repo, i present to you the winning solution of smart gujarat hackathon 2019 where the task was to predict the qu

Deepak Nandwani 2 Dec 13, 2022
a dnn ai project to classify which food people are eating on audio recordings

Deep Learning - EAT Challenge About This project is part of an AI challenge of the DeepLearning course 2021 at the University of Augsburg. The objecti

Marco Tröster 1 Oct 24, 2021
Official Pytorch implementation of "DivCo: Diverse Conditional Image Synthesis via Contrastive Generative Adversarial Network" (CVPR'21)

DivCo: Diverse Conditional Image Synthesis via Contrastive Generative Adversarial Network Pytorch implementation for our DivCo. We propose a simple ye

64 Nov 22, 2022
Generative Handwriting using LSTM Mixture Density Network with TensorFlow

Generative Handwriting Demo using TensorFlow An attempt to implement the random handwriting generation portion of Alex Graves' paper. See my blog post

hardmaru 686 Nov 24, 2022
Code for ICCV 2021 paper: ARAPReg: An As-Rigid-As Possible Regularization Loss for Learning Deformable Shape Generators..

ARAPReg Code for ICCV 2021 paper: ARAPReg: An As-Rigid-As Possible Regularization Loss for Learning Deformable Shape Generators.. Installation The cod

Bo Sun 132 Nov 28, 2022
Conditional Gradients For The Approximately Vanishing Ideal

Conditional Gradients For The Approximately Vanishing Ideal Code for the paper: Wirth, E., and Pokutta, S. (2022). Conditional Gradients for the Appro

IOL Lab @ ZIB 0 May 25, 2022
Vision Transformer and MLP-Mixer Architectures

Vision Transformer and MLP-Mixer Architectures Update (2.7.2021): Added the "When Vision Transformers Outperform ResNets..." paper, and SAM (Sharpness

Google Research 6.4k Jan 04, 2023
Code for MarioNette: Self-Supervised Sprite Learning, in NeurIPS 2021

MarioNette | Webpage | Paper | Video MarioNette: Self-Supervised Sprite Learning Dmitriy Smirnov, Michaël Gharbi, Matthew Fisher, Vitor Guizilini, Ale

Dima Smirnov 28 Nov 18, 2022
OpenIPDM is a MATLAB open-source platform that stands for infrastructures probabilistic deterioration model

Open-Source Toolbox for Infrastructures Probabilistic Deterioration Modelling OpenIPDM is a MATLAB open-source platform that stands for infrastructure

CIVML 0 Jan 20, 2022
CSAC - Collaborative Semantic Aggregation and Calibration for Separated Domain Generalization

CSAC Introduction This repository contains the implementation code for paper: Co

ScottYuan 5 Jul 22, 2022
NAS Benchmark in "Prioritized Architecture Sampling with Monto-Carlo Tree Search", CVPR2021

NAS-Bench-Macro This repository includes the benchmark and code for NAS-Bench-Macro in paper "Prioritized Architecture Sampling with Monto-Carlo Tree

35 Jan 03, 2023
Image-Scaling Attacks and Defenses

Image-Scaling Attacks & Defenses This repository belongs to our publication: Erwin Quiring, David Klein, Daniel Arp, Martin Johns and Konrad Rieck. Ad

Erwin Quiring 163 Nov 21, 2022
Technical Indicators implemented in Python only using Numpy-Pandas as Magic - Very Very Fast! Very tiny! Stock Market Financial Technical Analysis Python library . Quant Trading automation or cryptocoin exchange

MyTT Technical Indicators implemented in Python only using Numpy-Pandas as Magic - Very Very Fast! to Stock Market Financial Technical Analysis Python

dev 34 Dec 27, 2022
CLOOB: Modern Hopfield Networks with InfoLOOB Outperform CLIP

CLOOB: Modern Hopfield Networks with InfoLOOB Outperform CLIP Andreas Fürst* 1, Elisabeth Rumetshofer* 1, Viet Tran1, Hubert Ramsauer1, Fei Tang3, Joh

Institute for Machine Learning, Johannes Kepler University Linz 133 Jan 04, 2023
The GitHub repository for the paper: “Time Series is a Special Sequence: Forecasting with Sample Convolution and Interaction“.

SCINet This is the original PyTorch implementation of the following work: Time Series is a Special Sequence: Forecasting with Sample Convolution and I

386 Jan 01, 2023
Official PyTorch implementation of "Contrastive Learning from Extremely Augmented Skeleton Sequences for Self-supervised Action Recognition" in AAAI2022.

AimCLR This is an official PyTorch implementation of "Contrastive Learning from Extremely Augmented Skeleton Sequences for Self-supervised Action Reco

Gty 44 Dec 17, 2022
“Data Augmentation for Cross-Domain Named Entity Recognition” (EMNLP 2021)

Data Augmentation for Cross-Domain Named Entity Recognition Authors: Shuguang Chen, Gustavo Aguilar, Leonardo Neves and Thamar Solorio This repository

<a href=[email protected]"> 18 Sep 10, 2022
Identify the emotion of multiple speakers in an Audio Segment

MevonAI - Speech Emotion Recognition Identify the emotion of multiple speakers in a Audio Segment Report Bug · Request Feature Try the Demo Here Table

Suyash More 110 Dec 03, 2022