Hierarchical Aggregation for 3D Instance Segmentation (ICCV 2021)

Related tags

Deep LearningHAIS
Overview

HAIS

PWC PWC

Hierarchical Aggregation for 3D Instance Segmentation (ICCV 2021)

by Shaoyu Chen, Jiemin Fang, Qian Zhang, Wenyu Liu, Xinggang Wang*. (*) Corresponding author. [arXiv]


Introduction

  • HAIS is an efficient and concise bottom-up framework (NMS-free and single-forward) for point cloud instance segmentation. It adopts the hierarchical aggregation (point aggregation and set aggregation) to generate instances and the intra-instance prediction for outlier filtering and mask quality scoring.

Framework

Learderboard

  • High speed. Thanks to the NMS-free and single-forward inference design, HAIS achieves the best inference speed among all existing methods. HAIS only takes 206 ms on RTX 3090 and 339 ms on TITAN X.
Method Per-frame latency on TITAN X
ASIS 181913 ms
SGPN 158439 ms
3D-SIS 124490 ms
GSPN 12702 ms
3D-BoNet 9202 ms
GICN 8615 ms
OccuSeg 1904 ms
PointGroup 452 ms
HAIS 339 ms

[ICCV21 presentation]

Update

2021.9.30:

  • Code is released.
  • With better CUDA optimization, HAIS now only takes 339 ms on TITAN X, much better than the latency reported in the paper (410 ms on TITAN X).

Installation

1) Environment

  • Python 3.x
  • Pytorch 1.1 or higher
  • CUDA 9.2 or higher
  • gcc-5.4 or higher

Create a conda virtual environment and activate it.

conda create -n hais python=3.7
conda activate hais

2) Clone the repository.

git clone https://github.com/hustvl/HAIS.git --recursive

3) Install the requirements.

cd HAIS
pip install -r requirements.txt
conda install -c bioconda google-sparsehash 

4) Install spconv

  • Verify the version of spconv.

    spconv 1.0, compatible with CUDA < 11 and pytorch < 1.5, is already recursively cloned in HAIS/lib/spconv in step 2) by default.

    For higher version CUDA and pytorch, spconv 1.2 is suggested. Replace HAIS/lib/spconv with this fork of spconv.

git clone https://github.com/outsidercsy/spconv.git --recursive
  Note:  In the provided spconv 1.0 and 1.2, spconv\spconv\functional.py is modified to make grad_output contiguous. Make sure you use the modified spconv but not the original one. Or there would be some bugs of optimization.
  • Install the dependent libraries.
conda install libboost
conda install -c daleydeng gcc-5 # (optional, install gcc-5.4 in conda env)
  • Compile the spconv library.
cd HAIS/lib/spconv
python setup.py bdist_wheel
  • Intall the generated .whl file.
cd HAIS/lib/spconv/dist
pip install {wheel_file_name}.whl

5) Compile the external C++ and CUDA ops.

cd HAIS/lib/hais_ops
export CPLUS_INCLUDE_PATH={conda_env_path}/hais/include:$CPLUS_INCLUDE_PATH
python setup.py build_ext develop

{conda_env_path} is the location of the created conda environment, e.g., /anaconda3/envs.

Data Preparation

1) Download the ScanNet v2 dataset.

2) Put the data in the corresponding folders.

  • Copy the files [scene_id]_vh_clean_2.ply, [scene_id]_vh_clean_2.labels.ply, [scene_id]_vh_clean_2.0.010000.segs.json and [scene_id].aggregation.json into the dataset/scannetv2/train and dataset/scannetv2/val folders according to the ScanNet v2 train/val split.

  • Copy the files [scene_id]_vh_clean_2.ply into the dataset/scannetv2/test folder according to the ScanNet v2 test split.

  • Put the file scannetv2-labels.combined.tsv in the dataset/scannetv2 folder.

The dataset files are organized as follows.

HAIS
├── dataset
│   ├── scannetv2
│   │   ├── train
│   │   │   ├── [scene_id]_vh_clean_2.ply & [scene_id]_vh_clean_2.labels.ply & [scene_id]_vh_clean_2.0.010000.segs.json & [scene_id].aggregation.json
│   │   ├── val
│   │   │   ├── [scene_id]_vh_clean_2.ply & [scene_id]_vh_clean_2.labels.ply & [scene_id]_vh_clean_2.0.010000.segs.json & [scene_id].aggregation.json
│   │   ├── test
│   │   │   ├── [scene_id]_vh_clean_2.ply 
│   │   ├── scannetv2-labels.combined.tsv

3) Generate input files [scene_id]_inst_nostuff.pth for instance segmentation.

cd HAIS/dataset/scannetv2
python prepare_data_inst.py --data_split train
python prepare_data_inst.py --data_split val
python prepare_data_inst.py --data_split test

Training

CUDA_VISIBLE_DEVICES=0 python train.py --config config/hais_run1_scannet.yaml 

Inference

1) To evaluate on validation set,

  • prepare the .txt instance ground-truth files as the following.
cd dataset/scannetv2
python prepare_data_inst_gttxt.py
  • set split and eval in the config file as val and True.

  • Run the inference and evaluation code.

CUDA_VISIBLE_DEVICES=0 python test.py --config config/hais_run1_scannet.yaml --pretrain $PATH_TO_PRETRAIN_MODEL$

Pretrained model: Google Drive / Baidu Cloud [code: sh4t]. mAP/mAP50/mAP25 is 44.1/64.4/75.7.

2) To evaluate on test set,

  • Set (split, eval, save_instance) as (test, False, True).
  • Run the inference code. Prediction results are saved in HAIS/exp by default.
CUDA_VISIBLE_DEVICES=0 python test.py --config config/hais_run1_scannet.yaml --pretrain $PATH_TO_PRETRAIN_MODEL$

Visualization

We provide visualization tools based on Open3D (tested on Open3D 0.8.0).

pip install open3D==0.8.0
python visualize_open3d.py --data_path {} --prediction_path {} --data_split {} --room_name {} --task {}

Please refer to visualize_open3d.py for more details.

Acknowledgement

The code is based on PointGroup and spconv.

Contact

If you have any questions or suggestions about this repo, please feel free to contact me ([email protected]).

Citation

@InProceedings{Chen_2021_ICCV,
    author    = {Chen, Shaoyu and Fang, Jiemin and Zhang, Qian and Liu, Wenyu and Wang, Xinggang},
    title     = {Hierarchical Aggregation for 3D Instance Segmentation},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {15467-15476}
}
Owner
Hust Visual Learning Team
Hust Visual Learning Team belongs to the Artificial Intelligence Research Institute in the School of EIC in HUST
Hust Visual Learning Team
[ICCV'21] Neural Radiance Flow for 4D View Synthesis and Video Processing

NeRFlow [ICCV'21] Neural Radiance Flow for 4D View Synthesis and Video Processing Datasets The pouring dataset used for experiments can be download he

44 Dec 20, 2022
Lightweight Cuda Renderer with Python Wrapper.

pyRender Lightweight Cuda Renderer with Python Wrapper. Compile Change compile.sh line 5 to the glm library include path. This library can be download

Jingwei Huang 53 Dec 02, 2022
🕹️ Official Implementation of Conditional Motion In-betweening (CMIB) 🏃

Conditional Motion In-Betweening (CMIB) Official implementation of paper: Conditional Motion In-betweeening. Paper(arXiv) | Project Page | YouTube in-

Jihoon Kim 81 Dec 22, 2022
Final term project for Bayesian Machine Learning Lecture (XAI-623)

Mixquality_AL Final Term Project For Bayesian Machine Learning Lecture (XAI-623) Youtube Link The presentation is given in YoutubeLink Problem Formula

JeongEun Park 3 Jan 18, 2022
BitPack is a practical tool to efficiently save ultra-low precision/mixed-precision quantized models.

BitPack is a practical tool that can efficiently save quantized neural network models with mixed bitwidth.

Zhen Dong 36 Dec 02, 2022
Prevent `CUDA error: out of memory` in just 1 line of code.

🐨 Koila Koila solves CUDA error: out of memory error painlessly. Fix it with just one line of code, and forget it. 🚀 Features 🙅 Prevents CUDA error

RenChu Wang 1.7k Jan 02, 2023
Diagnostic tests for linguistic capacities in language models

LM diagnostics This repository contains the diagnostic datasets and experimental code for What BERT is not: Lessons from a new suite of psycholinguist

61 Jan 02, 2023
Code for our method RePRI for Few-Shot Segmentation. Paper at http://arxiv.org/abs/2012.06166

Region Proportion Regularized Inference (RePRI) for Few-Shot Segmentation In this repo, we provide the code for our paper : "Few-Shot Segmentation Wit

Malik Boudiaf 138 Dec 12, 2022
Facial recognition project

Facial recognition project documentation Project introduction This project is developed by linuxu. It is a face model recognition project developed ba

Jefferson 2 Dec 04, 2022
Strongly local p-norm-cut algorithms for semi-supervised learning and local graph clustering

Strongly local p-norm-cut algorithms for semi-supervised learning and local graph clustering

Meng Liu 2 Jul 19, 2022
SciFive: a text-text transformer model for biomedical literature

SciFive SciFive provided a Text-Text framework for biomedical language and natural language in NLP. Under the T5's framework and desrbibed in the pape

Long Phan 54 Dec 24, 2022
Project NII pytorch scripts

project-NII-pytorch-scripts By Xin Wang, National Institute of Informatics, since 2021 I am a new pytorch user. If you have any suggestions or questio

Yamagishi and Echizen Laboratories, National Institute of Informatics 184 Dec 23, 2022
A resource for learning about ML, DL, PyTorch and TensorFlow. Feedback always appreciated :)

A resource for learning about ML, DL, PyTorch and TensorFlow. Feedback always appreciated :)

Aladdin Persson 4.7k Jan 08, 2023
An pytorch implementation of Masked Autoencoders Are Scalable Vision Learners

An pytorch implementation of Masked Autoencoders Are Scalable Vision Learners This is a coarse version for MAE, only make the pretrain model, the fine

FlyEgle 214 Dec 29, 2022
Orchestrating Distributed Materials Acceleration Platform Tutorial

Orchestrating Distributed Materials Acceleration Platform Tutorial This tutorial for orchestrating distributed materials acceleration platform was pre

BIG-MAP 1 Jan 25, 2022
Does Oversizing Improve Prosumer Profitability in a Flexibility Market? - A Sensitivity Analysis using PV-battery System

Does Oversizing Improve Prosumer Profitability in a Flexibility Market? - A Sensitivity Analysis using PV-battery System The possibilities to involve

Babu Kumaran Nalini 0 Nov 19, 2021
DeepGNN is a framework for training machine learning models on large scale graph data.

DeepGNN Overview DeepGNN is a framework for training machine learning models on large scale graph data. DeepGNN contains all the necessary features in

Microsoft 45 Jan 01, 2023
Optimized primitives for collective multi-GPU communication

NCCL Optimized primitives for inter-GPU communication. Introduction NCCL (pronounced "Nickel") is a stand-alone library of standard communication rout

NVIDIA Corporation 2k Jan 09, 2023
Repository containing the PhD Thesis "Formal Verification of Deep Reinforcement Learning Agents"

Getting Started This repository contains the code used for the following publications: Probabilistic Guarantees for Safe Deep Reinforcement Learning (

Edoardo Bacci 5 Aug 31, 2022
PyTorch implementation of SCAFFOLD (Stochastic Controlled Averaging for Federated Learning, ICML 2020).

Scaffold-Federated-Learning PyTorch implementation of SCAFFOLD (Stochastic Controlled Averaging for Federated Learning, ICML 2020). Environment numpy=

KI 30 Dec 29, 2022