Implementation of CVPR'21: RfD-Net: Point Scene Understanding by Semantic Instance Reconstruction

Overview

RfD-Net [Project Page] [Paper] [Video]

RfD-Net: Point Scene Understanding by Semantic Instance Reconstruction
Yinyu Nie, Ji Hou, Xiaoguang Han, Matthias Nießner
In CVPR, 2021.

points.png pred.png

From an incomplete point cloud of a 3D scene (left), our method learns to jointly understand the 3D objects and reconstruct instance meshes as the output (right).


Install

  1. This implementation uses Python 3.6, Pytorch1.7.1, cudatoolkit 11.0. We recommend to use conda to deploy the environment.

    • Install with conda:
    conda env create -f environment.yml
    conda activate rfdnet
    
    • Install with pip:
    pip install -r requirements.txt
    
  2. Next, compile the external libraries by

    python setup.py build_ext --inplace
    
  3. Install PointNet++ by

    cd external/pointnet2_ops_lib
    pip install .
    

Demo

The pretrained model can be downloaded here. Put the pretrained model in the directory as below

out/pretrained_models/pretrained_weight.pth

A demo is illustrated below to see how our method works.

cd RfDNet
python main.py --config configs/config_files/ISCNet_test.yaml --mode demo --demo_path demo/inputs/scene0549_00.off

VTK is used here to visualize the 3D scenes. The outputs will be saved under 'demo/outputs'. You can also play with your toy with this script.

If everything goes smooth, there will be a GUI window popped up and you can interact with the scene as below. screenshot_demo.png

If you run it on machines without X display server, you can use the offscreen mode by setting offline=True in demo.py. The rendered image will be saved in demo/outputs/some_scene_id/pred.png.


Prepare Data

In our paper, we use the input point cloud from the ScanNet dataset, and the annotated instance CAD models from the Scan2CAD dataset. Scan2CAD aligns the object CAD models from ShapeNetCore.v2 to each object in ScanNet, and we use these aligned CAD models as the ground-truth.

Preprocess ScanNet and Scan2CAD data

You can either directly download the processed samples [link] to the directory below (recommended)

datasets/scannet/processed_data/

or

  1. Ask for the ScanNet dataset and download it to
    datasets/scannet/scans
    
  2. Ask for the Scan2CAD dataset and download it to
    datasets/scannet/scan2cad_download_link
    
  3. Preprocess the ScanNet and Scan2CAD dataset for training by
    cd RfDNet
    python utils/scannet/gen_scannet_w_orientation.py
    
Preprocess ShapeNet data

You can either directly download the processed data [link] and extract them to datasets/ShapeNetv2_data/ as below

datasets/ShapeNetv2_data/point
datasets/ShapeNetv2_data/pointcloud
datasets/ShapeNetv2_data/voxel
datasets/ShapeNetv2_data/watertight_scaled_simplified

or

  1. Download ShapeNetCore.v2 to the path below

    datasets/ShapeNetCore.v2
    
  2. Process ShapeNet models into watertight meshes by

    python utils/shapenet/1_fuse_shapenetv2.py
    
  3. Sample points on ShapeNet models for training (similar to Occupancy Networks).

    python utils/shapenet/2_sample_mesh.py --resize --packbits --float16
    
  4. There are usually 100K+ points per object mesh. We simplify them to speed up our testing and visualization by

    python utils/shapenet/3_simplify_fusion.py --in_dir datasets/ShapeNetv2_data/watertight_scaled --out_dir datasets/ShapeNetv2_data/watertight_scaled_simplified
    
Verify preprocessed data

After preprocessed the data, you can run the visualization script below to check if they are generated correctly.

  • Visualize ScanNet+Scan2CAD+ShapeNet samples by

    python utils/scannet/visualization/vis_gt.py
    

    A VTK window will be popped up like below.

    verify.png

Training, Generating and Evaluation

We use the configuration file (see 'configs/config_files/****.yaml') to fully control the training/testing/generating process. You can check a template at configs/config_files/ISCNet.yaml.

Training

We firstly pretrain our detection module and completion module followed by a joint refining. You can follow the process below.

  1. Pretrain the detection module by

    python main.py --config configs/config_files/ISCNet_detection.yaml --mode train
    

    It will save the detection module weight at out/iscnet/a_folder_with_detection_module/model_best.pth

  2. Copy the weight path of detection module (see 1.) into configs/config_files/ISCNet_completion.yaml as

    weight: ['out/iscnet/a_folder_with_detection_module/model_best.pth']
    

    Then pretrain the completion module by

    python main.py --config configs/config_files/ISCNet_completion.yaml --mode train
    

    It will save the completion module weight at out/iscnet/a_folder_with_completion_module/model_best.pth

  3. Copy the weight path of completion module (see 2.) into configs/config_files/ISCNet.yaml as

    weight: ['out/iscnet/a_folder_with_completion_module/model_best.pth']
    

    Then jointly finetune RfD-Net by

    python main.py --config configs/config_files/ISCNet.yaml --mode train
    

    It will save the trained model weight at out/iscnet/a_folder_with_RfD-Net/model_best.pth

Generating

Copy the weight path of RfD-Net (see 3. above) into configs/config_files/ISCNet_test.yaml as

weight: ['out/iscnet/a_folder_with_RfD-Net/model_best.pth']

Run below to output all scenes in the test set.

python main.py --config configs/config_files/ISCNet_test.yaml --mode test

The 3D scenes for visualization are saved in the folder of out/iscnet/a_folder_with_generated_scenes/visualization. You can visualize a triplet of (input, pred, gt) following a demo below

python utils/scannet/visualization/vis_for_comparison.py 

If everything goes smooth, there will be three windows (corresponding to input, pred, gt) popped up by sequence as

Input Prediction Ground-truth

Evaluation

You can choose each of the following ways for evaluation.

  1. You can export all scenes above to calculate the evaluation metrics with any external library (for researchers who would like to unify the benchmark). Lower the dump_threshold in ISCNet_test.yaml in generation to enable more object proposals for mAP calculation (e.g. dump_threshold=0.05).

  2. In our evaluation, we voxelize the 3D scenes to keep consistent resolution with the baseline methods. To enable this,

    1. make sure the executable binvox are downloaded and configured as an experiment variable (e.g. export its path in ~/.bashrc for Ubuntu). It will be deployed by Trimesh.

    2. Change the ISCNet_test.yaml as below for evaluation.

       test:
         evaluate_mesh_mAP: True
       generation:
         dump_results: False
    

    Run below to report the evaluation results.

    python main.py --config configs/config_files/ISCNet_test.yaml --mode test
    

    The log file will saved in out/iscnet/a_folder_named_with_script_time/log.txt


Differences to the paper

  1. The original paper was implemented with Pytorch 1.1.0, and we reconfigure our code to fit with Pytorch 1.7.1.
  2. A post processing step to align the reconstructed shapes to the input scan is supported. We have verified that it can improve the evaluation performance by a small margin. You can switch on/off it following demo.py.
  3. A different learning rate scheduler is adopted. The learning rate decreases to 0.1x if there is no gain within 20 steps, which is much more efficient.

Citation

If you find our work helpful, please consider citing

@inproceedings{Nie_2021_CVPR,
    title={RfD-Net: Point Scene Understanding by Semantic Instance Reconstruction},
    author={Nie, Yinyu and Hou, Ji and Han, Xiaoguang and Nie{\ss}ner, Matthias},
    booktitle={Proc. Computer Vision and Pattern Recognition (CVPR), IEEE},
    year={2021}
}


License

RfD-Net is relased under the MIT License. See the LICENSE file for more details.

Owner
Yinyu Nie
currently a Ph.D. student at NCCA, Bournemouth University.
Yinyu Nie
Implementing SYNTHESIZER: Rethinking Self-Attention in Transformer Models using Pytorch

Implementing SYNTHESIZER: Rethinking Self-Attention in Transformer Models using Pytorch Reference Paper URL Author: Yi Tay, Dara Bahri, Donald Metzler

Myeongjun Kim 66 Nov 30, 2022
Official implementation of the paper Label-Efficient Semantic Segmentation with Diffusion Models

Label-Efficient Semantic Segmentation with Diffusion Models Official implementation of the paper Label-Efficient Semantic Segmentation with Diffusion

Yandex Research 355 Jan 06, 2023
Experiments for Fake News explainability project

fake-news-explainability Experiments for fake news explainability project This repository only contains the notebooks used to train the models and eva

Lorenzo Flores (Lj) 1 Dec 03, 2022
PyTorch implementation of Interpretable Explanations of Black Boxes by Meaningful Perturbation

PyTorch implementation of Interpretable Explanations of Black Boxes by Meaningful Perturbation The paper: https://arxiv.org/abs/1704.03296 What makes

Jacob Gildenblat 322 Dec 17, 2022
Official Implementation for Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation

Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation We present a generic image-to-image translation framework, pixel2style2pixel (pSp

2.8k Dec 30, 2022
OpenMMLab Computer Vision Foundation

English | 简体中文 Introduction MMCV is a foundational library for computer vision research and supports many research projects as below: MMCV: OpenMMLab

OpenMMLab 4.6k Jan 09, 2023
Usable Implementation of "Bootstrap Your Own Latent" self-supervised learning, from Deepmind, in Pytorch

Bootstrap Your Own Latent (BYOL), in Pytorch Practical implementation of an astoundingly simple method for self-supervised learning that achieves a ne

Phil Wang 1.4k Dec 29, 2022
Official repository of "BasicVSR++: Improving Video Super-Resolution with Enhanced Propagation and Alignment"

BasicVSR_PlusPlus (CVPR 2022) [Paper] [Project Page] [Code] This is the official repository for BasicVSR++. Please feel free to raise issue related to

Kelvin C.K. Chan 227 Jan 01, 2023
Given a 2D triangle mesh, we could randomly generate cloud points that fill in the triangle mesh

generate_cloud_points Given a 2D triangle mesh, we could randomly generate cloud points that fill in the triangle mesh. Run python disp_mesh.py Or you

Peng Yu 2 Dec 24, 2021
Learning Neural Painters Fast! using PyTorch and Fast.ai

The Joy of Neural Painting Learning Neural Painters Fast! using PyTorch and Fast.ai Blogpost with more details: The Joy of Neural Painting The impleme

Libre AI 72 Nov 10, 2022
An Open-Source Package for Information Retrieval.

OpenMatch An Open-Source Package for Information Retrieval. 😃 What's New Top Spot on TREC-COVID Challenge (May 2020, Round2) The twin goals of the ch

THUNLP 439 Dec 27, 2022
The source code for the Cutoff data augmentation approach proposed in this paper: "A Simple but Tough-to-Beat Data Augmentation Approach for Natural Language Understanding and Generation".

Cutoff: A Simple Data Augmentation Approach for Natural Language This repository contains source code necessary to reproduce the results presented in

Dinghan Shen 49 Dec 22, 2022
nnFormer: Interleaved Transformer for Volumetric Segmentation Code for paper "nnFormer: Interleaved Transformer for Volumetric Segmentation "

nnFormer: Interleaved Transformer for Volumetric Segmentation Code for paper "nnFormer: Interleaved Transformer for Volumetric Segmentation ". Please

jsguo 610 Dec 28, 2022
SimBERT升级版(SimBERTv2)!

RoFormer-Sim RoFormer-Sim,又称SimBERTv2,是我们之前发布的SimBERT模型的升级版。 介绍 https://kexue.fm/archives/8454 训练 tensorflow 1.14 + keras 2.3.1 + bert4keras 0.10.6 下载

318 Dec 31, 2022
A hifiasm fork for metagenome assembly using Hifi reads.

hifiasm_meta - de novo metagenome assembler, based on hifiasm, a haplotype-resolved de novo assembler for PacBio Hifi reads.

44 Jul 10, 2022
Theano is a Python library that allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. It can use GPUs and perform efficient symbolic differentiation.

============================================================================================================ `MILA will stop developing Theano https:

9.6k Jan 06, 2023
Code for ACL 21: Generating Query Focused Summaries from Query-Free Resources

marge This repository releases the code for Generating Query Focused Summaries from Query-Free Resources. Please cite the following paper [bib] if you

Yumo Xu 28 Nov 10, 2022
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows" on Semantic Segmentation.

Swin Transformer for Semantic Segmentation of satellite images This repo contains the supported code and configuration files to reproduce semantic seg

23 Oct 10, 2022
Continuous Conditional Random Field Convolution for Point Cloud Segmentation

CRFConv This repository is the implementation of "Continuous Conditional Random Field Convolution for Point Cloud Segmentation" 1. Setup 1) Building c

Fei Yang 8 Dec 08, 2022
Data, notebooks, and articles associated with the RSNA AI Deep Learning Lab at RSNA 2021

RSNA AI Deep Learning Lab 2021 Intro Welcome Deep Learners! This document provides all the information you need to participate in the RSNA AI Deep Lea

RSNA 65 Dec 16, 2022