Official implementation of the NeurIPS 2021 paper TransformerFusion

Last update: Dec 25, 2022

Overview

TransformerFusion: Monocular RGB Scene Reconstruction using Transformers

Project Page | Paper | Video

TransformerFusion: Monocular RGB Scene Reconstruction using Transformers
Aljaz Bozic, Pablo Palafox, Justus Thies, Angela Dai, Matthias Niessner
NeurIPS 2021

TODOs

Evaluation code and metrics (with ground truth data)
Model code (with pretrained checkpoint)
Test-time reconstruction code
Training (and evaluation) data preparation scripts

How to install the framework

Clone the repository with submodules:

git clone --recurse-submodules https://github.com/AljazBozic/TransformerFusion.git

Create the Conda environment:

conda env create -f environment.yml

Compile local C++/CUDA dependencies:

conda activate tf
cd csrc
python setup.py install

Evaluate the reconstructions

We evaluate the method's performance on the test scenes of the ScanNet dataset.

We compare scene reconstructions to ground truth meshes obtained by fusing RGB-D data. Since the ground truth meshes are incomplete, we additionally compute occlusion masks from the RGB-D scans, so that reconstructions that are more complete than the ground truth meshes are not penalized.
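To make the role of the occlusion mask concrete, here is a minimal sketch of masked point-based metrics. It is not the code in src/evaluation/eval.py: the function names, the per-point boolean mask `pred_visible`, and the 0.05 threshold are all assumptions made for illustration, and the brute-force nearest-neighbour search is only practical for small point sets.

```python
import numpy as np


def nearest_distances(src, dst):
    """For each point in src (N, 3), return the distance to its closest point in dst (M, 3)."""
    diff = src[:, None, :] - dst[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1)).min(axis=1)


def evaluate_points(pred, gt, pred_visible, threshold=0.05):
    """Hypothetical masked evaluation of sampled surface points.

    pred:         (N, 3) points sampled from the reconstruction
    gt:           (M, 3) points sampled from the ground truth mesh
    pred_visible: (N,) boolean array, True where the RGB-D scan observed the region
                  (assumed mask format, for illustration only)
    threshold:    distance below which a point counts as correct (assumed value)
    """
    # Drop predicted points in regions the RGB-D scan never observed,
    # so that extra completeness is not counted as an error.
    pred_kept = pred[pred_visible]
    if len(pred_kept) == 0 or len(gt) == 0:
        raise ValueError("need at least one visible predicted point and one ground truth point")

    acc = nearest_distances(pred_kept, gt)    # prediction -> ground truth
    comp = nearest_distances(gt, pred_kept)   # ground truth -> prediction

    precision = float((acc < threshold).mean())
    recall = float((comp < threshold).mean())
    denom = precision + recall
    f_score = 0.0 if denom == 0 else 2 * precision * recall / denom

    return {
        "accuracy": float(acc.mean()),
        "completeness": float(comp.mean()),
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
    }
```

In this sketch, a predicted point far from the ground truth lowers precision only when its mask entry is True; with the entry set to False it is ignored entirely.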

You can download both ground truth meshes and occlusion masks here. To evaluate your reconstructions, place the reconstructed meshes in data/reconstructions and extract the ground truth data to data/groundtruth. Each reconstruction must be named after its ScanNet test scene, e.g. scene0733_00.ply. The following script computes evaluation metrics over all provided scene meshes:

conda activate tf
python src/evaluation/eval.py
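Before running the evaluation, it can help to confirm that the files in data/reconstructions follow the expected naming. The sketch below is not part of the repository; the helper names are hypothetical, and the pattern (four digits, underscore, two digits) is inferred from the scene0733_00.ply example above.

```python
import re
from pathlib import Path

# Assumed naming pattern, inferred from the example scene0733_00.ply.
SCENE_NAME = re.compile(r"^scene\d{4}_\d{2}\.ply$")


def split_by_name(filenames):
    """Split file names into (valid, invalid) lists according to SCENE_NAME."""
    valid = sorted(n for n in filenames if SCENE_NAME.match(n))
    invalid = sorted(n for n in filenames if not SCENE_NAME.match(n))
    return valid, invalid


def check_reconstructions(directory="data/reconstructions"):
    """List the files in the directory and split them by whether the name matches."""
    names = [p.name for p in Path(directory).iterdir() if p.is_file()]
    return split_by_name(names)
```

Calling `check_reconstructions()` from the repository root returns the matching and non-matching file names, so misnamed meshes can be renamed before the metrics are computed.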

Citation

If you find our work useful in your research, please consider citing:

@article{bozic2021transformerfusion,
    title={TransformerFusion: Monocular RGB Scene Reconstruction using Transformers},
    author={Bozic, Aljaz and Palafox, Pablo and Thies, Justus and Dai, Angela and Niessner, Matthias},
    journal={Proc. Neural Information Processing Systems (NeurIPS)},
    year={2021}
}

Related work

Some other related work on monocular RGB reconstruction of indoor scenes:

License

The code from this repository is released under the MIT license.
