(CVPR 2022 - oral) Multi-View Depth Estimation by Fusing Single-View Depth Probability with Multi-View Geometry

Overview

Multi-View Depth Estimation by Fusing Single-View Depth Probability with Multi-View Geometry

Official implementation of the paper

Multi-View Depth Estimation by Fusing Single-View Depth Probability with Multi-View Geometry

CVPR 2022 [oral]

Gwangbin Bae, Ignas Budvytis, and Roberto Cipolla

[arXiv]

We present MaGNet (Monocular and Geometric Network), a novel framework for fusing single-view depth probability with multi-view geometry, to improve the accuracy, robustness and efficiency of multi-view depth estimation. For each frame, MaGNet estimates a single-view depth probability distribution, parameterized as a pixel-wise Gaussian. The distribution estimated for the reference frame is then used to sample per-pixel depth candidates. Such probabilistic sampling enables the network to achieve higher accuracy while evaluating fewer depth candidates. We also propose depth consistency weighting for the multi-view matching score, to ensure that the multi-view depth is consistent with the single-view predictions. The proposed method achieves state-of-the-art performance on ScanNet, 7-Scenes and KITTI. Qualitative evaluation demonstrates that our method is more robust against challenging artifacts such as texture-less/reflective surfaces and moving objects.

Datasets

We evaluated MaGNet on ScanNet, 7-Scenes and KITTI

ScanNet

  • In order to download ScanNet, you should submit an agreement to the Terms of Use. Please follow the instructions in this link.
  • The folder should be organized as

/path/to/ScanNet
/path/to/ScanNet/scans
/path/to/ScanNet/scans/scene0000_00 ...
/path/to/ScanNet/scans_test
/path/to/ScanNet/scans_test/scene0707_00 ...

7-Scenes

  • Download all seven scenes (Chess, Fire, Heads, Office, Pumpkin, RedKitchen, Stairs) from this link.
  • The folder should be organized as:

/path/to/SevenScenes
/path/to/SevenScenes/chess ...

KITTI

  • Download raw data from this link.
  • Download depth maps from this link
  • The folder should be organized as:

/path/to/KITTI
/path/to/KITTI/rawdata
/path/to/KITTI/rawdata/2011_09_26 ...
/path/to/KITTI/train
/path/to/KITTI/train/2011_09_26_drive_0001_sync ...
/path/to/KITTI/val
/path/to/KITTI/val/2011_09_26_drive_0002_sync ...

Download model weights

Download model weights by

python ckpts/download.py

If some files are not downloaded properly, download them manually from this link and place the files under ./ckpts.

Install dependencies

We recommend using a virtual environment.

python3.6 -m venv --system-site-packages ./venv
source ./venv/bin/activate

Install the necessary dependencies by

python3.6 -m pip install -r requirements.txt

Test scripts

If you wish to evaluate the accuracy of our D-Net (single-view), run

python test_DNet.py ./test_scripts/dnet/scannet.txt
python test_DNet.py ./test_scripts/dnet/7scenes.txt
python test_DNet.py ./test_scripts/dnet/kitti_eigen.txt
python test_DNet.py ./test_scripts/dnet/kitti_official.txt

You should get the following results:

Dataset abs_rel abs_diff sq_rel rmse rmse_log irmse log_10 silog a1 a2 a3 NLL
ScanNet 0.1186 0.2070 0.0493 0.2708 0.1461 0.1086 0.0515 10.0098 0.8546 0.9703 0.9928 2.2352
7-Scenes 0.1339 0.2209 0.0549 0.2932 0.1677 0.1165 0.0566 12.8807 0.8308 0.9716 0.9948 2.7941
KITTI (eigen) 0.0605 1.1331 0.2086 2.4215 0.0921 0.0075 0.0261 8.4312 0.9602 0.9946 0.9989 2.6443
KITTI (official) 0.0629 1.1682 0.2541 2.4708 0.1021 0.0080 0.0270 9.5752 0.9581 0.9905 0.9971 1.7810

In order to evaluate the accuracy of the full pipeline (multi-view), run

python test_MaGNet.py ./test_scripts/magnet/scannet.txt
python test_MaGNet.py ./test_scripts/magnet/7scenes.txt
python test_MaGNet.py ./test_scripts/magnet/kitti_eigen.txt
python test_MaGNet.py ./test_scripts/magnet/kitti_official.txt

You should get the following results:

Dataset abs_rel abs_diff sq_rel rmse rmse_log irmse log_10 silog a1 a2 a3 NLL
ScanNet 0.0810 0.1466 0.0302 0.2098 0.1101 0.1055 0.0351 8.7686 0.9298 0.9835 0.9946 0.1454
7-Scenes 0.1257 0.2133 0.0552 0.2957 0.1639 0.1782 0.0527 13.6210 0.8552 0.9715 0.9935 1.5605
KITTI (eigen) 0.0535 0.9995 0.1623 2.1584 0.0826 0.0566 0.0235 7.4645 0.9714 0.9958 0.9990 1.8053
KITTI (official) 0.0503 0.9135 0.1667 1.9707 0.0848 0.2423 0.0219 7.9451 0.9769 0.9941 0.9979 1.4750

Training scripts

Coming soon

Citation

If you find our work useful in your research please consider citing our paper:

@InProceedings{Bae2022,
  title = {Multi-View Depth Estimation by Fusing Single-View Depth Probability with Multi-View Geometry}
  author = {Gwangbin Bae and Ignas Budvytis and Roberto Cipolla},
  booktitle = {Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2022}                         
}
Owner
Bae, Gwangbin
PhD student in Computer Vision @ University of Cambridge
Bae, Gwangbin
Dashboard for the COVID19 spread

COVID-19 Data Explorer App A streamlit Dashboard for the COVID-19 spread. The app is live at: [https://covid19.cwerner.ai]. New data is queried from G

Christian Werner 22 Sep 29, 2022
PyTorch implementation of "Supervised Contrastive Learning" (and SimCLR incidentally)

PyTorch implementation of "Supervised Contrastive Learning" (and SimCLR incidentally)

Yonglong Tian 2.2k Jan 08, 2023
Official Pytorch Implementation of 3DV2021 paper: SAFA: Structure Aware Face Animation.

SAFA: Structure Aware Face Animation (3DV2021) Official Pytorch Implementation of 3DV2021 paper: SAFA: Structure Aware Face Animation. Getting Started

QiulinW 122 Dec 23, 2022
Official project website for the CVPR 2021 paper "Exploring intermediate representation for monocular vehicle pose estimation"

EgoNet Official project website for the CVPR 2021 paper "Exploring intermediate representation for monocular vehicle pose estimation". This repo inclu

Shichao Li 138 Dec 09, 2022
K Closest Points and Maximum Clique Pruning for Efficient and Effective 3D Laser Scan Matching (To appear in RA-L 2022)

KCP The official implementation of KCP: k Closest Points and Maximum Clique Pruning for Efficient and Effective 3D Laser Scan Matching, accepted for p

Yu-Kai Lin 109 Dec 14, 2022
for taichi voxel-challange event

Taichi Voxel Challenge Figure: result of python3 example6.py. Please replace the image above (demo.jpg) with yours, so that other people can immediate

Liming Xu 20 Nov 26, 2022
YOLOv2 in PyTorch

YOLOv2 in PyTorch NOTE: This project is no longer maintained and may not compatible with the newest pytorch (after 0.4.0). This is a PyTorch implement

Long Chen 1.5k Jan 02, 2023
[ArXiv 2021] Data-Efficient Instance Generation from Instance Discrimination

InsGen - Data-Efficient Instance Generation from Instance Discrimination Data-Efficient Instance Generation from Instance Discrimination Ceyuan Yang,

GenForce: May Generative Force Be with You 93 Dec 25, 2022
Cross-modal Deep Face Normals with Deactivable Skip Connections

Cross-modal Deep Face Normals with Deactivable Skip Connections Victoria Fernández Abrevaya*, Adnane Boukhayma*, Philip H. S. Torr, Edmond Boyer (*Equ

72 Nov 27, 2022
Explaining in Style: Training a GAN to explain a classifier in StyleSpace

Explaining in Style: Official TensorFlow Colab Explaining in Style: Training a GAN to explain a classifier in StyleSpace Oran Lang, Yossi Gandelsman,

Google 197 Nov 08, 2022
GLaRA: Graph-based Labeling Rule Augmentation for Weakly Supervised Named Entity Recognition

GLaRA: Graph-based Labeling Rule Augmentation for Weakly Supervised Named Entity Recognition

Xinyan Zhao 29 Dec 26, 2022
Image data augmentation scheduler for albumentations transforms

albu_scheduler Scheduler for albumentations transforms based on PyTorch schedulers interface Usage TransformMultiStepScheduler import albumentations a

19 Aug 04, 2021
Turning SymPy expressions into JAX functions

sympy2jax Turn SymPy expressions into parametrized, differentiable, vectorizable, JAX functions. All SymPy floats become trainable input parameters. S

Miles Cranmer 38 Dec 11, 2022
Open source simulator for autonomous vehicles built on Unreal Engine / Unity, from Microsoft AI & Research

Welcome to AirSim AirSim is a simulator for drones, cars and more, built on Unreal Engine (we now also have an experimental Unity release). It is open

Microsoft 13.8k Jan 05, 2023
A Distributional Approach To Controlled Text Generation

A Distributional Approach To Controlled Text Generation This is the repository code for the ICLR 2021 paper "A Distributional Approach to Controlled T

NAVER 102 Jan 07, 2023
Flexible-CLmser: Regularized Feedback Connections for Biomedical Image Segmentation

Flexible-CLmser: Regularized Feedback Connections for Biomedical Image Segmentation The skip connections in U-Net pass features from the levels of enc

Boheng Cao 1 Dec 29, 2021
This is the official code for the paper "Ad2Attack: Adaptive Adversarial Attack for Real-Time UAV Tracking".

Ad^2Attack:Adaptive Adversarial Attack on Real-Time UAV Tracking Demo video 📹 Our video on bilibili demonstrates the test results of Ad^2Attack on se

Intelligent Vision for Robotics in Complex Environment 10 Nov 07, 2022
[CVPR2021 Oral] End-to-End Video Instance Segmentation with Transformers

VisTR: End-to-End Video Instance Segmentation with Transformers This is the official implementation of the VisTR paper: Installation We provide instru

Yuqing Wang 687 Jan 07, 2023
Alias-Free Generative Adversarial Networks (StyleGAN3) Official PyTorch implementation

Alias-Free Generative Adversarial Networks (StyleGAN3) Official PyTorch implementation

NVIDIA Research Projects 4.8k Jan 09, 2023
Automatic deep learning for image classification.

AutoDL AutoDL automates machine learning tasks enabling you to easily achieve strong predictive performance in your applications. With just a few line

wenqi 2 Oct 12, 2022