(CVPR 2022 - oral) Multi-View Depth Estimation by Fusing Single-View Depth Probability with Multi-View Geometry

Overview

Multi-View Depth Estimation by Fusing Single-View Depth Probability with Multi-View Geometry

Official implementation of the paper

Multi-View Depth Estimation by Fusing Single-View Depth Probability with Multi-View Geometry

CVPR 2022 [oral]

Gwangbin Bae, Ignas Budvytis, and Roberto Cipolla

[arXiv]

We present MaGNet (Monocular and Geometric Network), a novel framework for fusing single-view depth probability with multi-view geometry, to improve the accuracy, robustness and efficiency of multi-view depth estimation. For each frame, MaGNet estimates a single-view depth probability distribution, parameterized as a pixel-wise Gaussian. The distribution estimated for the reference frame is then used to sample per-pixel depth candidates. Such probabilistic sampling enables the network to achieve higher accuracy while evaluating fewer depth candidates. We also propose depth consistency weighting for the multi-view matching score, to ensure that the multi-view depth is consistent with the single-view predictions. The proposed method achieves state-of-the-art performance on ScanNet, 7-Scenes and KITTI. Qualitative evaluation demonstrates that our method is more robust against challenging artifacts such as texture-less/reflective surfaces and moving objects.

Datasets

We evaluated MaGNet on ScanNet, 7-Scenes and KITTI

ScanNet

  • In order to download ScanNet, you should submit an agreement to the Terms of Use. Please follow the instructions in this link.
  • The folder should be organized as

/path/to/ScanNet
/path/to/ScanNet/scans
/path/to/ScanNet/scans/scene0000_00 ...
/path/to/ScanNet/scans_test
/path/to/ScanNet/scans_test/scene0707_00 ...

7-Scenes

  • Download all seven scenes (Chess, Fire, Heads, Office, Pumpkin, RedKitchen, Stairs) from this link.
  • The folder should be organized as:

/path/to/SevenScenes
/path/to/SevenScenes/chess ...

KITTI

  • Download raw data from this link.
  • Download depth maps from this link
  • The folder should be organized as:

/path/to/KITTI
/path/to/KITTI/rawdata
/path/to/KITTI/rawdata/2011_09_26 ...
/path/to/KITTI/train
/path/to/KITTI/train/2011_09_26_drive_0001_sync ...
/path/to/KITTI/val
/path/to/KITTI/val/2011_09_26_drive_0002_sync ...

Download model weights

Download model weights by

python ckpts/download.py

If some files are not downloaded properly, download them manually from this link and place the files under ./ckpts.

Install dependencies

We recommend using a virtual environment.

python3.6 -m venv --system-site-packages ./venv
source ./venv/bin/activate

Install the necessary dependencies by

python3.6 -m pip install -r requirements.txt

Test scripts

If you wish to evaluate the accuracy of our D-Net (single-view), run

python test_DNet.py ./test_scripts/dnet/scannet.txt
python test_DNet.py ./test_scripts/dnet/7scenes.txt
python test_DNet.py ./test_scripts/dnet/kitti_eigen.txt
python test_DNet.py ./test_scripts/dnet/kitti_official.txt

You should get the following results:

Dataset abs_rel abs_diff sq_rel rmse rmse_log irmse log_10 silog a1 a2 a3 NLL
ScanNet 0.1186 0.2070 0.0493 0.2708 0.1461 0.1086 0.0515 10.0098 0.8546 0.9703 0.9928 2.2352
7-Scenes 0.1339 0.2209 0.0549 0.2932 0.1677 0.1165 0.0566 12.8807 0.8308 0.9716 0.9948 2.7941
KITTI (eigen) 0.0605 1.1331 0.2086 2.4215 0.0921 0.0075 0.0261 8.4312 0.9602 0.9946 0.9989 2.6443
KITTI (official) 0.0629 1.1682 0.2541 2.4708 0.1021 0.0080 0.0270 9.5752 0.9581 0.9905 0.9971 1.7810

In order to evaluate the accuracy of the full pipeline (multi-view), run

python test_MaGNet.py ./test_scripts/magnet/scannet.txt
python test_MaGNet.py ./test_scripts/magnet/7scenes.txt
python test_MaGNet.py ./test_scripts/magnet/kitti_eigen.txt
python test_MaGNet.py ./test_scripts/magnet/kitti_official.txt

You should get the following results:

Dataset abs_rel abs_diff sq_rel rmse rmse_log irmse log_10 silog a1 a2 a3 NLL
ScanNet 0.0810 0.1466 0.0302 0.2098 0.1101 0.1055 0.0351 8.7686 0.9298 0.9835 0.9946 0.1454
7-Scenes 0.1257 0.2133 0.0552 0.2957 0.1639 0.1782 0.0527 13.6210 0.8552 0.9715 0.9935 1.5605
KITTI (eigen) 0.0535 0.9995 0.1623 2.1584 0.0826 0.0566 0.0235 7.4645 0.9714 0.9958 0.9990 1.8053
KITTI (official) 0.0503 0.9135 0.1667 1.9707 0.0848 0.2423 0.0219 7.9451 0.9769 0.9941 0.9979 1.4750

Training scripts

Coming soon

Citation

If you find our work useful in your research please consider citing our paper:

@InProceedings{Bae2022,
  title = {Multi-View Depth Estimation by Fusing Single-View Depth Probability with Multi-View Geometry}
  author = {Gwangbin Bae and Ignas Budvytis and Roberto Cipolla},
  booktitle = {Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2022}                         
}
Owner
Bae, Gwangbin
PhD student in Computer Vision @ University of Cambridge
Bae, Gwangbin
Prososdy Morph: A python library for manipulating pitch and duration in an algorithmic way, for resynthesizing speech.

ProMo (Prosody Morph) Questions? Comments? Feedback? Chat with us on gitter! A library for manipulating pitch and duration in an algorithmic way, for

Tim 71 Jan 02, 2023
Problem-943.-ACMP - Problem 943. ACMP

Problem-943.-ACMP В "main.py" расположен вариант моего решения задачи 943 с серв

Konstantin Dyomshin 2 Aug 19, 2022
YOLOv4-v3 Training Automation API for Linux

This repository allows you to get started with training a state-of-the-art Deep Learning model with little to no configuration needed! You provide your labeled dataset or label your dataset using our

BMW TechOffice MUNICH 626 Dec 31, 2022
This is a GUI interface which can process forest fire detection, smoke detection and fire segmentation

This is a GUI interface which can process forest fire detection, smoke detection and fire segmentation. Yolov5 is used to detect fire and smoke and unet is used to segment fire.

7 Jan 08, 2023
Namish Khanna 40 Oct 11, 2022
A Japanese Medical Information Extraction Toolkit

JaMIE: a Japanese Medical Information Extraction toolkit Joint Japanese Medical Problem, Modality and Relation Recognition The Train/Test phrases requ

7 Dec 12, 2022
Data Augmentation Using Keras and Python

Data-Augmentation-Using-Keras-and-Python Data augmentation is the process of increasing the number of training dataset. Keras library offers a simple

Happy N. Monday 3 Feb 15, 2022
PyTorch implementation of the REMIND method from our ECCV-2020 paper "REMIND Your Neural Network to Prevent Catastrophic Forgetting"

REMIND Your Neural Network to Prevent Catastrophic Forgetting This is a PyTorch implementation of the REMIND algorithm from our ECCV-2020 paper. An ar

Tyler Hayes 72 Nov 27, 2022
Code for Towards Streaming Perception (ECCV 2020) :car:

sAP — Code for Towards Streaming Perception ECCV Best Paper Honorable Mention Award Feb 2021: Announcing the Streaming Perception Challenge (CVPR 2021

Martin Li 85 Dec 22, 2022
[PNAS2021] The neural architecture of language: Integrative modeling converges on predictive processing

The neural architecture of language: Integrative modeling converges on predictive processing Code accompanying the paper The neural architecture of la

Martin Schrimpf 36 Dec 01, 2022
Gluon CV Toolkit

Gluon CV Toolkit | Installation | Documentation | Tutorials | GluonCV provides implementations of the state-of-the-art (SOTA) deep learning models in

Distributed (Deep) Machine Learning Community 5.4k Jan 06, 2023
Approaches to modeling terrain and maps in python

topography 🌎 Contains different approaches to modeling terrain and topographic-style maps in python Features Inverse Distance Weighting (IDW) A given

John Gutierrez 1 Aug 10, 2022
Testing the Facial Emotion Recognition (FER) algorithm on animations

PegHeads-Tutorial-3 Testing the Facial Emotion Recognition (FER) algorithm on animations

PegHeads Inc 2 Jan 03, 2022
Cross-media Structured Common Space for Multimedia Event Extraction (ACL2020)

Cross-media Structured Common Space for Multimedia Event Extraction Table of Contents Overview Requirements Data Quickstart Citation Overview The code

Manling Li 49 Nov 21, 2022
MLOps will help you to understand how to build a Continuous Integration and Continuous Delivery pipeline for an ML/AI project.

page_type languages products description sample python azure azure-machine-learning-service azure-devops Code which demonstrates how to set up and ope

1 Nov 01, 2021
Code release for "MERLOT Reserve: Neural Script Knowledge through Vision and Language and Sound"

merlot_reserve Code release for "MERLOT Reserve: Neural Script Knowledge through Vision and Language and Sound" MERLOT Reserve (in submission) is a mo

Rowan Zellers 92 Dec 11, 2022
PyTorch implementation of Decoupling Value and Policy for Generalization in Reinforcement Learning

PyTorch implementation of Decoupling Value and Policy for Generalization in Reinforcement Learning

48 Dec 08, 2022
Process text, including tokenizing and representing sentences as vectors and Applying some concepts like RNN, LSTM and GRU to create a classifier can detect the language in which a sentence is written from among 17 languages.

Language Identifier What is this ? The goal of this project is to create a model that is able to predict a given sentence language through text proces

Hossam Asaad 9 Dec 15, 2022
《Where am I looking at? Joint Location and Orientation Estimation by Cross-View Matching》(CVPR 2020)

This contains the codes for cross-view geo-localization method described in: Where am I looking at? Joint Location and Orientation Estimation by Cross-View Matching, CVPR2020.

41 Oct 27, 2022
A user-friendly research and development tool built to standardize RL competency assessment for custom agents and environments.

Built with ❤️ by Sam Showalter Contents Overview Installation Dependencies Usage Scripts Standard Execution Environment Development Environment Benchm

SRI-AIC 1 Nov 18, 2021