On the Limits of Pseudo Ground Truth in Visual Camera Re-Localization

Overview

On the Limits of Pseudo Ground Truth in Visual Camera Re-Localization

This repository contains the evaluation code and alternative pseudo ground truth poses as used in our ICCV 2021 paper.

video overview

Pseudo Ground Truth for 7Scenes and 12Scenes

We generated alternative SfM-based pseudo ground truth (pGT) using Colmap to supplement the original D-SLAM-based pseudo ground truth of 7Scenes and 12Scenes.

Pose Files

Please find our SfM pose files in the folder pgt. We separated pGT files wrt datasets, individual scenes and the test/training split. Each file contains one line per image that follows the format:

rgb_file qw qx qy qz tx ty tz f

Entries q and t represent the pose as quaternion and translation vector. The pose maps world coordinates to camera coordinates, i.e. p_cam = R(q) p_world + t. This is the same convention used by Colmap. Entry f represents the focal length of the RGB sensor. f was re-estimated by COLMAP and can differ slightly per scene.

We also provide the original D-SLAM pseudo ground truth in this format to be used with our evaluation code below.

Full Reconstructions

The Colmap 3D models are available here:

Note that the Google Drive folder that currently hosts the reconstructions has a daily download limit. We are currently looking into alternative hosting options.

License Information

Since the 3D models and pose files are derived from the original datasets, they are released under the same licences as the 7Scenes and 12Scenes datasets. Before using the datasets, please check the licenses (see the websites of the datasets or the README.md files that come with the 3D models).

Evaluation Code

The main results of our paper can be reproduced using evaluate_estimates.py. The script calculates either the pose error (max of rotation and translation error) or the DCRE error (dense reprojection error). The script prints the recall at a custom threshold to the console, and produces a cumulative error plot as a PDF file.

As input, the script expects a configuration file that points to estimated poses of potentially multiple algorithms and to the pseudo ground truth that these estimates should be compared to. We provide estimated poses of all methods shown in our paper (ActiveSearch, HLoc, R2D2 and DSAC*) in the folder estimates.
These pose files follow the same format as our pGT files described previously, but omit the final f entry.

Furthermore, we provide example config files corresponding to the main experiments in our paper.

Call python evaluate_estimates.py --help for all available options.

For evaluation on 7Scenes, using our SfM pGT, call:

python evaluate_estimates.py config_7scenes_sfm_pgt.json

This produces a new file config_7scenes_sfm_pgt_pose_err.pdf:

For the corresponding plot using the original D-SLAM pGT, call:

python evaluate_estimates.py config_7scenes_dslam_pgt.json

Interpreting the Results

The plots above show very different rankings across methods. Yet, as we discuss in our paper, both plots are valid since no version of the pGT is clearly superior to the other. Furthermore, it appears plausible that any version of pGT is only trustworthy up to a certain accuracy threshold. However, it is non-obvious and currently unknown, how to determine such a trust threshold. We thus strongly discourage to draw any conclusions (beyond that a method might be overfitting to the imperfections of the pseudo ground truth) from the smaller thresholds alone.

We advise to always evaluate methods under both versions of the pGT, and to show both evaluation results in juxtaposition unless specific reasons are given why one version of the pGT is preferred.

DCRE Computation

DCRE computation is triggered with the option --error_type dcre_max or --error_type dcre_mean (see our paper for details). DCRE needs access to the original 7Scenes or 12Scenes data as it requires depth maps. We provide two utility scripts, setup_7scenes.py and setup_12scenes.py, that will download and unpack the associated datasets. Make sure to check each datasets license, via the links above, before downloading and using them.

Note I: The original depth files of 7Scenes are not calibrated, but the DCRE requires calibrated files. The setup script will apply the Kinect calibration parameters found here to register depth to RGB. This essentially involves re-rendering the depth maps which is implemented in native Python and takes a long time due to the large frame count in 7Scenes (several hours). However, this step has to be done only once.

Note II: The DCRE computation by evaluate_estimates.py is implemented on the GPU and reasonably fast. However, due to the large frame count in 7Scenes it can still take considerable time. The parameter --error_max_images limits the max. number of frames used to calculate recall and cumulative errors. The default value of 1000 provides a good tradeoff between accuracy and speed. Use --error_max_images -1 to use all images which is most accurate but slow for 7Scenes.

Uploading Your Method's Estimates

We are happy to include updated evaluation results or evaluation results of new methods in this repository. This would enable easy comparisons across methods with unified evaluation code, as we progress in the field.

If you want your results included, please provide estimates of your method under both pGT versions via a pull request. Please add your estimation files to a custom sub-folder under èstimates_external, following our pose file convention described above. We would also ask that you provide a text file that links your results to a publication or tech report, or contains a description of how you obtained these results.

estimates_external
├── someone_elses_method
└── your_method
    ├── info_your_method.txt
    ├── dslam
    │   ├── 7scenes
    │   │   ├── chess_your_method.txt
    │   │   ├── fire_your_method.txt
    │   │   ├── ...
    │   └── 12scenes
    │       ├── ...
    └── sfm
        ├── ...

Dependencies

This code requires the following python packages, and we tested it with the package versions in brackets

pytorch (1.6.0)
opencv (3.4.2)
scikit-image (0.16.2)

The repository contains an environment.yml for the use with Conda:

conda env create -f environment.yml
conda activate pgt

License Information

Our evaluation code and data utility scripts are based on parts of DSAC*, and we provide our code under the same BSD-3 license.

Citation

If you are using either the evaluation code or the Structure-from-Motion pseudo GT for the 7Scenes or 12Scenes datasets, please cite the following work:

@InProceedings{Brachmann2021ICCV,
    author = {Brachmann, Eric and Humenberger, Martin and Rother, Carsten and Sattler, Torsten},
    title = {{On the Limits of Pseudo Ground Truth in Visual Camera Re-Localization}},
    booktitle = {International Conference on Computer Vision (ICCV)},
    year = {2021},
}
Owner
Torsten Sattler
I am a senior researcher at CIIRC, the Czech Institute of Informatics, Robotics and Cybernetics, building my own research group.
Torsten Sattler
The official implementation of A Unified Game-Theoretic Interpretation of Adversarial Robustness.

This repository is the official implementation of A Unified Game-Theoretic Interpretation of Adversarial Robustness. Requirements pip install -r requi

Jie Ren 17 Dec 12, 2022
A real-time approach for mapping all human pixels of 2D RGB images to a 3D surface-based model of the body

DensePose: Dense Human Pose Estimation In The Wild Rıza Alp Güler, Natalia Neverova, Iasonas Kokkinos [densepose.org] [arXiv] [BibTeX] Dense human pos

Meta Research 6.4k Jan 01, 2023
Pun Detection and Location

Pun Detection and Location “The Boating Store Had Its Best Sail Ever”: Pronunciation-attentive Contextualized Pun Recognition Yichao Zhou, Jyun-yu Jia

lawson 3 May 13, 2022
This repository contains the implementation of the HealthGen model, a generative model to synthesize realistic EHR time series data with missingness

HealthGen: Conditional EHR Time Series Generation This repository contains the implementation of the HealthGen model, a generative model to synthesize

0 Jan 20, 2022
PaddlePaddle GAN library, including lots of interesting applications like First-Order motion transfer, wav2lip, picture repair, image editing, photo2cartoon, image style transfer, and so on.

English | 简体中文 PaddleGAN PaddleGAN provides developers with high-performance implementation of classic and SOTA Generative Adversarial Networks, and s

6.4k Jan 09, 2023
Tensorflow Implementation of the paper "Spectral Normalization for Generative Adversarial Networks" (ICML 2017 workshop)

tf-SNDCGAN Tensorflow implementation of the paper "Spectral Normalization for Generative Adversarial Networks" (https://www.researchgate.net/publicati

Nhat M. Nguyen 248 Nov 25, 2022
Final report with code for KAIST Course KSE 801.

Orthogonal collocation is a method for the numerical solution of partial differential equations

Chuanbo HUA 4 Apr 06, 2022
Causal Imitative Model for Autonomous Driving

Causal Imitative Model for Autonomous Driving Mohammad Reza Samsami, Mohammadhossein Bahari, Saber Salehkaleybar, Alexandre Alahi. arXiv 2021. [Projec

VITA lab at EPFL 8 Oct 04, 2022
Torch-based tool for quantizing high-dimensional vectors using additive codebooks

Trainable multi-codebook quantization This repository implements a utility for use with PyTorch, and ideally GPUs, for training an efficient quantizer

Daniel Povey 41 Jan 07, 2023
TensorFlow implementation of the algorithm in the paper "Decoupled Low-light Image Enhancement"

Decoupled Low-light Image Enhancement Shijie Hao1,2*, Xu Han1,2, Yanrong Guo1,2 & Meng Wang1,2 1Key Laboratory of Knowledge Engineering with Big Data

17 Apr 25, 2022
Unsupervised Semantic Segmentation by Contrasting Object Mask Proposals.

Unsupervised Semantic Segmentation by Contrasting Object Mask Proposals This repo contains the Pytorch implementation of our paper: Unsupervised Seman

Wouter Van Gansbeke 335 Dec 28, 2022
Charsiu: A transformer-based phonetic aligner

Charsiu: A transformer-based phonetic aligner [arXiv] Note. This is a preview version. The aligner is under active development. New functions, new lan

jzhu 166 Dec 09, 2022
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

Salesforce 1.3k Dec 31, 2022
Code implementing "Improving Deep Learning Interpretability by Saliency Guided Training"

Saliency Guided Training Code implementing "Improving Deep Learning Interpretability by Saliency Guided Training" by Aya Abdelsalam Ismail, Hector Cor

8 Sep 22, 2022
Repo for 2021 SDD assessment task 2, by Felix, Anna, and James.

SoftwareTask2 Repo for 2021 SDD assessment task 2, by Felix, Anna, and James. File/folder structure: helloworld.py - demonstrates various map backgrou

3 Dec 13, 2022
STMTrack: Template-free Visual Tracking with Space-time Memory Networks

STMTrack This is the official implementation of the paper: STMTrack: Template-free Visual Tracking with Space-time Memory Networks. Setup Prepare Anac

Zhihong Fu 62 Dec 21, 2022
Supporting code for short YouTube series Neural Networks Demystified.

Neural Networks Demystified Supporting iPython notebooks for the YouTube Series Neural Networks Demystified. I've included formulas, code, and the tex

Stephen 1.3k Dec 23, 2022
[SIGMETRICS 2022] One Proxy Device Is Enough for Hardware-Aware Neural Architecture Search

One Proxy Device Is Enough for Hardware-Aware Neural Architecture Search paper | website One Proxy Device Is Enough for Hardware-Aware Neural Architec

10 Dec 16, 2022
[ICLR 2022 Oral] F8Net: Fixed-Point 8-bit Only Multiplication for Network Quantization

F8Net Fixed-Point 8-bit Only Multiplication for Network Quantization (ICLR 2022 Oral) OpenReview | arXiv | PDF | Model Zoo | BibTex PyTorch implementa

Snap Research 76 Dec 13, 2022
RoboDesk A Multi-Task Reinforcement Learning Benchmark

RoboDesk A Multi-Task Reinforcement Learning Benchmark If you find this open source release useful, please reference in your paper: @misc{kannan2021ro

Google Research 66 Oct 07, 2022