Deep Two-View Structure-from-Motion Revisited

Overview

Deep Two-View Structure-from-Motion Revisited

This repository provides the code for our CVPR 2021 paper Deep Two-View Structure-from-Motion Revisited.

We have provided the functions for training, validating, and visualization.

Note: some config flags are designed for ablation study, and we have a plan to re-org the codes later. Please feel free to submit issues if you feel confused about some parts.

Requirements

Python = 3.6.x
Pytorch >= 1.6.0
CUDA >= 10.1

and the others could be installed by

pip install -r requirements.txt

Pytorch from 1.1.0 to 1.6.0 should also work well, but it will disenable mixed precision training, and we have not tested it.

To use the RANSAC five-point algorithm, you also need to

cd RANSAC_FiveP

python setup.py install --user

The CUDA extension would be installed as 'essential_matrix'. Tested under Ubuntu and CUDA 10.1.

Models

Pretrained models are provided here.

KITTI Depth

To reproduce our results, please first download the KITTI dataset RAW data and 14GB official depth maps. You should also download the split files provided by us, and unzip them into the root of the KITTI raw data. Then, modify the gt_depth_dir (KITTI_loader.py, L278) to the address of KITTI official depth maps.

For training,

python main.py -b 32 --lr 0.0005 --nlabel 128 --fix_flownet \
--data PATH/TO/YOUR/KITTI/DATASET --cfg cfgs/kitti.yml \
--pretrained-depth depth_init.pth.tar --pretrained-flow flow_init.pth.tar

For evaluation,

python main.py -v -b 1 -p 1 --nlabel 128 \
--data PATH/TO/YOUR/KITTI/DATASET --cfg cfgs/kitti.yml \
--pretrained kitti.pth.tar"

The default evaluation split is Eigen, where the metric abs_rel should be around 0.053 and rmse should be close to 2.22. If you would like to use the Eigen SfM split, please set cfg.EIGEN_SFM = True and cfg.KITTI_697 = False.

KITTI Pose

For fair comparison, we use a KITTI odometry evaluation toolbox as provided here. Please generate poses by sequence, and evaluate the results correspondingly.

Acknowledgment:

Thanks Shihao Jiang and Dylan Campbell for sharing the implementation of the GPU-accelerated RANSAC Five-point algorithm. We really appreciate the valuable feedback from our area chairs and reviewers. We would like to thank Charles Loop for helpful discussions and Ke Chen for providing field test images from NVIDIA AV cars.

BibTex:

@article{wang2021deep,
  title={Deep Two-View Structure-from-Motion Revisited},
  author={Wang, Jianyuan and Zhong, Yiran and Dai, Yuchao and Birchfield, Stan and Zhang, Kaihao and Smolyanskiy, Nikolai and Li, Hongdong},
  journal={CVPR},
  year={2021}
}
Owner
Jianyuan Wang
Computer Vision
Jianyuan Wang
Public implementation of "Learning from Suboptimal Demonstration via Self-Supervised Reward Regression" from CoRL'21

Self-Supervised Reward Regression (SSRR) Codebase for CoRL 2021 paper "Learning from Suboptimal Demonstration via Self-Supervised Reward Regression "

19 Dec 12, 2022
PPLNN is a Primitive Library for Neural Network is a high-performance deep-learning inference engine for efficient AI inferencing

PPLNN is a Primitive Library for Neural Network is a high-performance deep-learning inference engine for efficient AI inferencing

943 Jan 07, 2023
High-Resolution Image Synthesis with Latent Diffusion Models

Latent Diffusion Models Requirements A suitable conda environment named ldm can be created and activated with: conda env create -f environment.yaml co

CompVis Heidelberg 5.6k Jan 04, 2023
Simple PyTorch implementations of Badnets on MNIST and CIFAR10.

Simple PyTorch implementations of Badnets on MNIST and CIFAR10.

Vera 75 Dec 13, 2022
A pure PyTorch batched computation implementation of "CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition"

A pure PyTorch batched computation implementation of "CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition"

張致強 14 Dec 02, 2022
Source codes for Improved Few-Shot Visual Classification (CVPR 2020), Enhancing Few-Shot Image Classification with Unlabelled Examples

Source codes for Improved Few-Shot Visual Classification (CVPR 2020), Enhancing Few-Shot Image Classification with Unlabelled Examples (WACV 2022) and Beyond Simple Meta-Learning: Multi-Purpose Model

PLAI Group at UBC 42 Dec 06, 2022
Python module providing a framework to trace individual edges in an image using Gaussian process regression.

Edge Tracing using Gaussian Process Regression Repository storing python module which implements a framework to trace individual edges in an image usi

Jamie Burke 7 Dec 27, 2022
Measuring Coding Challenge Competence With APPS

Measuring Coding Challenge Competence With APPS This is the repository for Measuring Coding Challenge Competence With APPS by Dan Hendrycks*, Steven B

Dan Hendrycks 218 Dec 27, 2022
This example implements the end-to-end MLOps process using Vertex AI platform and Smart Analytics technology capabilities

MLOps with Vertex AI This example implements the end-to-end MLOps process using Vertex AI platform and Smart Analytics technology capabilities. The ex

Google Cloud Platform 238 Dec 21, 2022
OpenAi's gym environment wrapper to vectorize them with Ray

Ray Vector Environment Wrapper You would like to use Ray to vectorize your environment but you don't want to use RLLib ? You came to the right place !

Pierre TASSEL 15 Nov 10, 2022
The official repository for our paper "The Neural Data Router: Adaptive Control Flow in Transformers Improves Systematic Generalization".

Codebase for learning control flow in transformers The official repository for our paper "The Neural Data Router: Adaptive Control Flow in Transformer

Csordás Róbert 24 Oct 15, 2022
一个多语言支持、易使用的 OCR 项目。An easy-to-use OCR project with multilingual support.

AgentOCR 简介 AgentOCR 是一个基于 PaddleOCR 和 ONNXRuntime 项目开发的一个使用简单、调用方便的 OCR 项目 本项目目前包含 Python Package 【AgentOCR】 和 OCR 标注软件 【AgentOCRLabeling】 使用指南 Pytho

AgentMaker 98 Nov 10, 2022
Detecting drunk people through thermal images using Deep Learning (CNN)

Drunk Detection CNN Detecting drunk people through thermal images using Deep Learning (CNN) Dataset We used thermal images provided by Electronics Lab

Giacomo Ferretti 3 Oct 27, 2022
LoL Runes Recommender With Python

LoL-Runes-Recommender Para ejecutar la aplicación se debe llamar a execute_app.p

Sebastián Salinas 1 Jan 10, 2022
[ICCV2021] Safety-aware Motion Prediction with Unseen Vehicles for Autonomous Driving

Safety-aware Motion Prediction with Unseen Vehicles for Autonomous Driving Safety-aware Motion Prediction with Unseen Vehicles for Autonomous Driving

Xuanchi Ren 44 Dec 03, 2022
COLMAP - Structure-from-Motion and Multi-View Stereo

COLMAP About COLMAP is a general-purpose Structure-from-Motion (SfM) and Multi-View Stereo (MVS) pipeline with a graphical and command-line interface.

4.7k Jan 07, 2023
CAR-API: Cityscapes Attributes Recognition API

CAR-API: Cityscapes Attributes Recognition API This is the official api to download and fetch attributes annotations for Cityscapes Dataset. Content I

Kareem Metwaly 5 Dec 22, 2022
A scanpy extension to analyse single-cell TCR and BCR data.

Scirpy: A Scanpy extension for analyzing single-cell immune-cell receptor sequencing data Scirpy is a scalable python-toolkit to analyse T cell recept

ICBI 145 Jan 03, 2023
SEAN: Image Synthesis with Semantic Region-Adaptive Normalization (CVPR 2020, Oral)

SEAN: Image Synthesis with Semantic Region-Adaptive Normalization (CVPR 2020 Oral) Figure: Face image editing controlled via style images and segmenta

Peihao Zhu 579 Dec 30, 2022
MAGMA - a GPT-style multimodal model that can understand any combination of images and language

MAGMA -- Multimodal Augmentation of Generative Models through Adapter-based Finetuning Authors repo (alphabetical) Constantin (CoEich), Mayukh (Mayukh

Aleph Alpha GmbH 331 Jan 03, 2023