The Surprising Effectiveness of Visual Odometry Techniques for Embodied PointGoal Navigation

Last update: Dec 15, 2022

Related tags

Deep Learning PointNav-VO

Overview

PointNav-VO

The Surprising Effectiveness of Visual Odometry Techniques for Embodied PointGoal Navigation

Project Page | Paper

Setup
Reproduction
Plug-and-play
Train
Citation

Setup

Install Dependencies

conda env create -f environment.yml

Install Habitat

The repo is tested under the following commits of habitat-lab and habitat-sim.

habitat-lab == d0db1b55be57abbacc5563dca2ca14654c545552
habitat-sim == 020041d75eaf3c70378a9ed0774b5c67b9d3ce99

Note, to align with Habitat Challenge 2020 settings (see Step 36 in the Dockerfile), when installing habitat-sim, we compiled without CUDA support as

python setup.py install --headless

There was a discrepancy between noises models in CPU and CPU versions which has now been fixed, see this issue. Therefore, to reproduce the results in the paper with our pre-trained weights, you need to use noises model of CPU-version.

Download Data

We need two datasets to enable running of this repo:

Gibson scene dataset
PointGoal Navigation splits, we need pointnav_gibson_v2.zip.

Please follow Habitat's instruction to download them. We assume all data is put under ./dataset with structure:

.
+-- dataset
|  +-- Gibson
|  |  +-- gibson
|  |  |  +-- Adrian.glb
|  |  |  +-- Adrian.navmesh
|  |  |  ...
|  +-- habitat_datasets
|  |  +-- pointnav
|  |  |  +-- gibson
|  |  |  |  +-- v2
|  |  |  |  |  +-- train
|  |  |  |  |  +-- val
|  |  |  |  |  +-- valmini

Reproduce

Download pretrained checkpoints of RL navigation policy and VO from this link. Put them under pretrained_ckpts with the following structure:

.
+-- pretrained_ckpts
|  +-- rl
|  |  +-- no_tune
|  |  |  +-- rl_no_tune.pth
|  |  +-- tune_vo
|  |  |  +-- rl_tune_vo.pth
|  +-- vo
|  |  +-- act_forward.pth
|  |  +-- act_left_right_inv_joint.pth

Run the following command to reproduce navigation results. On Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz and a Nvidia GeForce GTX 1080 Ti, it takes around 4.5 hours to complete evaluation on all 994 episodes with navigation policy tuned with VO.

cd /path/to/this/repo
export POINTNAV_VO_ROOT=$PWD

export NUMBA_NUM_THREADS=1 && \
export NUMBA_THREADING_LAYER=workqueue && \
conda activate pointnav-vo && \
python ${POINTNAV_VO_ROOT}/launch.py \
--repo-path ${POINTNAV_VO_ROOT} \
--n_gpus 1 \
--task-type rl \
--noise 1 \
--run-type eval \
--addr 127.0.1.1 \
--port 8338

Use VO as a Drop-in Module

We provide a class BaseRLTrainerWithVO that contains all necessary functions to compute odometry in base_trainer_with_vo.py. Specifically, you can use _compute_local_delta_states_from_vo to compute odometry based on adjacent observations. The code sturcture will be something like:

local_delta_states = _compute_local_delta_states_from_vo(prev_obs, cur_obs, action)
cur_goal = compute_goal_pos(prev_goal, local_delta_states)

To get more sense about how to use this function, please refer to challenge2020_agent.py, which is the agent we used in HabitatChallenge 2020.

Train Your Own VO

See details in TRAIN.md

Citation

Please cite the following papers if you found our model useful. Thanks!

Xiaoming Zhao, Harsh Agrawal, Dhruv Batra, and Alexander Schwing. The Surprising Effectiveness of Visual Odometry Techniques for Embodied PointGoal Navigation. ICCV 2021.

@inproceedings{ZhaoICCV2021,
  title={{The Surprising Effectiveness of Visual Odometry Techniques for Embodied PointGoal Navigation}},
  author={Xiaoming Zhao and Harsh Agrawal and Dhruv Batra and Alexander Schwing},
  booktitle={Proc. ICCV},
  year={2021},
}

The Surprising Effectiveness of Visual Odometry Techniques for Embodied PointGoal Navigation

Related tags

Overview

PointNav-VO

Table of Contents

Setup

Install Dependencies

Install Habitat

Download Data

Reproduce

Use VO as a Drop-in Module

Train Your Own VO

Citation

Owner

Xiaoming Zhao

(Personalized) Page-Rank computation using PyTorch

Implementation of Neural Distance Embeddings for Biological Sequences (NeuroSEED) in PyTorch

[CVPR'21 Oral] Seeing Out of tHe bOx: End-to-End Pre-training for Vision-Language Representation Learning

Unoffical reMarkable AddOn for Firefox.

An original implementation of "MetaICL Learning to Learn In Context" by Sewon Min, Mike Lewis, Luke Zettlemoyer and Hannaneh Hajishirzi

Tensorflow implementation of Character-Aware Neural Language Models.

A public available dataset for road boundary detection in aerial images

This repo implements several applications of the proposed generalized Bures-Wasserstein (GBW) geometry on symmetric positive definite matrices.

Code for Transformers Solve Limited Receptive Field for Monocular Depth Prediction

The code is an implementation of Feedback Convolutional Neural Network for Visual Localization and Segmentation.

This is a template for the Non-autoregressive Deep Learning-Based TTS model (in PyTorch).

tensorrt int8 量化yolov5 4.0 onnx模型

Repository for "Improving evidential deep learning via multi-task learning," published in AAAI2022

YouRefIt: Embodied Reference Understanding with Language and Gesture

Translate darknet to tensorflow. Load trained weights, retrain/fine-tune using tensorflow, export constant graph def to mobile devices

Official repository for the ICLR 2021 paper Evaluating the Disentanglement of Deep Generative Models with Manifold Topology

This repository consists of Blender python scripts and corresponding assets to generate variants of the CANDLE dataset

Distributing reference energies for SMIRNOFF implementations

Distinguishing Commercial from Editorial Content in News

[ACM MM 2021] TSA-Net: Tube Self-Attention Network for Action Quality Assessment