Official PyTorch implementation of RePOSE (ICCV2021)

Last update: Nov 15, 2022

Overview

RePOSE: Fast 6D Object Pose Refinement via Deep Texture Rendering (ICCV2021)

Abstract

We present RePOSE, a fast iterative refinement method for 6D object pose estimation. Prior methods perform refinement by feeding zoomed-in input and rendered RGB images into a CNN and directly regressing an update of a refined pose. Their runtime is slow due to the computational cost of the CNN, which is especially prominent in multiple-object pose refinement. To overcome this problem, RePOSE leverages image rendering for fast feature extraction using a 3D model with a learnable texture. We call this deep texture rendering, which uses a shallow multi-layer perceptron to directly regress a view-invariant image representation of an object. Furthermore, we utilize differentiable Levenberg-Marquardt (LM) optimization to refine a pose quickly and accurately by minimizing the feature-metric error between the input and rendered image representations, without the need for zooming in. These image representations are trained such that differentiable LM optimization converges within a few iterations. Consequently, RePOSE runs at 92 FPS and achieves state-of-the-art accuracy of 51.6% on the Occlusion LineMOD dataset, a 4.1% absolute improvement over the prior art, and a comparable result on the YCB-Video dataset with a much faster runtime.
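To make the Levenberg-Marquardt step mentioned in the abstract concrete, here is a minimal sketch of a generic damped Gauss-Newton loop on a toy two-parameter curve fit. It is not the RePOSE implementation: RePOSE minimizes a feature-metric error over pose parameters, whereas the model `a * exp(b * x)`, the function names, and the damping schedule below are assumptions chosen only for illustration.

```python
import math


def residuals(params, xs, ys):
    """Residuals r_i = a * exp(b * x_i) - y_i of the toy model."""
    a, b = params
    return [a * math.exp(b * x) - y for x, y in zip(xs, ys)]


def jacobian(params, xs):
    """Rows (dr_i/da, dr_i/db) of the Jacobian of the residuals."""
    a, b = params
    return [(math.exp(b * x), a * x * math.exp(b * x)) for x in xs]


def levenberg_marquardt(params, xs, ys, damping=1e-3, max_iters=100):
    """Minimize the sum of squared residuals with a basic LM loop."""
    params = list(params)
    cost = sum(r * r for r in residuals(params, xs, ys))
    for _ in range(max_iters):
        if cost < 1e-20:
            break
        r = residuals(params, xs, ys)
        J = jacobian(params, xs)
        # Damped normal equations: (J^T J + damping * I) delta = -J^T r
        h00 = sum(j0 * j0 for j0, _ in J) + damping
        h01 = sum(j0 * j1 for j0, j1 in J)
        h11 = sum(j1 * j1 for _, j1 in J) + damping
        g0 = sum(j0 * ri for (j0, _), ri in zip(J, r))
        g1 = sum(j1 * ri for (_, j1), ri in zip(J, r))
        det = h00 * h11 - h01 * h01
        d0 = (-g0 * h11 + g1 * h01) / det
        d1 = (g0 * h01 - g1 * h00) / det
        trial = [params[0] + d0, params[1] + d1]
        trial_cost = sum(t * t for t in residuals(trial, xs, ys))
        if trial_cost < cost:
            # Step helped: accept it and move toward Gauss-Newton.
            params, cost = trial, trial_cost
            damping = max(damping / 10.0, 1e-12)
        else:
            # Step hurt: reject it and move toward gradient descent.
            damping *= 10.0
    return params


# Noise-free data generated from a = 2.0, b = 0.5.
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]
a_hat, b_hat = levenberg_marquardt([1.0, 0.0], xs, ys)
print(a_hat, b_hat)
```

The damping term interpolates between Gauss-Newton (small damping, fast near the optimum) and gradient descent (large damping, safe far from it), which is why a well-conditioned objective can converge in only a few iterations.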

Prerequisites

Python >= 3.6
PyTorch == 1.9.0
Torchvision == 0.10.0
CUDA == 10.1

Downloads

Installation

Set up the Python environment:

$ pip install torch==1.9.0 torchvision==0.10.0
$ pip install Cython==0.29.17
$ sudo apt-get install libglfw3-dev libglfw3
$ pip install -r requirements.txt

# Install Differentiable Renderer
$ cd renderer
$ python3 setup.py install

Compile the CUDA extensions under lib/csrc:

$ ROOT=/path/to/RePOSE
$ cd $ROOT/lib/csrc
$ export CUDA_HOME="/usr/local/cuda-10.1"
$ cd ransac_voting
$ python setup.py build_ext --inplace
$ cd ../camera_jacobian
$ python setup.py build_ext --inplace
$ cd ../nn
$ python setup.py build_ext --inplace
$ cd ../fps
$ python setup.py build_ext --inplace

Set up datasets:

$ ROOT=/path/to/RePOSE
$ cd $ROOT/data

$ ln -s /path/to/linemod linemod
$ ln -s /path/to/linemod_orig linemod_orig
$ ln -s /path/to/occlusion_linemod occlusion_linemod

$ cd $ROOT/data/model/
$ unzip pretrained_models.zip

$ cd $ROOT/cache/LinemodTest
$ unzip '*.zip'    # one archive per category: ape.zip benchvise.zip .... phone.zip
$ cd $ROOT/cache/LinemodOccTest
$ unzip '*.zip'    # one archive per category: ape.zip can.zip .... holepuncher.zip
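Before moving on to testing, it can help to confirm that the dataset links and cache directories from the steps above are in place. The following is a small sketch, not part of the repository: the helper name `missing_paths` is hypothetical, and the path list is taken only from the setup commands in this section.

```python
import os

# Relative paths created by the dataset setup commands in this section.
EXPECTED_PATHS = [
    "data/linemod",
    "data/linemod_orig",
    "data/occlusion_linemod",
    "cache/LinemodTest",
    "cache/LinemodOccTest",
]


def missing_paths(root):
    """Return the expected paths that do not exist under root.

    os.path.exists follows symlinks, so a dangling symlink is
    reported as missing as well.
    """
    return [p for p in EXPECTED_PATHS
            if not os.path.exists(os.path.join(root, p))]


if __name__ == "__main__":
    # Assumes ROOT is exported as in the commands above; falls back to ".".
    root = os.environ.get("ROOT", ".")
    missing = missing_paths(root)
    if missing:
        print("Missing under", root + ":", ", ".join(missing))
    else:
        print("All expected dataset paths are present.")
```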

Testing

The LineMOD dataset has 13 categories (ape, benchvise, cam, can, cat, driller, duck, eggbox, glue, holepuncher, iron, lamp, phone), and the Occlusion LineMOD dataset has 8 categories (ape, can, cat, driller, duck, eggbox, glue, holepuncher). The commands below use ape; to test a different category, replace ape with its name in both cls_type and model.

Evaluate the ADD(-S) score
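For readers unfamiliar with the metric, the sketch below shows how ADD and ADD-S are commonly defined: ADD averages the distance between corresponding model points under the ground-truth and predicted poses, while ADD-S (used for symmetric objects) averages the distance to the closest transformed point instead. This is an illustrative pure-Python version with hypothetical function names, not the evaluation code used by this repository. A pose is conventionally counted as correct when the error is below 10% of the object diameter; whether this repository uses exactly that threshold is an assumption here.

```python
import math


def transform(points, rotation, translation):
    """Apply x -> R x + t to each 3D point (R is a 3x3 nested list)."""
    return [
        tuple(
            sum(rotation[i][k] * p[k] for k in range(3)) + translation[i]
            for i in range(3)
        )
        for p in points
    ]


def _dist(p, q):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def add_error(points, pose_gt, pose_pred):
    """ADD: mean distance between corresponding transformed points."""
    gt = transform(points, *pose_gt)
    pred = transform(points, *pose_pred)
    return sum(_dist(p, q) for p, q in zip(gt, pred)) / len(points)


def add_s_error(points, pose_gt, pose_pred):
    """ADD-S: mean distance to the closest transformed point."""
    gt = transform(points, *pose_gt)
    pred = transform(points, *pose_pred)
    return sum(min(_dist(p, q) for q in pred) for p in gt) / len(points)


# A two-point "object" that is symmetric under a 180-degree turn about z.
points = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
half_turn_z = [[-1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, 1.0]]
zero = (0.0, 0.0, 0.0)

print(add_error(points, (identity, zero), (half_turn_z, zero)))
print(add_s_error(points, (identity, zero), (half_turn_z, zero)))
```

The toy example illustrates why symmetric objects such as eggbox and glue are usually scored with ADD-S: a pose that differs by a symmetry of the object is penalized by ADD but not by ADD-S.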

Generate the annotation data:

$ python run.py --type linemod cls_type ape model ape

Test:

# Test on the LineMOD dataset
$ python run.py --type evaluate --cfg_file configs/linemod.yaml cls_type ape model ape

# Test on the Occlusion LineMOD dataset
$ python run.py --type evaluate --cfg_file configs/linemod.yaml test.dataset LinemodOccTest cls_type ape model ape
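To evaluate every category rather than a single one, the commands above can be generated in a loop. The sketch below only builds and prints the command lines, using the category lists and arguments shown in this section; the helper name `evaluate_command` is hypothetical, and each category still needs its annotation data generated first.

```python
LINEMOD_CATEGORIES = [
    "ape", "benchvise", "cam", "can", "cat", "driller", "duck",
    "eggbox", "glue", "holepuncher", "iron", "lamp", "phone",
]
OCCLUSION_CATEGORIES = [
    "ape", "can", "cat", "driller", "duck", "eggbox", "glue", "holepuncher",
]


def evaluate_command(category, occlusion=False):
    """Build the evaluation command for one category as an argument list."""
    allowed = OCCLUSION_CATEGORIES if occlusion else LINEMOD_CATEGORIES
    if category not in allowed:
        raise ValueError("unknown category: " + category)
    cmd = ["python", "run.py", "--type", "evaluate",
           "--cfg_file", "configs/linemod.yaml"]
    if occlusion:
        # Switch the test split to Occlusion LineMOD.
        cmd += ["test.dataset", "LinemodOccTest"]
    cmd += ["cls_type", category, "model", category]
    return cmd


if __name__ == "__main__":
    for category in LINEMOD_CATEGORIES:
        print(" ".join(evaluate_command(category)))
    for category in OCCLUSION_CATEGORIES:
        print(" ".join(evaluate_command(category, occlusion=True)))
```

Each argument list can be passed to `subprocess.run(cmd, check=True)` from the repository root to actually launch the evaluation.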

Visualization

Generate the annotation data:

$ python run.py --type linemod cls_type ape model ape

Visualize:

# Visualize the results on the LineMOD dataset
$ python run.py --type visualize --cfg_file configs/linemod.yaml cls_type ape model ape

# Visualize the results on the Occlusion LineMOD dataset
$ python run.py --type visualize --cfg_file configs/linemod.yaml test.dataset LinemodOccTest cls_type ape model ape

Citation

@InProceedings{Iwase_2021_ICCV,
    author    = {Iwase, Shun and Liu, Xingyu and Khirodkar, Rawal and Yokota, Rio and Kitani, Kris M.},
    title     = {RePOSE: Fast 6D Object Pose Refinement via Deep Texture Rendering},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {3303-3312}
}

Acknowledgement

Our code is largely based on clean-pvnet, and our rendering code is based on neural_renderer. Many thanks to the authors for making their code publicly available!

Contact

If you have any questions about the paper or the implementation, please feel free to email me ([email protected]). Thank you!

Owner

Shun Iwase