SporeAgent: Reinforced Scene-level Plausibility for Object Pose Refinement

Overview

SporeAgent: Reinforced Scene-level Plausibility for Object Pose Refinement

This repository implements the approach described in SporeAgent: Reinforced Scene-level Plausibility for Object Pose Refinement (WACV 2022).

Iterative refinement using SporeAgent

Iterative registration using SporeAgent:
The initial pose from PoseCNN (purple) and the final pose using SporeAgent (blue) on the LINEMOD (left,cropped) and YCB-Video (right) datasets.

Scene-level Plausibility

Scene-level Plausibility:
The initial scene configuration from PoseCNN (left) results in an implausible pose of the target object (gray). Refinement using SporeAgent (right) results in a plausible scene configuration where the intersecting points (red) are resolved and the object rests on its supported points (cyan).

LINEMOD AD < 0.10d AD < 0.05d AD <0.02d YCB-Video ADD AUC AD AUC ADI AUC
PoseCNN 62.7 26.9 3.3 51.5 61.3 75.2
Point-to-Plane ICP 92.6 79.8 29.9 68.2 79.2 87.8
w/ VeREFINE 96.1 85.8 32.5 70.1 81.0 88.8
Multi-hypothesis ICP 99.3 89.9 35.6 77.4 86.6 92.6
SporeAgent 99.3 93.7 50.3 79.0 88.8 93.6

Comparison on LINEMOD and YCB-Video:
The initial pose and segmentation estimates are computed using PoseCNN. We compare our approach to vanilla Point-to-Plane ICP (from Open3D), Point-to-Plane ICP augmented by the simulation-based VeREFINE approach and the ICP-based multi-hypothesis approach used for refinement in PoseCNN.

Dependencies

The code has been tested on Ubuntu 16.04 and 20.04 with Python 3.6 and CUDA 10.2. To set-up the Python environment, use Anaconda and the provided YAML file:

conda env create -f environment.yml --name sporeagent

conda activate sporeagent.

The BOP Toolkit is additionally required. The BOP_PATH in config.py needs to be changed to the respective clone directory and the packages required by the BOP Toolkit need to be installed.

The YCB-Video Toolbox is required for experiments on the YCB-Video dataset.

Datasets

We use the dataset versions prepared for the BOP challenge. The required files can be downloaded to a directory of your choice using the following bash script:

export SRC=http://ptak.felk.cvut.cz/6DB/public/bop_datasets
export DATASET=ycbv                     # either "lm" or "ycbv"
wget $SRC/$DATASET_base.zip             # Base archive with dataset info, camera parameters, etc.
wget $SRC/$DATASET_models.zip           # 3D object models.
wget $SRC/$DATASET_test_all.zip         # All test images.
unzip $DATASET_base.zip                 # Contains folder DATASET.
unzip $DATASET_models.zip -d $DATASET   # Unpacks to DATASET.
unzip $DATASET_test_all.zip -d $DATASET # Unpacks to DATASET.

For training on YCB-Video, the $DATASET_train_real.zip is moreover required.

In addition, we have prepared point clouds sampled within the ground-truth masks (for training) and the segmentation masks computed using PoseCNN (for evaluation) for the LINEMOD and YCB-Video dataset. The samples for evaluation also include the initial pose estimates from PoseCNN.

LINEMOD

Extract the prepared samples into PATH_TO_BOP_LM/sporeagent/ and set LM_PATH in config.py to the base directory, i.e., PATH_TO_BOP_LM. Download the PoseCNN results and the corresponding image set definitions provided with DeepIM and extract both into POSECNN_LM_RESULTS_PATH. Finally, since the BOP challenge uses a different train/test split than the compared methods, the appropriate target file found here needs to be placed in the PATH_TO_BOP_LM directory.

To compute the AD scores using the BOP Toolkit, BOP_PATH/scripts/eval_bop19.py needs to be adapted:

  • to use ADI for symmetric objects and ADD otherwise with a 2/5/10% threshold, change p['errors'] to
{
  'n_top': -1,
  'type': 'ad',
  'correct_th': [[0.02], [0.05], [0.1]]
}
  • to use the correct test targets, change p['targets_filename'] to 'test_targets_add.json'

YCB-Video

Extract the prepared samples into PATH_TO_BOP_YCBV/reagent/ and set YCBV_PATH in config.py to the base directory, i.e., PATH_TO_BOP_YCBV. Clone the YCB Video Toolbox to POSECNN_YCBV_RESULTS_PATH. Extract the results_PoseCNN_RSS2018.zip and copy test_data_list.txt to the same directory. The POSECNN_YCBV_RESULTS_PATH in config.py needs to be changed to the respective directory. Additionally, place the meshes in the canonical frame models_eval_canonical in the PATH_TO_BOP_YCBV directory.

To compute the ADD/AD/ADI AUC scores using the YCB-Video Toolbox, replace the respective files in the toolbox by the ones provided in sporeagent/ycbv_toolbox.

Pretrained models

Weights for both datasets can be found here. Download and copy them to sporeagent/weights/.

Training

For LINEMOD: python registration/train.py --dataset=lm

For YCB-Video: python registration/train.py --dataset=ycbv

Evaluation

Note that we precompute the normal images used for pose scoring on the first run and store them to disk.

LINEMOD

The results for LINEMOD are computed using the BOP Toolkit. The evaluation script exports the required file by running

python registration/eval.py --dataset=lm,

which can then be processed via

python BOP_PATH/scripts/eval_bop19.py --result_filenames=PATH_TO_CSV_WITH_RESULTS.

YCB-Video

The results for YCB-Video are computed using the YCB-Video Toolbox. The evaluation script exports the results in BOP format by running

python registration/eval.py --dataset=ycbv,

which can then be parsed into the format used by the YCB-Video Toolbox by running

python utility/parse_matlab.py.

In MATLAB, run evaluate_poses_keyframe.m to generate the per-sample results and plot_accuracy_keyframe.m to compute the statistics.

Citation

If you use this repository in your publications, please cite

@article{bauer2022sporeagent,
    title={SporeAgent: Reinforced Scene-level Plausibility for Object Pose Refinement},
    author={Bauer, Dominik and Patten, Timothy and Vincze, Markus},
    booktitle={IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    year={2022},
    pages={654-662}
}
Owner
Dominik Bauer
Dominik Bauer
A project that uses optical flow and machine learning to detect aimhacking in video clips.

waldo-anticheat A project that aims to use optical flow and machine learning to visually detect cheating or hacking in video clips from fps games. Che

waldo.vision 542 Dec 03, 2022
GNN4Traffic - This is the repository for the collection of Graph Neural Network for Traffic Forecasting

GNN4Traffic - This is the repository for the collection of Graph Neural Network for Traffic Forecasting

564 Jan 02, 2023
Out-of-Domain Human Mesh Reconstruction via Dynamic Bilevel Online Adaptation

DynaBOA Code repositoty for the paper: Out-of-Domain Human Mesh Reconstruction via Dynamic Bilevel Online Adaptation Shanyan Guan, Jingwei Xu, Michell

198 Dec 29, 2022
TJU Deep Learning & Neural Network

Deep_Learning & Neural_Network_Lab 实验环境 Python 3.9 Anaconda3(官网下载或清华镜像都行) PyTorch 1.10.1(安装代码如下) conda install pytorch torchvision torchaudio cudatool

St3ve Lee 1 Jan 19, 2022
A modular active learning framework for Python

Modular Active Learning framework for Python3 Page contents Introduction Active learning from bird's-eye view modAL in action From zero to one in a fe

modAL 1.9k Dec 31, 2022
Angora is a mutation-based fuzzer. The main goal of Angora is to increase branch coverage by solving path constraints without symbolic execution.

Angora Angora is a mutation-based coverage guided fuzzer. The main goal of Angora is to increase branch coverage by solving path constraints without s

833 Jan 07, 2023
Trains an agent with stochastic policy gradient ascent to solve the Lunar Lander challenge from OpenAI

Introduction This script trains an agent with stochastic policy gradient ascent to solve the Lunar Lander challenge from OpenAI. In order to run this

Momin Haider 0 Jan 02, 2022
Weakly-supervised semantic image segmentation with CNNs using point supervision

Code for our ECCV paper What's the Point: Semantic Segmentation with Point Supervision. Summary This library is a custom build of Caffe for semantic i

27 Sep 14, 2022
Source Code for AAAI 2022 paper "Graph Convolutional Networks with Dual Message Passing for Subgraph Isomorphism Counting and Matching"

Graph Convolutional Networks with Dual Message Passing for Subgraph Isomorphism Counting and Matching This repository is an official implementation of

HKUST-KnowComp 13 Sep 08, 2022
The VeriNet toolkit for verification of neural networks

VeriNet The VeriNet toolkit is a state-of-the-art sound and complete symbolic interval propagation based toolkit for verification of neural networks.

9 Dec 21, 2022
EDPN: Enhanced Deep Pyramid Network for Blurry Image Restoration

EDPN: Enhanced Deep Pyramid Network for Blurry Image Restoration Ruikang Xu, Zeyu Xiao, Jie Huang, Yueyi Zhang, Zhiwei Xiong. EDPN: Enhanced Deep Pyra

69 Dec 15, 2022
The codes and related files to reproduce the results for Image Similarity Challenge Track 1.

ISC-Track1-Submission The codes and related files to reproduce the results for Image Similarity Challenge Track 1. Required dependencies To begin with

Wenhao Wang 115 Jan 02, 2023
The implementation of CVPR2021 paper Temporal Query Networks for Fine-grained Video Understanding, by Chuhan Zhang, Ankush Gupta and Andrew Zisserman.

Temporal Query Networks for Fine-grained Video Understanding 📋 This repository contains the implementation of CVPR2021 paper Temporal_Query_Networks

55 Dec 21, 2022
Official and maintained implementation of the paper "OSS-Net: Memory Efficient High Resolution Semantic Segmentation of 3D Medical Data" [BMVC 2021].

OSS-Net: Memory Efficient High Resolution Semantic Segmentation of 3D Medical Data Christoph Reich, Tim Prangemeier, Özdemir Cetin & Heinz Koeppl | Pr

Christoph Reich 23 Sep 21, 2022
Monitora la qualità della ricezione dei segnali radio nelle province siciliane.

FMap-server Monitora la qualità della ricezione dei segnali radio nelle province siciliane. Conversion data Frequency - StationName maps are stored in

Triglie 5 May 24, 2021
A Loss Function for Generative Neural Networks Based on Watson’s Perceptual Model

This repository contains the similarity metrics designed and evaluated in the paper, and instructions and code to re-run the experiments. Implementation in the deep-learning framework PyTorch

Steffen 86 Dec 27, 2022
Python interface for the DIGIT tactile sensor

DIGIT-INTERFACE Python interface for the DIGIT tactile sensor. For updates and discussions please join the #DIGIT channel at the www.touch-sensing.org

Facebook Research 35 Dec 22, 2022
This code is a toolbox that uses Torch library for training and evaluating the ERFNet architecture for semantic segmentation.

ERFNet This code is a toolbox that uses Torch library for training and evaluating the ERFNet architecture for semantic segmentation. NEW!! New PyTorch

Edu 104 Jan 05, 2023
Leveraging OpenAI's Codex to solve cornerstone problems in Music

Music-Codex Leveraging OpenAI's Codex to solve cornerstone problems in Music Please NOTE: Presented generated samples were created by OpenAI's Codex P

Alex 2 Mar 11, 2022
[CVPR2021] Look before you leap: learning landmark features for one-stage visual grounding.

LBYL-Net This repo implements paper Look Before You Leap: Learning Landmark Features For One-Stage Visual Grounding CVPR 2021. Getting Started Prerequ

SVIP Lab 45 Dec 12, 2022