Code for "PVNet: Pixel-wise Voting Network for 6DoF Pose Estimation" CVPR 2019 oral

Related tags

Deep Learningpvnet
Overview

Good news! We release a clean version of PVNet: clean-pvnet, including

  1. how to train the PVNet on the custom dataset.
  2. Use PVNet with a detector.
  3. The training and testing on the tless dataset, where we detect multiple instances in an image.

PVNet: Pixel-wise Voting Network for 6DoF Pose Estimation

introduction

PVNet: Pixel-wise Voting Network for 6DoF Pose Estimation
Sida Peng, Yuan Liu, Qixing Huang, Xiaowei Zhou, Hujun Bao
CVPR 2019 oral
Project Page

Any questions or discussions are welcomed!

Truncation LINEMOD Dataset

Check TRUNCATION_LINEMOD.md for information about the Truncation LINEMOD dataset.

Installation

One way is to set up the environment with docker: How to install pvnet with docker.

Thanks Joe Dinius for providing the docker implementation.

Another way is to use the following commands.

  1. Set up python 3.6.7 environment
pip install -r requirements.txt

We need compile several files, which works fine with pytorch v0.4.1/v1.1 and gcc 5.4.0.

For users with a RTX GPU, you must use CUDA10 and pytorch v1.1 built from CUDA10.

  1. Compile the Ransac Voting Layer
ROOT=/path/to/pvnet
cd $ROOT/lib/ransac_voting_gpu_layer
python setup.py build_ext --inplace
  1. Compile some extension utils
cd $ROOT/lib/utils/extend_utils

Revise the cuda_include and dart in build_extend_utils_cffi.py to be compatible with the CUDA in your computer.

sudo apt-get install libgoogle-glog-dev=0.3.4-0.1
sudo apt-get install libsuitesparse-dev=1:4.4.6-1
sudo apt-get install libatlas-base-dev=3.10.2-9
python build_extend_utils_cffi.py

If you cannot install libsuitesparse-dev=1:4.4.6-1, please install libsuitesparse, run build_ceres.sh and move ceres/ceres-solver/build/lib/libceres.so* to lib/utils/extend_utils/lib.

Add the lib under extend_utils to the LD_LIBRARY_PATH

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/path/to/pvnet/lib/utils/extend_utils/lib

Dataset Configuration

Prepare the dataset

Download the LINEMOD, which can be found at here.

Download the LINEMOD_ORIG, which can be found at here.

Download the OCCLUSION_LINEMOD, which can be found at here.

Create the soft link

mkdir $ROOT/data
ln -s path/to/LINEMOD $ROOT/data/LINEMOD
ln -s path/to/LINEMOD_ORIG $ROOT/data/LINEMOD_ORIG
ln -s path/to/OCCLUSION_LINEMOD $ROOT/data/OCCLUSION_LINEMOD

Compute FPS keypoints

python lib/utils/data_utils.py

Synthesize images for each object

See pvnet-rendering for information about the image synthesis.

Demo

Download the pretrained model of cat from here and put it to $ROOT/data/model/cat_demo/199.pth.

Run the demo

python tools/demo.py

If setup correctly, the output will look like

cat

Visualization of the voting procedure

We add a jupyter notebook visualization.ipynb for the keypoint detection pipeline of PVNet, aiming to make it easier for readers to understand our paper. Thanks for Kudlur, M 's suggestion.

Training and testing

Training on the LINEMOD

Before training, remember to add the lib under extend_utils to the LD_LIDBRARY_PATH

export LD_LIDBRARY_PATH=$LD_LIDBRARY_PATH:/path/to/pvnet/lib/utils/extend_utils/lib

Training

python tools/train_linemod.py --cfg_file configs/linemod_train.json --linemod_cls cat

Testing

We provide the pretrained models of each object, which can be found at here.

Download the pretrained model and move it to $ROOT/data/model/{cls}_linemod_train/199.pth. For instance

mkdir $ROOT/data/model
mv ape_199.pth $ROOT/data/model/ape_linemod_train/199.pth

Testing

python tools/train_linemod.py --cfg_file configs/linemod_train.json --linemod_cls cat --test_model

Citation

If you find this code useful for your research, please use the following BibTeX entry.

@inproceedings{peng2019pvnet,
  title={PVNet: Pixel-wise Voting Network for 6DoF Pose Estimation},
  author={Peng, Sida and Liu, Yuan and Huang, Qixing and Zhou, Xiaowei and Bao, Hujun},
  booktitle={CVPR},
  year={2019}
}

Acknowledgement

This work is affliated with ZJU-SenseTime Joint Lab of 3D Vision, and its intellectual property belongs to SenseTime Group Ltd.

Copyright (c) ZJU-SenseTime Joint Lab of 3D Vision. All Rights Reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
Owner
ZJU3DV
ZJU3DV is a research group of State Key Lab of CAD&CG, Zhejiang University. We focus on the research of 3D computer vision, SLAM and AR.
ZJU3DV
PyTorch implementation of "ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context" (INTERSPEECH 2020)

ContextNet ContextNet has CNN-RNN-transducer architecture and features a fully convolutional encoder that incorporates global context information into

Sangchun Ha 24 Nov 24, 2022
The pytorch implementation of SOKD (BMVC2021).

Semi-Online Knowledge Distillation Implementations of SOKD. Requirements This repo was tested with Python 3.8, PyTorch 1.5.1, torchvision 0.6.1, CUDA

4 Dec 19, 2021
[NeurIPS 2021] "G-PATE: Scalable Differentially Private Data Generator via Private Aggregation of Teacher Discriminators"

G-PATE This is the official code base for our NeurIPS 2021 paper: "G-PATE: Scalable Differentially Private Data Generator via Private Aggregation of T

AI Secure 14 Oct 12, 2022
Official implementation of the Implicit Behavioral Cloning (IBC) algorithm

Implicit Behavioral Cloning This codebase contains the official implementation of the Implicit Behavioral Cloning (IBC) algorithm from our paper: Impl

Google Research 210 Dec 09, 2022
code for paper"A High-precision Semantic Segmentation Method Combining Adversarial Learning and Attention Mechanism"

PyTorch implementation of UAGAN(U-net Attention Generative Adversarial Networks) This repository contains the source code for the paper "A High-precis

Tong 8 Apr 25, 2022
CIFS: Improving Adversarial Robustness of CNNs via Channel-wise Importance-based Feature Selection

CIFS This repository provides codes for CIFS (ICML 2021). CIFS: Improving Adversarial Robustness of CNNs via Channel-wise Importance-based Feature Sel

Hanshu YAN 19 Nov 12, 2022
Code for paper " AdderNet: Do We Really Need Multiplications in Deep Learning?"

AdderNet: Do We Really Need Multiplications in Deep Learning? This code is a demo of CVPR 2020 paper AdderNet: Do We Really Need Multiplications in De

HUAWEI Noah's Ark Lab 915 Jan 01, 2023
CLIP + VQGAN / PixelDraw

clipit Yet Another VQGAN-CLIP Codebase This started as a fork of @nerdyrodent's VQGAN-CLIP code which was based on the notebooks of @RiversWithWings a

dribnet 276 Dec 12, 2022
Publication describing 3 ML examples at NSLS-II and interfacing into Bluesky

Machine learning enabling high-throughput and remote operations at large-scale user facilities. Overview This repository contains the source code and

BNL 4 Sep 24, 2022
PyTorch version implementation of DORN

DORN_PyTorch This is a PyTorch version implementation of DORN Reference H. Fu, M. Gong, C. Wang, K. Batmanghelich and D. Tao: Deep Ordinal Regression

Zilin.Zhang 3 Apr 27, 2022
Implementation of NÜWA, state of the art attention network for text to video synthesis, in Pytorch

NÜWA - Pytorch (wip) Implementation of NÜWA, state of the art attention network for text to video synthesis, in Pytorch. This repository will be popul

Phil Wang 463 Dec 28, 2022
Code for CVPR 2018 paper --- Texture Mapping for 3D Reconstruction with RGB-D Sensor

G2LTex This repository contains the implementation of "Texture Mapping for 3D Reconstruction with RGB-D Sensor (CVPR2018)" based on mvs-texturing. Due

Fu Yanping(付燕平) 129 Dec 30, 2022
Improving 3D Object Detection with Channel-wise Transformer

"Improving 3D Object Detection with Channel-wise Transformer" Thanks for the OpenPCDet, this implementation of the CT3D is mainly based on the pcdet v

Hualian Sheng 107 Dec 20, 2022
deep_image_prior_extension

Code for "Is Deep Image Prior in Need of a Good Education?" Project page: https://jleuschn.github.io/docs.educated_deep_image_prior/. Supplementary Ma

riccardo barbano 7 Jan 09, 2022
A denoising diffusion probabilistic model (DDPM) tailored for conditional generation of protein distograms

Denoising Diffusion Probabilistic Model for Proteins Implementation of Denoising Diffusion Probabilistic Model in Pytorch. It is a new approach to gen

Phil Wang 108 Nov 23, 2022
A rule learning algorithm for the deduction of syndrome definitions from time series data.

README This project provides a rule learning algorithm for the deduction of syndrome definitions from time series data. Large parts of the algorithm a

0 Sep 24, 2021
A new data augmentation method for extreme lighting conditions.

Random Shadows and Highlights This repo has the source code for the paper: Random Shadows and Highlights: A new data augmentation method for extreme l

Osama Mazhar 35 Nov 26, 2022
This is an official repository of CLGo: Learning to Predict 3D Lane Shape and Camera Pose from a Single Image via Geometry Constraints

CLGo This is an official repository of CLGo: Learning to Predict 3D Lane Shape and Camera Pose from a Single Image via Geometry Constraints An earlier

刘芮金 32 Dec 20, 2022
Bounding Wasserstein distance with couplings

BoundWasserstein These scripts reproduce the results of the article Bounding Wasserstein distance with couplings by Niloy Biswas and Lester Mackey. ar

Niloy Biswas 1 Jan 11, 2022
A Library for Modelling Probabilistic Hierarchical Graphical Models in PyTorch

A Library for Modelling Probabilistic Hierarchical Graphical Models in PyTorch

Korbinian Pöppel 47 Nov 28, 2022