NeurIPS 2021, self-supervised 6D pose on category level

Overview

SE(3)-eSCOPE

video | paper | website

Leveraging SE(3) Equivariance for Self-Supervised Category-Level Object Pose Estimation

Xiaolong Li, Yijia Weng, Li Yi , Leonidas Guibas, A. Lynn Abbott, Shuran Song, He Wang

NeurIPS 2021

SE(3)-eSCOPE is a self-supervised learning framework to estimate category-level 6D object pose from single 3D point clouds, with no ground-truth pose annotations, no GT CAD models, and no multi-view supervision during training. The key to our method is to disentangle shape and pose through an invariant shape reconstruction module and an equivariant pose estimation module, empowered by SE(3) equivariant point cloud networks and reconstruction loss.

News

[2021-11] We release the training code for 5 categories.

Prerequisites

The code is built and tested with following libraries:

  • Python>=3.6
  • PyTorch/1.7.1
  • gcc>=6.1.0
  • cmake
  • cuda/11.0.1, or cuda/11.1 for newer GPUs
  • cudnn

Recommended Installation

# 1. install python environments
conda create --name equi-pose python=3.6
source activate equi-pose
pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html
pip install -r requirements.txt

# 2. compile extra CUDA libraries
bash build.sh

Data Preparation

You could find the subset we use for ModelNet40 directly [drive_link], and our rendered depth point clouds dataset [drive_link], download and put them into your own 'data' folder. check global_info.py for codes and data paths.

Training

You may run the following code to train the model from scratch:

python main.py exp_num=[experiment_id] training=[name_training] datasets=[name_dataset] category=[name_category]

For example, to train the model on completet airplane, you may run

python main.py exp_num='1.0' training="complete_pcloud" dataset="modelnet40_complete" category='airplane' use_wandb=True

Testing Pretrained Models

Some of our pretrained checkpoints have been released, check [drive_link]. Put them in the 'second_path/models' folder. You can run the following command to test the performance;

python main.py exp_num=[experiment_id] training=[name_training] datasets=[name_dataset] category=[name_category] eval=True save=True

For example, to test the model on complete airplane category or partial airplane, you may run

python main.py exp_num='0.813' training="complete_pcloud" dataset="modelnet40_complete" category='airplane'
eval=True save=True
python main.py exp_num='0.913r' training="partial_pcloud" dataset="modelnet40_partial" category='airplane' eval=True save=True

Note: add "use_fps_points=True" to get slightly better results; for your own datasets, add 'pre_compute_delta=True' and use example canonical shapes to compute pose misalignment first.

Visualization

Check out my script demo.py or teaser.py for some hints.

Citation

If you use this code for your research, please cite our paper.

@inproceedings{li2021leveraging,
    title={Leveraging SE (3) Equivariance for Self-supervised Category-Level Object Pose Estimation from Point Clouds},
    author={Li, Xiaolong and Weng, Yijia and Yi, Li and Guibas, Leonidas and Abbott, A Lynn and Song, Shuran and Wang, He},
    booktitle={Thirty-Fifth Conference on Neural Information Processing Systems},
    year={2021}
  }

We thank Haiwei Chen for the helpful discussions on equivariant neural networks.

Owner
Xiaolong
PhD student in Computer Vision, Virginia Tech
Xiaolong
PyTorch Implementation of Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis

PyTorch Implementation of Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis

Ubisoft 76 Dec 30, 2022
Using LSTM to detect spoofing attacks in an Air-Ground network

Using LSTM to detect spoofing attacks in an Air-Ground network Specifications IDE: Spider Packages: Tensorflow 2.1.0 Keras NumPy Scikit-learn Matplotl

Tiep M. H. 1 Nov 20, 2021
Code for ICCV2021 paper PARE: Part Attention Regressor for 3D Human Body Estimation

PARE: Part Attention Regressor for 3D Human Body Estimation [ICCV 2021] PARE: Part Attention Regressor for 3D Human Body Estimation, Muhammed Kocabas,

Muhammed Kocabas 277 Jan 03, 2023
Linear image-to-image translation

Linear (Un)supervised Image-to-Image Translation Examples for linear orthogonal transformations in PCA domain, learned without pairing supervision. Tr

Eitan Richardson 40 Aug 31, 2022
SSL_SLAM2: Lightweight 3-D Localization and Mapping for Solid-State LiDAR (mapping and localization separated) ICRA 2021

SSL_SLAM2 Lightweight 3-D Localization and Mapping for Solid-State LiDAR (Intel Realsense L515 as an example) This repo is an extension work of SSL_SL

Wang Han 王晗 1.3k Jan 08, 2023
MNIST, but with Bezier curves instead of pixels

bezier-mnist This is a work-in-progress vector version of the MNIST dataset. Samples Here are some samples from the training set. Note that, while the

Alex Nichol 15 Jan 16, 2022
Datasets, tools, and benchmarks for representation learning of code.

The CodeSearchNet challenge has been concluded We would like to thank all participants for their submissions and we hope that this challenge provided

GitHub 1.8k Dec 25, 2022
PyTorch Implementation of "Non-Autoregressive Neural Machine Translation"

Non-Autoregressive Transformer Code release for Non-Autoregressive Neural Machine Translation by Jiatao Gu, James Bradbury, Caiming Xiong, Victor O.K.

Salesforce 261 Nov 12, 2022
Machine learning notebooks in different subjects optimized to run in google collaboratory

Notebooks Name Description Category Link Training pix2pix This notebook shows a simple pipeline for training pix2pix on a simple dataset. Most of the

Zaid Alyafeai 363 Dec 06, 2022
CrossMLP - The repository offers the official implementation of our BMVC 2021 paper (oral) in PyTorch.

CrossMLP Cascaded Cross MLP-Mixer GANs for Cross-View Image Translation Bin Ren1, Hao Tang2, Nicu Sebe1. 1University of Trento, Italy, 2ETH, Switzerla

Bingoren 16 Jul 27, 2022
Official implementation for "Symbolic Learning to Optimize: Towards Interpretability and Scalability"

Symbolic Learning to Optimize This is the official implementation for ICLR-2022 paper "Symbolic Learning to Optimize: Towards Interpretability and Sca

VITA 8 Dec 19, 2022
Offical implementation for "Trash or Treasure? An Interactive Dual-Stream Strategy for Single Image Reflection Separation".

Trash or Treasure? An Interactive Dual-Stream Strategy for Single Image Reflection Separation (NeurIPS 2021) by Qiming Hu, Xiaojie Guo. Dependencies P

Qiming Hu 31 Dec 20, 2022
Point detection through multi-instance deep heatmap regression for sutures in endoscopy

Suture detection PyTorch This repo contains the reference implementation of suture detection model in PyTorch for the paper Point detection through mu

artificial intelligence in the area of cardiovascular healthcare 3 Jul 16, 2022
Machine learning algorithms for many-body quantum systems

NetKet NetKet is an open-source project delivering cutting-edge methods for the study of many-body quantum systems with artificial neural networks and

NetKet 413 Dec 31, 2022
A set of examples around hub for creating and processing datasets

Examples for Hub - Dataset Format for AI A repository showcasing examples of using Hub Uploading Dataset Places365 Colab Tutorials Notebook Link Getti

Activeloop 11 Dec 14, 2022
Official repository of OFA. Paper: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

Paper | Blog OFA is a unified multimodal pretrained model that unifies modalities (i.e., cross-modality, vision, language) and tasks (e.g., image gene

OFA Sys 1.4k Jan 08, 2023
Tensors and Dynamic neural networks in Python with strong GPU acceleration

PyTorch is a Python package that provides two high-level features: Tensor computation (like NumPy) with strong GPU acceleration Deep neural networks b

61.4k Jan 04, 2023
Unsupervised Image Generation with Infinite Generative Adversarial Networks

Unsupervised Image Generation with Infinite Generative Adversarial Networks Here is the implementation of MICGANs using DCGAN architecture on MNIST da

16 Dec 24, 2021
TOOD: Task-aligned One-stage Object Detection, ICCV2021 Oral

One-stage object detection is commonly implemented by optimizing two sub-tasks: object classification and localization, using heads with two parallel branches, which might lead to a certain level of

264 Jan 09, 2023
REBEL: Relation Extraction By End-to-end Language generation

REBEL: Relation Extraction By End-to-end Language generation This is the repository for the Findings of EMNLP 2021 paper REBEL: Relation Extraction By

Babelscape 222 Jan 06, 2023