NeurIPS 2021, self-supervised 6D pose on category level

Overview

SE(3)-eSCOPE

video | paper | website

Leveraging SE(3) Equivariance for Self-Supervised Category-Level Object Pose Estimation

Xiaolong Li, Yijia Weng, Li Yi , Leonidas Guibas, A. Lynn Abbott, Shuran Song, He Wang

NeurIPS 2021

SE(3)-eSCOPE is a self-supervised learning framework to estimate category-level 6D object pose from single 3D point clouds, with no ground-truth pose annotations, no GT CAD models, and no multi-view supervision during training. The key to our method is to disentangle shape and pose through an invariant shape reconstruction module and an equivariant pose estimation module, empowered by SE(3) equivariant point cloud networks and reconstruction loss.

News

[2021-11] We release the training code for 5 categories.

Prerequisites

The code is built and tested with following libraries:

  • Python>=3.6
  • PyTorch/1.7.1
  • gcc>=6.1.0
  • cmake
  • cuda/11.0.1, or cuda/11.1 for newer GPUs
  • cudnn

Recommended Installation

# 1. install python environments
conda create --name equi-pose python=3.6
source activate equi-pose
pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html
pip install -r requirements.txt

# 2. compile extra CUDA libraries
bash build.sh

Data Preparation

You could find the subset we use for ModelNet40 directly [drive_link], and our rendered depth point clouds dataset [drive_link], download and put them into your own 'data' folder. check global_info.py for codes and data paths.

Training

You may run the following code to train the model from scratch:

python main.py exp_num=[experiment_id] training=[name_training] datasets=[name_dataset] category=[name_category]

For example, to train the model on completet airplane, you may run

python main.py exp_num='1.0' training="complete_pcloud" dataset="modelnet40_complete" category='airplane' use_wandb=True

Testing Pretrained Models

Some of our pretrained checkpoints have been released, check [drive_link]. Put them in the 'second_path/models' folder. You can run the following command to test the performance;

python main.py exp_num=[experiment_id] training=[name_training] datasets=[name_dataset] category=[name_category] eval=True save=True

For example, to test the model on complete airplane category or partial airplane, you may run

python main.py exp_num='0.813' training="complete_pcloud" dataset="modelnet40_complete" category='airplane'
eval=True save=True
python main.py exp_num='0.913r' training="partial_pcloud" dataset="modelnet40_partial" category='airplane' eval=True save=True

Note: add "use_fps_points=True" to get slightly better results; for your own datasets, add 'pre_compute_delta=True' and use example canonical shapes to compute pose misalignment first.

Visualization

Check out my script demo.py or teaser.py for some hints.

Citation

If you use this code for your research, please cite our paper.

@inproceedings{li2021leveraging,
    title={Leveraging SE (3) Equivariance for Self-supervised Category-Level Object Pose Estimation from Point Clouds},
    author={Li, Xiaolong and Weng, Yijia and Yi, Li and Guibas, Leonidas and Abbott, A Lynn and Song, Shuran and Wang, He},
    booktitle={Thirty-Fifth Conference on Neural Information Processing Systems},
    year={2021}
  }

We thank Haiwei Chen for the helpful discussions on equivariant neural networks.

Owner
Xiaolong
PhD student in Computer Vision, Virginia Tech
Xiaolong
[ICCV 2021 Oral] Mining Latent Classes for Few-shot Segmentation

Mining Latent Classes for Few-shot Segmentation Lihe Yang, Wei Zhuo, Lei Qi, Yinghuan Shi, Yang Gao. This codebase contains baseline of our paper Mini

Lihe Yang 66 Nov 29, 2022
AugLy is a data augmentations library that currently supports four modalities (audio, image, text & video) and over 100 augmentations

AugLy is a data augmentations library that currently supports four modalities (audio, image, text & video) and over 100 augmentations. Each modality’s augmentations are contained within its own sub-l

Facebook Research 4.6k Jan 09, 2023
「PyTorch Implementation of AnimeGANv2」を用いて、生成した顔画像を元の画像に上書きするデモ

AnimeGANv2-Face-Overlay-Demo PyTorch Implementation of AnimeGANv2を用いて、生成した顔画像を元の画像に上書きするデモです。

KazuhitoTakahashi 21 Oct 18, 2022
Official code for the ICLR 2021 paper Neural ODE Processes

Neural ODE Processes Official code for the paper Neural ODE Processes (ICLR 2021). Abstract Neural Ordinary Differential Equations (NODEs) use a neura

Cristian Bodnar 50 Oct 28, 2022
ruptures: change point detection in Python

Welcome to ruptures ruptures is a Python library for off-line change point detection. This package provides methods for the analysis and segmentation

Charles T. 1.1k Jan 03, 2023
The aim of the game, as in the original one, is to find a specific image from a group of different images of a person's face

GUESS WHO Main Links: [Github] [App] Related Links: [CLIP] [Celeba] The aim of the game, as in the original one, is to find a specific image from a gr

Arnau - DIMAI 3 Jan 04, 2022
(to be released) [NeurIPS'21] Transformers Generalize DeepSets and Can be Extended to Graphs and Hypergraphs

Higher-Order Transformers Kim J, Oh S, Hong S, Transformers Generalize DeepSets and Can be Extended to Graphs and Hypergraphs, NeurIPS 2021. [arxiv] W

Jinwoo Kim 44 Dec 28, 2022
PyTorch implementation of Tacotron speech synthesis model.

tacotron_pytorch PyTorch implementation of Tacotron speech synthesis model. Inspired from keithito/tacotron. Currently not as much good speech quality

Ryuichi Yamamoto 279 Dec 09, 2022
Lightweight tool to perform MITM attack on local network

ARPSpy - A lightweight tool to perform MITM attack Using many library to perform ARP Spoof and auto-sniffing HTTP packet containing credential. (Never

MinhItachi 8 Aug 28, 2022
Blind visual quality assessment on 360° Video based on progressive learning

Blind visual quality assessment on omnidirectional or 360 video (ProVQA) Blind VQA for 360° Video via Progressively Learning from Pixels, Frames and V

5 Jan 06, 2023
Official Pytorch implementation of ICLR 2018 paper Deep Learning for Physical Processes: Integrating Prior Scientific Knowledge.

Deep Learning for Physical Processes: Integrating Prior Scientific Knowledge: Official Pytorch implementation of ICLR 2018 paper Deep Learning for Phy

emmanuel 47 Nov 06, 2022
Underwater industrial application yolov5m6

This project wins the intelligent algorithm contest finalist award and stands out from over 2000teams in China Underwater Robot Professional Contest, entering the final of China Underwater Robot Prof

8 Nov 09, 2022
TensorFlow 101: Introduction to Deep Learning for Python Within TensorFlow

TensorFlow 101: Introduction to Deep Learning I have worked all my life in Machine Learning, and I've never seen one algorithm knock over its benchmar

Sefik Ilkin Serengil 896 Jan 04, 2023
BasicNeuralNetwork - This project looks over the basic structure of a neural network and how machine learning training algorithms work

BasicNeuralNetwork - This project looks over the basic structure of a neural network and how machine learning training algorithms work. For this project, I used the sigmoid function as an activation

Manas Bommakanti 1 Jan 22, 2022
Music source separation is a task to separate audio recordings into individual sources

Music Source Separation Music source separation is a task to separate audio recordings into individual sources. This repository is an PyTorch implmeme

Bytedance Inc. 958 Jan 03, 2023
Self-Supervised Deep Blind Video Super-Resolution

Self-Blind-VSR Paper | Discussion Self-Supervised Deep Blind Video Super-Resolution By Haoran Bai and Jinshan Pan Abstract Existing deep learning-base

Haoran Bai 35 Dec 09, 2022
Visyerres sgdf woob - Modules Woob pour l'intranet et autres sites Scouts et Guides de France

Vis'Yerres SGDF - Modules Woob Vous avez le sentiment que l'intranet des Scouts

Thomas Touhey (pas un pseudonyme) 3 Dec 24, 2022
Unofficial Implementation of MLP-Mixer, Image Classification Model

MLP-Mixer Unoffical Implementation of MLP-Mixer, easy to use with terminal. Train and test easly. https://arxiv.org/abs/2105.01601 MLP-Mixer is an arc

Oğuzhan Ercan 6 Dec 05, 2022
Air Quality Prediction Using LSTM

AirQualityPredictionUsingLSTM In this Repo, i present to you the winning solution of smart gujarat hackathon 2019 where the task was to predict the qu

Deepak Nandwani 2 Dec 13, 2022
Using Machine Learning to Create High-Res Fine Art

BIG.art: Using Machine Learning to Create High-Res Fine Art How to use GLIDE and BSRGAN to create ultra-high-resolution paintings with fine details By

Robert A. Gonsalves 13 Nov 27, 2022