Learning RGB-D Feature Embeddings for Unseen Object Instance Segmentation

Overview

Unseen Object Clustering: Learning RGB-D Feature Embeddings for Unseen Object Instance Segmentation

Introduction

In this work, we propose a new method for unseen object instance segmentation by learning RGB-D feature embeddings from synthetic data. A metric learning loss function is used to learn pixel-wise feature embeddings such that pixels from the same object are close to each other and pixels from different objects are separated in the embedding space. With the learned feature embeddings, a mean shift clustering algorithm can be applied to discover and segment unseen objects. We further improve the segmentation accuracy with a new two-stage clustering algorithm. Our method demonstrates that non-photorealistic synthetic RGB and depth images can be used to learn feature embeddings that transfer well to real-world images for unseen object instance segmentation.

arXiv, Talk video
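
The idea in the introduction can be illustrated with a small NumPy sketch. It is not the code in this repository: the cosine-distance form, the margins `alpha` and `delta`, the kernel sharpness `kappa`, the evenly spaced seeds, and the toy 3-D embeddings are all assumptions chosen for illustration. The loss pulls pixels toward the mean of their object and pushes object means apart; the clustering step then recovers objects from embeddings alone.

```python
import numpy as np


def normalize(x):
    """Scale each row to unit length so cosine similarity is a dot product."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def embedding_loss(emb, labels, alpha=0.02, delta=0.5):
    """Pull pixels toward their object's mean; push object means apart.

    emb: (N, C) unit-length embeddings, labels: (N,) integer object ids.
    alpha and delta are illustrative margins (assumptions).
    """
    ids = np.unique(labels)
    means = normalize(np.stack([emb[labels == k].mean(axis=0) for k in ids]))
    # Intra-object term: cosine distance of each pixel to its own mean.
    idx = np.searchsorted(ids, labels)
    d_intra = 0.5 * (1.0 - np.sum(emb * means[idx], axis=1))
    pull = np.mean(np.maximum(d_intra - alpha, 0.0) ** 2)
    if len(ids) < 2:
        return pull
    # Inter-object term: penalize pairs of means closer than delta.
    d_inter = 0.5 * (1.0 - means @ means.T)
    off_diag = ~np.eye(len(ids), dtype=bool)
    push = np.mean(np.maximum(delta - d_inter[off_diag], 0.0) ** 2)
    return pull + push


def mean_shift(emb, num_seeds=10, kappa=20.0, iters=10, merge_eps=0.1):
    """Cluster unit-length embeddings with a simple spherical mean shift."""
    # Evenly spaced seeds keep the sketch deterministic (an assumption).
    seeds = emb[np.linspace(0, len(emb) - 1, num_seeds).astype(int)]
    for _ in range(iters):
        # Weight every pixel by an exponential kernel on cosine similarity.
        weights = np.exp(kappa * (seeds @ emb.T))  # (num_seeds, N)
        seeds = normalize(weights @ emb)           # shifted, renormalized means
    # Merge seeds that converged to the same mode.
    modes = []
    for s in seeds:
        if all(0.5 * (1.0 - s @ m) > merge_eps for m in modes):
            modes.append(s)
    modes = np.stack(modes)
    # Assign each pixel to its most similar mode.
    return np.argmax(emb @ modes.T, axis=1), modes


# Toy data: two "objects" whose pixels embed near two orthogonal directions.
rng = np.random.default_rng(0)
a = normalize(np.array([1.0, 0.0, 0.0]) + 0.05 * rng.standard_normal((50, 3)))
b = normalize(np.array([0.0, 1.0, 0.0]) + 0.05 * rng.standard_normal((50, 3)))
emb = np.concatenate([a, b])
labels = np.array([0] * 50 + [1] * 50)

good_loss = embedding_loss(emb, labels)
bad_loss = embedding_loss(emb, np.arange(100) % 2)  # labels that mix both objects
pred, modes = mean_shift(emb)
print("loss with true labels:", good_loss, "with mixed labels:", bad_loss)
print("number of modes found:", len(modes))
```

Well-separated embeddings give a small loss and are split cleanly by the clustering step, while labels that mix the two groups give a much larger loss.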

License

Unseen Object Clustering is released under the NVIDIA Source Code License (refer to the LICENSE file for details).

Citation

If you find Unseen Object Clustering useful in your research, please consider citing:

@inproceedings{xiang2020learning,
    Author = {Yu Xiang and Christopher Xie and Arsalan Mousavian and Dieter Fox},
    Title = {Learning RGB-D Feature Embeddings for Unseen Object Instance Segmentation},
    booktitle = {Conference on Robot Learning (CoRL)},
    Year = {2020}
}

Required environment

  • Ubuntu 16.04 or above
  • PyTorch 0.4.1 or above
  • CUDA 9.1 or above

Installation

  1. Install PyTorch.

  2. Install the Python packages

    pip install -r requirement.txt
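
After installation, a short check like the following can confirm that the minimum versions listed under "Required environment" are met. This is a sketch, not part of the repository: the helper names are hypothetical, and it only relies on `torch.__version__` and `torch.version.cuda`.

```python
import importlib.util
import re


def version_tuple(text):
    """Turn '1.13.1+cu117' into (1, 13, 1); non-numeric suffixes are ignored."""
    parts = []
    for piece in text.split(".")[:3]:
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(found, minimum):
    """True when version string `found` is at least `minimum`."""
    return version_tuple(found) >= version_tuple(minimum)


def check_environment():
    """Return a list of problems; an empty list means the minimums are met."""
    if importlib.util.find_spec("torch") is None:
        return ["PyTorch is not installed"]
    import torch

    problems = []
    if not meets_minimum(torch.__version__, "0.4.1"):
        problems.append("PyTorch %s is older than 0.4.1" % torch.__version__)
    if torch.version.cuda is None:
        problems.append("this PyTorch build has no CUDA support")
    elif not meets_minimum(torch.version.cuda, "9.1"):
        problems.append("CUDA %s is older than 9.1" % torch.version.cuda)
    return problems


if __name__ == "__main__":
    for problem in check_environment() or ["environment looks fine"]:
        print(problem)
```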

Download

  • Download our trained checkpoints from here, and save them to $ROOT/data.

Running the demo

  1. Download our trained checkpoints first.

  2. Run the following script for testing on images under $ROOT/data/demo.

    ./experiments/scripts/demo_rgbd_add.sh

Training and testing on the Tabletop Object Dataset (TOD)

  1. Download the Tabletop Object Dataset (TOD) from here (34G).

  2. Create a symlink for the TOD dataset

    cd $ROOT/data
    ln -s $TOD_DATA tabletop

  3. Training and testing on the TOD dataset

    cd $ROOT
    
    # multi-gpu training, we used 4 GPUs
    ./experiments/scripts/seg_resnet34_8s_embedding_cosine_rgbd_add_train_tabletop.sh
    
    # testing, $GPU_ID can be 0, 1, etc.
    ./experiments/scripts/seg_resnet34_8s_embedding_cosine_rgbd_add_test_tabletop.sh $GPU_ID $EPOCH
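
To evaluate several checkpoints in a row, the test command above can be generated per epoch. The sketch below only builds and prints the command lines (a dry run); the helper name and the epoch numbers are hypothetical, and the script path and its two arguments are the ones shown above.

```python
TEST_SCRIPT = "./experiments/scripts/seg_resnet34_8s_embedding_cosine_rgbd_add_test_tabletop.sh"


def build_test_commands(gpu_id, epochs):
    """Return one argv list per epoch: [script, GPU_ID, EPOCH]."""
    return [[TEST_SCRIPT, str(gpu_id), str(epoch)] for epoch in epochs]


# Epochs 4, 8 and 16 are placeholders; use the epochs you actually saved.
commands = build_test_commands(gpu_id=0, epochs=[4, 8, 16])
for argv in commands:
    print(" ".join(argv))
    # To run it from $ROOT instead of printing:
    #   import subprocess; subprocess.run(argv, check=True)
```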
    

Testing on the OCID dataset and the OSD dataset

  1. Download the OCID dataset from here, and create a symbolic link:

    cd $ROOT/data
    ln -s $OCID_dataset OCID

  2. Download the OSD dataset from here, and create a symbolic link:

    cd $ROOT/data
    ln -s $OSD_dataset OSD

  3. Check the scripts in experiments/scripts whose names contain test_ocid or test_osd. Make sure the path to the trained checkpoints exists.

    experiments/scripts/seg_resnet34_8s_embedding_cosine_rgbd_add_test_ocid.sh
    experiments/scripts/seg_resnet34_8s_embedding_cosine_rgbd_add_test_osd.sh
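
Before launching the test scripts, it can help to confirm that the dataset links under $ROOT/data resolve. The sketch below is an assumption-laden helper, not part of the repository: it checks only the directory names mentioned in this README (demo, tabletop, OCID, OSD) and demonstrates itself on a temporary directory.

```python
import os
import tempfile
from pathlib import Path

EXPECTED = ("demo", "tabletop", "OCID", "OSD")


def missing_data_dirs(root, names=EXPECTED):
    """Return the names under root/data that do not resolve to a directory."""
    data = Path(root) / "data"
    # Path.is_dir() follows symlinks, so a dangling link counts as missing.
    return [name for name in names if not (data / name).is_dir()]


# Demonstration on a throwaway layout: demo exists, tabletop is a symlink,
# and the OCID and OSD links have not been created yet.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "root"
    (root / "data" / "demo").mkdir(parents=True)
    dataset = Path(tmp) / "TOD"
    dataset.mkdir()
    os.symlink(dataset, root / "data" / "tabletop")  # like: ln -s $TOD_DATA tabletop
    missing = missing_data_dirs(root)

print("missing:", missing)
```

On a real checkout, call `missing_data_dirs` with the repository root in place of the temporary directory.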
    

Running with ROS on a Realsense camera for real-world unseen object instance segmentation

  • Python2 is needed for ROS.

  • Make sure our pretrained checkpoints are downloaded.

    # start realsense
    roslaunch realsense2_camera rs_aligned_depth.launch tf_prefix:=measured/camera
    
    # start rviz
    rosrun rviz rviz -d ./ros/segmentation.rviz
    
    # run segmentation, $GPU_ID can be 0, 1, etc.
    ./experiments/scripts/ros_seg_rgbd_add_test_segmentation_realsense.sh $GPU_ID

