DeepCO3: Deep Instance Co-segmentation by Co-peak Search and Co-saliency

Overview

[CVPR19] DeepCO3: Deep Instance Co-segmentation by Co-peak Search and Co-saliency (Oral paper)

Authors: Kuang-Jui Hsu, Yen-Yu Lin, Yung-Yu Chuang

Abstract

In this paper, we address a new task called instance cosegmentation. Given a set of images jointly covering object instances of a specific category, instance co-segmentation aims to identify all of these instances and segment each of them, i.e. generating one mask for each instance. This task is important since instance-level segmentation is preferable for humans and many vision applications. It is also challenging because no pixel-wise annotated training data are available and the number of instances in each image is unknown. We solve this task by dividing it into two sub-tasks, co-peak search and instance mask segmentation. In the former sub-task, we develop a CNN-based network to detect the co-peaks as well as co-saliency maps for a pair of images. A co-peak has two endpoints, one in each image, that are local maxima in the response maps and similar to each other. Thereby, the two endpoints are potentially covered by a pair of instances of the same category. In the latter subtask, we design a ranking function that takes the detected co-peaks and co-saliency maps as inputs and can select the object proposals to produce the final results. Our method for instance co-segmentation and its variant for object colocalization are evaluated on four datasets, and achieve favorable performance against the state-of-the-art methods.

Examples

Two examples of instance co-segmentation on categories bird and sheep, respectively. An instance here refers to an object appearing in an image. In each example, the top row gives the input images while the bottom row shows the instances segmented by our method. The instance-specific coloring indicates that our method produces a segmentation mask for each instance.

Overview of our method

The proposed method contains two stages, co-peak search within the blue-shaded background and instance mask segmentation within the red-shaded background. For searching co-peaks in a pair of images, our model extracts image features, estimates their co-saliency maps, and performs feature correlation for co-peak localization. The model is optimized by three losses, including the co-peak loss, the affinity loss, and the saliency loss. For instance mask segmentation, we design a ranking function taking the detected co-peaks, the co-saliency maps, and the object proposals as inputs, and select the top-ranked proposal for each detected instance.

Results

  • Instance co-segmentation

The performance of instance co-segmentation on the four collected datasets is shown. The numbers in red and green show the best and the second best results, respectively. The column “trained” indicates whether additional training data are used.

  • Object co-localization

The performance of object co-localization on the four datasets is shown. The numbers in red and green indicate the best and the second best results, respectively. The column “trained” indicates whether additional training data are used.

Please cite our paper if this code is useful for your research.


@inproceedings{HsuCVPR19,
  author = {Kuang-Jui Hsu and Yen-Yu Lin and Yung-Yu Chuang},
  booktitle = {IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR)},
  title = {DeepCO$^3$: Deep Instance Co-segmentation by Co-peak Search and Co-saliency Detection},
  year = {2019}
}

Codes for DeepCO3

Demo for all stages: "RunDeepInstCoseg.m"

  • Including all files in "Lib" (Downloading MatConvnet is not necessary)
  • May be slightly different from the ones in paper because of the randdom seeds

Datasets (about 34 GB):

  • Including four collected datasets
  • Containing the images, ground-truth masks, salinecy maps and object proposals
  • GoogleDrive

Results reported in the papers (about 4 GB):

Download Codes from GoogleDrive :


Errata:

  • Thank Howard Yu-Chun Lo for pointing the typo in Eq. (4). The corrected one is listed in the following:

Owner
Kuang-Jui Hsu
Kuang-Jui Hsu
The devkit of the nuScenes dataset.

nuScenes devkit Welcome to the devkit of the nuScenes and nuImages datasets. Overview Changelog Devkit setup nuImages nuImages setup Getting started w

Motional 1.6k Jan 05, 2023
A clear, concise, simple yet powerful and efficient API for deep learning.

The Gluon API Specification The Gluon API specification is an effort to improve speed, flexibility, and accessibility of deep learning technology for

Gluon API 2.3k Dec 17, 2022
PyTorch implementation of MoCo v3 for self-supervised ResNet and ViT.

MoCo v3 for Self-supervised ResNet and ViT Introduction This is a PyTorch implementation of MoCo v3 for self-supervised ResNet and ViT. The original M

Facebook Research 887 Jan 08, 2023
This is an official implementation for "AS-MLP: An Axial Shifted MLP Architecture for Vision".

AS-MLP architecture for Image Classification Model Zoo Image Classification on ImageNet-1K Network Resolution Top-1 (%) Params FLOPs Throughput (image

SVIP Lab 106 Dec 12, 2022
Mouse Brain in the Model Zoo

Deep Neural Mouse Brain Modeling This is the repository for the ongoing deep neural mouse modeling project, an attempt to characterize the representat

Colin Conwell 15 Aug 22, 2022
Official source code to CVPR'20 paper, "When2com: Multi-Agent Perception via Communication Graph Grouping"

When2com: Multi-Agent Perception via Communication Graph Grouping This is the PyTorch implementation of our paper: When2com: Multi-Agent Perception vi

34 Nov 09, 2022
A project to make Amazon Echo respond to sign language using your webcam

Making Alexa respond to Sign Language using Tensorflow.js Try the live demo Read the Blog Post on Tensorflow's Blog Coming Soon Watch the video This p

Abhishek Singh 444 Jan 03, 2023
Latex code for making neural networks diagrams

PlotNeuralNet Latex code for drawing neural networks for reports and presentation. Have a look into examples to see how they are made. Additionally, l

Haris Iqbal 18.6k Jan 01, 2023
Vision Deep-Learning using Tensorflow, Keras.

Welcome! I am a computer vision deep learning developer working in Korea. This is my blog, and you can see everything I've studied here. https://www.n

kimminjun 6 Dec 14, 2022
Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams

Adversarial Robustness Toolbox (ART) is a Python library for Machine Learning Security. ART provides tools that enable developers and researchers to defend and evaluate Machine Learning models and ap

3.4k Jan 04, 2023
Applying PVT to Semantic Segmentation

Applying PVT to Semantic Segmentation Here, we take MMSegmentation v0.13.0 as an example, applying PVTv2 to SemanticFPN. For details see Pyramid Visio

35 Nov 30, 2022
Learning Continuous Image Representation with Local Implicit Image Function

LIIF This repository contains the official implementation for LIIF introduced in the following paper: Learning Continuous Image Representation with Lo

Yinbo Chen 1k Dec 25, 2022
Boundary IoU API (Beta version)

Boundary IoU API (Beta version) Bowen Cheng, Ross Girshick, Piotr Dollár, Alexander C. Berg, Alexander Kirillov [arXiv] [Project] [BibTeX] This API is

Bowen Cheng 177 Dec 29, 2022
simple_pytorch_example project is a toy example of a python script that instantiates and trains a PyTorch neural network on the FashionMNIST dataset

simple_pytorch_example project is a toy example of a python script that instantiates and trains a PyTorch neural network on the FashionMNIST dataset

Ramón Casero 1 Jan 07, 2022
LiDAR Distillation: Bridging the Beam-Induced Domain Gap for 3D Object Detection

LiDAR Distillation Paper | Model LiDAR Distillation: Bridging the Beam-Induced Domain Gap for 3D Object Detection Yi Wei, Zibu Wei, Yongming Rao, Jiax

Yi Wei 75 Dec 22, 2022
Implementation of Heterogeneous Graph Attention Network

HetGAN Implementation of Heterogeneous Graph Attention Network This is the code repository of paper "Prediction of Metro Ridership During the COVID-19

5 Dec 28, 2021
Cleaned up code for DSTC 10: SIMMC 2.0 track: subtask 2: multimodal coreference resolution

UNITER-Based Situated Coreference Resolution with Rich Multimodal Input: arXiv MMCoref_cleaned Code for the MMCoref task of the SIMMC 2.0 dataset. Pre

Yichen (William) Huang 2 Dec 05, 2022
TorchCV: A PyTorch-Based Framework for Deep Learning in Computer Vision

TorchCV: A PyTorch-Based Framework for Deep Learning in Computer Vision @misc{you2019torchcv, author = {Ansheng You and Xiangtai Li and Zhen Zhu a

Donny You 2.2k Jan 06, 2023
Back to Basics: Efficient Network Compression via IMP

Back to Basics: Efficient Network Compression via IMP Authors: Max Zimmer, Christoph Spiegel, Sebastian Pokutta This repository contains the code to r

IOL Lab @ ZIB 1 Nov 19, 2021
official Pytorch implementation of ICCV 2021 paper FuseFormer: Fusing Fine-Grained Information in Transformers for Video Inpainting.

FuseFormer: Fusing Fine-Grained Information in Transformers for Video Inpainting By Rui Liu, Hanming Deng, Yangyi Huang, Xiaoyu Shi, Lewei Lu, Wenxiu

77 Dec 27, 2022