Leveraging Instance-, Image- and Dataset-Level Information for Weakly Supervised Instance Segmentation

Related tags

Deep LearningLIID
Overview

Leveraging Instance-, Image- and Dataset-Level Information for Weakly Supervised Instance Segmentation

This paper has been accepted and early accessed in IEEE TPAMI 2020.

Code contact e-mail: Yu-Huan Wu (wuyuhuan (at) mail(dot)nankai(dot)edu(dot)cn)

Introduction

Weakly supervised semantic instance segmentation with only image-level supervision, instead of relying on expensive pixel-wise masks or bounding box annotations, is an important problem to alleviate the data-hungry nature of deep learning. In this paper, we tackle this challenging problem by aggregating the image-level information of all training images into a large knowledge graph and exploiting semantic relationships from this graph. Specifically, our effort starts with some generic segment-based object proposals (SOP) without category priors. We propose a multiple instance learning (MIL) framework, which can be trained in an end-to-end manner using training images with image-level labels. For each proposal, this MIL framework can simultaneously compute probability distributions and category-aware semantic features, with which we can formulate a large undirected graph. The category of background is also included in this graph to remove the massive noisy object proposals. An optimal multi-way cut of this graph can thus assign a reliable category label to each proposal. The denoised SOP with assigned category labels can be viewed as pseudo instance segmentation of training images, which are used to train fully supervised models. The proposed approach achieves state-of-the-art performance for both weakly supervised instance segmentation and semantic segmentation.

Citations

If you are using the code/model/data provided here in a publication, please consider citing:

@article{liu2020leveraging,
  title={Leveraging Instance-, Image- and Dataset-Level Information for Weakly Supervised Instance Segmentation},
  author={Yun Liu and Yu-Huan Wu and Peisong Wen and Yujun Shi and Yu Qiu and Ming-Ming Cheng},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  year={2020},
  doi={10.1109/TPAMI.2020.3023152},
  publisher={IEEE}
}

Requirements

  • Python 3.5, PyTorch 0.4.1, Torchvision 0.2.2.post3, CUDA 9.0
  • Validated on Ubuntu 16.04, NVIDIA TITAN Xp

Testing LIID

  1. Clone the LIID repository

    git clone https://github.com/yun-liu/LIID.git
    
  2. Download the pretrained model of the MIL framework, and put them into $ROOT_DIR folder.

  3. Download the Pascal VOC2012 dataset. Extract the dataset files into $VOC2012_ROOT folder.

  4. Download the segment-based object proposals, and extract the data into $VOC2012_ROOT/proposals/ folder.

  5. Download the compiled binary files, and put the binary files into $ROOT_DIR/cut/multiway_cut/.

  6. Change the path in cut/run.sh to your own project root.

  7. run ./make.sh to build CUDA dependences.

  8. Run python3 gen_proposals.py. Remember to change the voc-root to your own $VOC2012_ROOT. The proposals with labels will be generated in the $ROOT_DIR/proposals folder.

Pretrained Models and data

The pretrained model of the MIL framework can be downloaded here.

The Pascal VOC2012 dataset can be downloaded here or other mirror websites.

S4Net proposals used for testing can be downloaded here.

The 24K simple ImageNet data (including S4Net proposals) can be downloaded here.

MCG proposals can be downloaded here.

Training with Pseudo Labels

For instance segmentation, you can use official or popular public Mask R-CNN projects like mmdetecion, Detectron2, maskrcnn-benchmark, or other popular open-source projects.

For semantic segmentation, you can use official Caffe implementation of deeplab, third-party PyTorch implementation here, or third-party Tensorflow Implementation here.

Precomputed Results

Results of instance segmentation on the Pascal VOC2012 segmentation val split can be downloaded here.

Results of semantic segmentation trained with 10K images, 10K images + 24K simple ImageNet images, 10K images (Res2Net-101) on the Pascal VOC2012 segmentation val split can be downloaded here.

Other Notes

Since it is difficult to install and configure IBM CPLEX, for convenience, we provide the compiled binary file which can run directly. If you desire to get the complete source code for solving the multi-way cut and ensure that there is no commercial use of it, please contact Yu-Huan Wu (wuyuhuan (at) mail(dot)nankai(dot)edu(dot)cn).

Acknowledgment

This code is based on IBM CPLEX. Thanks to the IBM CPLEX academic version.

Owner
Yun Liu
PhD student, Nankai University, China
Yun Liu
LeetCode Solutions https://t.me/tenvlad

leetcode LeetCode Solutions groupped by common patterns YouTube: https://www.youtube.com/c/vladten Telegram: https://t.me/nilinterface Problems source

Vlad Ten 158 Dec 29, 2022
Tensors and Dynamic neural networks in Python with strong GPU acceleration

PyTorch is a Python package that provides two high-level features: Tensor computation (like NumPy) with strong GPU acceleration Deep neural networks b

61.4k Jan 04, 2023
Calibrate your listeners! Robust communication-based training for pragmatic speakers. Findings of EMNLP 2021.

Calibrate your listeners! Robust communication-based training for pragmatic speakers Rose E. Wang, Julia White, Jesse Mu, Noah D. Goodman Findings of

Rose E. Wang 3 Apr 02, 2022
A collection of semantic image segmentation models implemented in TensorFlow

A collection of semantic image segmentation models implemented in TensorFlow. Contains data-loaders for the generic and medical benchmark datasets.

bobby 16 Dec 06, 2019
FaceAPI: AI-powered Face Detection & Rotation Tracking, Face Description & Recognition, Age & Gender & Emotion Prediction for Browser and NodeJS using TensorFlow/JS

FaceAPI AI-powered Face Detection & Rotation Tracking, Face Description & Recognition, Age & Gender & Emotion Prediction for Browser and NodeJS using

Vladimir Mandic 395 Dec 29, 2022
Bot developed in Python that automates races in pegaxy.

español | português About it: This is a fork from pega-racing-bot. This bot, developed in Python, is to automate races in pegaxy. The game developers

4 Apr 08, 2022
Code for paper "Learning to Reweight Examples for Robust Deep Learning"

learning-to-reweight-examples Code for paper Learning to Reweight Examples for Robust Deep Learning. [arxiv] Environment We tested the code on tensorf

Uber Research 261 Jan 01, 2023
BMN: Boundary-Matching Network

BMN: Boundary-Matching Network A pytorch-version implementation codes of paper: "BMN: Boundary-Matching Network for Temporal Action Proposal Generatio

qinxin 260 Dec 06, 2022
An ever-growing playground of notebooks showcasing CLIP's impressive zero-shot capabilities.

Playground for CLIP-like models Demo Colab Link GradCAM Visualization Naive Zero-shot Detection Smarter Zero-shot Detection Captcha Solver Changelog 2

Kevin Zakka 101 Dec 30, 2022
This repository includes the code of the sequence-to-sequence model for discontinuous constituent parsing described in paper Discontinuous Grammar as a Foreign Language.

Discontinuous Grammar as a Foreign Language This repository includes the code of the sequence-to-sequence model for discontinuous constituent parsing

Daniel Fernández-González 2 Apr 07, 2022
DeepI2I: Enabling Deep Hierarchical Image-to-Image Translation by Transferring from GANs

DeepI2I: Enabling Deep Hierarchical Image-to-Image Translation by Transferring from GANs Abstract: Image-to-image translation has recently achieved re

yaxingwang 23 Apr 14, 2022
Python scripts performing class agnostic object localization using the Object Localization Network model in ONNX.

ONNX Object Localization Network Python scripts performing class agnostic object localization using the Object Localization Network model in ONNX. Ori

Ibai Gorordo 15 Oct 14, 2022
IAUnet: Global Context-Aware Feature Learning for Person Re-Identification

IAUnet This repository contains the code for the paper: IAUnet: Global Context-Aware Feature Learning for Person Re-Identification Ruibing Hou, Bingpe

30 Jul 14, 2022
基于Flask开发后端、VUE开发前端框架,在WEB端部署YOLOv5目标检测模型

基于Flask开发后端、VUE开发前端框架,在WEB端部署YOLOv5目标检测模型

37 Jan 01, 2023
PyVideoAI: Action Recognition Framework

This reposity contains official implementation of: Capturing Temporal Information in a Single Frame: Channel Sampling Strategies for Action Recognitio

Kiyoon Kim 22 Dec 29, 2022
Code for the ICASSP-2021 paper: Continuous Speech Separation with Conformer.

Continuous Speech Separation with Conformer Introduction We examine the use of the Conformer architecture for continuous speech separation. Conformer

Sanyuan Chen (陈三元) 81 Nov 28, 2022
Python-kafka-reset-consumergroup-offset-example - Python Kafka reset consumergroup offset example

Python Kafka reset consumergroup offset example This is a simple example of how

Willi Carlsen 1 Feb 16, 2022
Segmentation and Identification of Vertebrae in CT Scans using CNN, k-means Clustering and k-NN

Segmentation and Identification of Vertebrae in CT Scans using CNN, k-means Clustering and k-NN If you use this code for your research, please cite ou

41 Dec 08, 2022
Retrieval.pytorch - The code we used in [2020 DIGIX]

Retrieval.pytorch - The code we used in [2020 DIGIX]

Guo-Hua Wang 2 Feb 07, 2022
MPRNet-Cloud-removal: Progressive cloud removal

MPRNet-Cloud-removal Progressive cloud removal Requirements 1.Pytorch = 1.0 2.Python 3 3.NVIDIA GPU + CUDA 9.0 4.Tensorboard Installation 1.Clone the

Semi 95 Dec 18, 2022