A Comprehensive Analysis of Weakly-Supervised Semantic Segmentation in Different Image Domains (IJCV submission)

Overview

wsss-analysis

Code for the paper "A Comprehensive Analysis of Weakly-Supervised Semantic Segmentation in Different Image Domains" (arXiv pre-print, 2019).

Introduction

We conduct the first comprehensive analysis of Weakly-Supervised Semantic Segmentation (WSSS) with image label supervision in different image domains. WSSS has been evaluated almost exclusively on PASCAL VOC2012, and little work has been done on applying it to other image domains, such as histopathology and satellite images. The paper analyzes how well different methods suit representative datasets and presents principles for applying WSSS to an unseen dataset.

In this repository, we provide the evaluation code used to generate the weak localization cues and final segmentations from Section 5 (Performance Evaluation) of the paper, so that the results in the paper can be reproduced. The Keras implementation of HistoSegNet was adapted from hsn_v1; the TensorFlow implementations of SEC and DSRG were adapted from SEC-tensorflow and DSRG-tensorflow, respectively; and the PyTorch implementation of IRNet was adapted from irn. Pretrained models and evaluation images are also available for download.

Citing this repository

If you find this code useful in your research, please consider citing us:

    @article{chan2019comprehensive,
        title={A Comprehensive Analysis of Weakly-Supervised Semantic Segmentation in Different Image Domains},
        author={Chan, Lyndon and Hosseini, Mahdi S. and Plataniotis, Konstantinos N.},
        journal={International Journal of Computer Vision},
        volume={},
        number={},
        pages={},
        year={2020},
        publisher={Springer}
    }

Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.

Prerequisites

Mandatory

  • python (checked on 3.5)
  • scipy (checked on 1.2.0)
  • skimage / scikit-image (checked on 0.15.0)
  • keras (checked on 2.2.4)
  • tensorflow (checked on 1.13.1)
  • tensorflow-gpu (checked on 1.13.1)
  • numpy (checked on 1.18.1)
  • pandas (checked on 0.23.4)
  • cv2 / opencv-python (checked on 3.4.4.19)
  • cython
  • imageio (checked on 2.5.0)
  • chainercv (checked on 0.12.0)
  • pydensecrf (git+https://github.com/lucasb-eyer/pydensecrf.git)
  • torch (checked on 1.1.0)
  • torchvision (checked on 0.2.2.post3)
  • tqdm
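
The tested versions listed above can be compared against a local environment with a short script. This is only a sketch: the pip distribution names in `TESTED` (for example `opencv-python` for cv2 and `scikit-image` for skimage) are assumptions about how the packages were installed, and `compare_versions` is a hypothetical helper that is not part of this repository.

```python
try:
    from importlib import metadata as _md  # available on Python 3.8+

    def installed_version(name):
        """Return the installed version of a distribution, or None if absent."""
        try:
            return _md.version(name)
        except _md.PackageNotFoundError:
            return None
except ImportError:  # older interpreters, such as the tested Python 3.5
    import pkg_resources

    def installed_version(name):
        """Return the installed version of a distribution, or None if absent."""
        try:
            return pkg_resources.get_distribution(name).version
        except pkg_resources.DistributionNotFound:
            return None

# Versions reported as tested in the list above (distribution names assumed).
TESTED = {
    "scipy": "1.2.0",
    "scikit-image": "0.15.0",
    "keras": "2.2.4",
    "tensorflow": "1.13.1",
    "tensorflow-gpu": "1.13.1",
    "numpy": "1.18.1",
    "pandas": "0.23.4",
    "opencv-python": "3.4.4.19",
    "imageio": "2.5.0",
    "chainercv": "0.12.0",
    "torch": "1.1.0",
    "torchvision": "0.2.2.post3",
}


def compare_versions(tested, lookup=installed_version):
    """Map each package name to 'ok', 'missing', or 'differs (<installed>)'."""
    report = {}
    for name, wanted in tested.items():
        found = lookup(name)
        if found is None:
            report[name] = "missing"
        elif found == wanted:
            report[name] = "ok"
        else:
            report[name] = "differs ({})".format(found)
    return report


if __name__ == "__main__":
    for name, status in sorted(compare_versions(TESTED).items()):
        print("{:15s} tested {:12s} {}".format(name, TESTED[name], status))
```

A "differs" entry does not necessarily mean the code will fail; it only flags a version other than the one checked above.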

Optional

  • matplotlib (checked on 3.0.2)
  • jupyter

To utilize the code efficiently, GPU support is required. The following configurations have been tested to work successfully:

  • CUDA Version: 10
  • CUDA Driver Version: r440
  • CUDNN Version: 7.6.4 - 7.6.5

We do not guarantee proper functioning of the code using different versions of CUDA or CUDNN.

Hardware Requirements

Each method used in this repository has different GPU memory requirements. We have listed the approximate GPU memory requirements for each model through our own experiments:

  • 01_train: ~6 GB (e.g. NVIDIA RTX 2060)
  • 02_cues: ~6 GB (e.g. NVIDIA RTX 2060)
  • 03a_sec-dsrg: ~11 GB (e.g. NVIDIA RTX 2080 Ti)
  • 03b_irn: ~8 GB (e.g. NVIDIA GTX 1070)
  • 03c_hsn: ~6 GB (e.g. NVIDIA RTX 2060)
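
As a rough illustration, the figures above can be turned into a lookup that reports which stages should fit on a given GPU. The numbers are copied from the list and are approximate; `runnable_stages` is a hypothetical helper, not part of this repository.

```python
# Approximate GPU memory needs in GB per stage, copied from the list above.
GPU_MEMORY_GB = {
    "01_train": 6,
    "02_cues": 6,
    "03a_sec-dsrg": 11,
    "03b_irn": 8,
    "03c_hsn": 6,
}


def runnable_stages(available_gb, requirements=GPU_MEMORY_GB):
    """Return the stages whose approximate requirement fits in available_gb."""
    return sorted(
        stage for stage, needed in requirements.items() if needed <= available_gb
    )


if __name__ == "__main__":
    # Example: a card with 8 GB of memory.
    print(runnable_stages(8))
```

With PyTorch installed, `torch.cuda.get_device_properties(0).total_memory / 1024**3` gives a value to pass as `available_gb`. Memory actually free at run time may be lower than the card's total.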

Downloading data

The pretrained models, ground-truth annotations, and images used in this paper are available on Zenodo under a Creative Commons Attribution license: DOI. Please extract the contents into your wsss-analysis\database directory. If you choose to extract the data to another directory, please modify the filepaths accordingly in settings.ini.
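
If the data is extracted somewhere else, a quick check can confirm that the edited filepaths in settings.ini point at existing locations. The sketch below makes no assumption about section or key names; it treats any value containing a path separator as a path. It does assume that settings.ini is a standard INI file with section headers, and `missing_paths` is a hypothetical helper, not part of this repository.

```python
import configparser
import os


def missing_paths(ini_file):
    """Return (section, key, value) for path-like values that do not exist on disk."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(ini_file)
    missing = []
    for section in parser.sections():
        for key, value in parser.items(section):
            # Heuristic: a value containing a path separator is treated as a path.
            looks_like_path = "/" in value or "\\" in value
            if looks_like_path and not os.path.exists(os.path.expanduser(value)):
                missing.append((section, key, value))
    return missing


if __name__ == "__main__":
    for section, key, value in missing_paths("settings.ini"):
        print("[{}] {} -> not found: {}".format(section, key, value))
```

Relative paths are resolved against the current working directory, so run the check from the directory the scripts are launched from. Backslash-separated paths are only resolved correctly on Windows.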

Note: the training-set images of ADP are released on a case-by-case basis due to the confidentiality agreement for releasing the data. To obtain access to wsss-analysis\database\ADPdevkit\ADPRelease1\JPEGImages and wsss-analysis\database\ADPdevkit\ADPRelease1\PNGImages needed for gen_cues in 01_weak_cues, apply for access separately here.

Running the code

Scripts

To run 02_cues (generate weak cues for SEC and DSRG):

    cd 02_cues
    python demo.py

To run 03a_sec-dsrg (train/evaluate SEC, DSRG performance in Section 5; to omit training, comment out lines 76-77 in 03a_sec-dsrg\demo.py):

    cd 03a_sec-dsrg
    python demo.py

To run 03b_irn (train/evaluate IRNet and Grad-CAM performance in Section 5):

    cd 03b_irn
    python demo_tune.py

To run 03b_irn (evaluate pre-trained Grad-CAM performance in Section 5):

    cd 03b_irn
    python demo_cam.py

To run 03b_irn (evaluate pre-trained IRNet performance in Section 5):

    cd 03b_irn
    python demo_sem_seg.py

To run 03c_hsn (evaluate HistoSegNet performance in Section 5):

    cd 03c_hsn
    python demo.py
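
The commands above can also be chained from the repository root. The following is a minimal sketch, assuming it is run from the top-level wsss-analysis directory. The stage list mirrors the commands above (using the train/evaluate script for 03b_irn), and `run_stages` is a hypothetical helper, not part of this repository.

```python
import subprocess
import sys

# (directory, script) pairs taken from the commands above, in the listed order.
STAGES = [
    ("02_cues", "demo.py"),
    ("03a_sec-dsrg", "demo.py"),
    ("03b_irn", "demo_tune.py"),
    ("03c_hsn", "demo.py"),
]


def run_stages(stages=STAGES, dry_run=True):
    """Run each script from inside its own directory; return the planned commands."""
    planned = []
    for directory, script in stages:
        command = [sys.executable, script]
        planned.append((directory, command))
        if not dry_run:
            # check=True stops the sequence at the first failing stage.
            subprocess.run(command, cwd=directory, check=True)
    return planned


if __name__ == "__main__":
    for directory, command in run_stages(dry_run=True):
        print("cd {} && {}".format(directory, " ".join(command)))
```

The default `dry_run=True` only lists the commands; pass `dry_run=False` to execute them. Each script runs with its own folder as the working directory, matching the `cd` step in the commands above.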

Notebooks

03a_sec-dsrg:

03b_irn:

  • VGG16-IRNet on ADP-morph: (TODO)
  • VGG16-IRNet on ADP-func: (TODO)
  • VGG16-IRNet on VOC2012: (TODO)
  • VGG16-IRNet on DeepGlobe: (TODO)

03c_hsn:

Results

To access each method's evaluation results, check the associated eval (for numerical results) and out (for outputted images) folders. For easy access to all evaluated results, run scripts/extract_eval.py.

(NOTE: the numerical results obtained for SEC and DSRG on DeepGlobe_balanced differ slightly from those reported in the paper because the models were retrained during code cleanup. Also, in ADP, tuning is equivalent to the validation set and segtest is equivalent to the evaluation set. See hsn_v1 to replicate those results for ADP precisely.)

Network: VGG16

    Dataset    Training               Testing                Grad-CAM  SEC      DSRG     IRNet    HistoSegNet
    ADP-morph  train                  validation             0.14507   0.10730  0.08826  0.15068  0.13255
    ADP-morph  train                  evaluation             0.14946   0.11409  0.08011  0.15546  0.16159
    ADP-func   train                  validation             0.34813   0.28232  0.37193  0.35016  0.44215
    ADP-func   train                  evaluation             0.38187   0.28097  0.44726  0.36318  0.44115
    VOC2012    train                  val                    0.26262   0.37058  0.32129  0.31198  0.22707
    DeepGlobe  training (75% test)    evaluation (25% test)  0.28037   0.24005  0.28841  0.29405  0.24019
    DeepGlobe  training (37.5% test)  evaluation (25% test)  0.28083   0.25512  0.32017  0.29207  0.30410

Network: X1.7/M7

    Dataset    Training               Testing                Grad-CAM  SEC      DSRG     IRNet    HistoSegNet
    ADP-morph  train                  validation             0.20997   0.13597  0.13458  0.21450  0.27546
    ADP-morph  train                  evaluation             0.21426   0.13369  0.10835  0.21737  0.26156
    ADP-func   train                  validation             0.35233   0.32216  0.28625  0.34730  0.50663
    ADP-func   train                  evaluation             0.37910   0.30828  0.31734  0.38943  0.48020
    VOC2012    train                  val                    0.14946   0.37629  0.35004  0.17844  0.09201
    DeepGlobe  training (75% test)    evaluation (25% test)  0.21260   0.24841  0.35258  0.24620  0.29398
    DeepGlobe  training (37.5% test)  evaluation (25% test)  0.22266   0.20050  0.26470  0.21303  0.21617

Examples

ADP-morph

ADP-func

VOC2012

DeepGlobe

TODO

  1. Improve comments and code documentation
  2. Add IRNet notebooks
  3. Clean up IRNet code
Comments
  • Incorrect Axis?

    I think the axis=2 is wrong in this line. The docstring says the shape should be BxHxWxC, which would make axis=2 take the argmax over the width dimension, but I think you mean to take it over the class dimension. But seeing as how your code worked using axis=2 I assume it is not a mistake in the code but rather the docstring is incorrect. I guess the inputs to the function are using HxWxC dimensions.

    opened by hasoweh 1
  • Background class DeepGlobe

    Hi, I have a quick question. Are you using a background class in your 'cues' for the DeepGlobe dataset? If so, is this class representing areas in the CAM that are below the FG threshold (20%)?

    Thanks!

    opened by hasoweh 0
Releases(v2.0)
  • v2.0(Jun 21, 2020)

    Code repository corresponding to the second version of the arXiv pre-print: [v2] Tue, 12 May 2020 04:42:47 UTC (6,209 KB). Please note that four methods are evaluated in this version (SEC, DSRG, IRNet, HistoSegNet) with Grad-CAM providing the baseline. Performance is inferior to that reported in the first version of the pre-print.

  • v1.1(Jun 21, 2020)

    Code repository corresponding to the first version of the arXiv pre-print: [v1] Tue, 24 Dec 2019 03:00:34 UTC (8,560 KB). Please note that three methods are evaluated in this version (SEC, DSRG, and HistoSegNet) with the baseline being the thresholded weak cues from Grad-CAM. Performance is inferior to that reported in subsequent versions of the pre-print.

Owner
Lyndon Chan
Computer Vision, Natural Language Processing, Machine Learning | Data Scientist at Alphabyte Solutions (ECE MASc'20, University of Toronto)