[CVPR'22] Weakly Supervised Semantic Segmentation by Pixel-to-Prototype Contrast

Related tags

Deep Learningwseg
Overview

wseg

Overview

The Pytorch implementation of Weakly Supervised Semantic Segmentation by Pixel-to-Prototype Contrast.

[arXiv]

Though image-level weakly supervised semantic segmentation (WSSS) has achieved great progress with Class Activation Maps (CAMs) as the cornerstone, the large supervision gap between classification and segmentation still hampers the model to generate more complete and precise pseudo masks for segmentation. In this study, we propose weakly-supervised pixel-to-prototype contrast that can provide pixel-level supervisory signals to narrow the gap. Guided by two intuitive priors, our method is executed across different views and within per single view of an image, aiming to impose cross-view feature semantic consistency regularization and facilitate intra(inter)-class compactness(dispersion) of the feature space. Our method can be seamlessly incorporated into existing WSSS models without any changes to the base networks and does not incur any extra inference burden. Extensive experiments manifest that our method consistently improves two strong baselines by large margins, demonstrating the effectiveness.

图片

Prerequisites

Preparation

  1. Clone this repository.
  2. Data preparation. Download PASCAL VOC 2012 devkit following instructions in http://host.robots.ox.ac.uk/pascal/VOC/voc2012/#devkit. It is suggested to make a soft link toward downloaded dataset. Then download the annotation of VOC 2012 trainaug set (containing 10582 images) from https://www.dropbox.com/s/oeu149j8qtbs1x0/SegmentationClassAug.zip?dl=0 and place them all as VOC2012/SegmentationClassAug/xxxxxx.png. Download the image-level labels cls_label.npy from https://github.com/YudeWang/SEAM/tree/master/voc12/cls_label.npy and place it into voc12/, or you can generate it by yourself.
  3. Download ImageNet pretrained backbones. We use ResNet-38 for initial seeds generation and ResNet-101 for segmentation training. Download pretrained ResNet-38 from https://drive.google.com/file/d/15F13LEL5aO45JU-j45PYjzv5KW5bn_Pn/view. The ResNet-101 can be downloaded from https://download.pytorch.org/models/resnet101-5d3b4d8f.pth.

Model Zoo

Download the trained models and category performance below.

baseline model train(mIoU) val(mIoU) test (mIoU) checkpoint category performance
SEAM contrast 61.5 58.4 - [download]
affinitynet 69.2 - [download]
deeplabv1 - 67.7* 67.4* [download] [link]
EPS contrast 70.5 - - [download]
deeplabv1 - 72.3* 73.5* [download] [link]
deeplabv2 - 72.6* 73.6* [download] [link]

* indicates using densecrf.

The training results including initial seeds, intermediate products and pseudo masks can be found here.

Usage

Step1: Initial Seed Generation with Contrastive Learning.

  1. Contrast train.

    python contrast_train.py  \
      --weights $pretrained_model \
      --voc12_root VOC2012 \
      --session_name $your_session_name \
      --batch_size $bs
    
  2. Contrast inference.

    Download the pretrained model from https://1drv.ms/u/s!AgGL9MGcRHv0mQSKoJ6CDU0cMjd2?e=dFlHgN or train from scratch, set --weights and then run:

    python contrast_infer.py \
      --weights $contrast_weight \ 
      --infer_list $[voc12/val.txt | voc12/train.txt | voc12/train_aug.txt] \
      --out_cam $your_cam_npy_dir \
      --out_cam_pred $your_cam_png_dir \
      --out_crf $your_crf_png_dir
    
  3. Evaluation.

    Following SEAM, we recommend you to use --curve to select an optimial background threshold.

    python eval.py \
      --list VOC2012/ImageSets/Segmentation/$[val.txt | train.txt] \
      --predict_dir $your_result_dir \
      --gt_dir VOC2012/SegmentationClass \
      --comment $your_comments \
      --type $[npy | png] \
      --curve True
    

Step2: Refine with AffinityNet.

  1. Preparation.

    Prepare the files (la_crf_dir and ha_crf_dir) needed for training AffinityNet. You can also use our processed crf outputs with alpha=4/8 from here.

    python aff_prepare.py \
      --voc12_root VOC2012 \
      --cam_dir $your_cam_npy_dir \
      --out_crf $your_crf_alpha_dir 
    
  2. AffinityNet train.

    python aff_train.py \
      --weights $pretrained_model \
      --voc12_root VOC2012 \
      --la_crf_dir $your_crf_dir_4.0 \
      --ha_crf_dir $your_crf_dir_8.0 \
      --session_name $your_session_name
    
  3. Random walk propagation & Evaluation.

    Use the trained AffinityNet to conduct RandomWalk for refining the CAMs from Step1. Trained model can be found in Model Zoo (https://1drv.ms/u/s!AgGL9MGcRHv0mQXi0SSkbUc2sl8o?e=AY7AzX).

    python aff_infer.py \
      --weights $aff_weights \
      --voc12_root VOC2012 \
      --infer_list $[voc12/val.txt | voc12/train.txt] \
      --cam_dir $your_cam_dir \
      --out_rw $your_rw_dir
    
  4. Pseudo mask generation. Generate the pseudo masks for training the DeepLab Model. Dense CRF is used in this step.

    python aff_infer.py \
      --weights $aff_weights \
      --infer_list voc12/trainaug.txt \
      --cam_dir $your_cam_dir \
      --voc12_root VOC2012 \
      --out_rw $your_rw_dir
    

Step3: Segmentation training with DeepLab

  1. Training.

    we use the segmentation repo from https://github.com/YudeWang/semantic-segmentation-codebase. Training and inference codes are available in segmentation/experiment/. Set DATA_PSEUDO_GT: $your_pseudo_label_path in config.py. Then run:

    python train.py
    
  2. Inference.

    Check test configration in config.py (ckpt path, trained model: https://1drv.ms/u/s!AgGL9MGcRHv0mQgpb3QawPCsKPe9?e=4vly0H) and val/test set selection in test.py. Then run:

    python test.py
    

    For test set evaluation, you need to download test set images and submit the segmentation results to the official voc server.

For integrating our approach into the EPS model, you can change branch to EPS via:

git checkout eps

Then conduct train or inference following instructions above. Segmentation training follows the same repo in segmentation. Trained models & processed files can be download in Model Zoo.

Acknowledgements

We sincerely thank Yude Wang for his great work SEAM in CVPR'20. We borrow codes heavly from his repositories SEAM and Segmentation. We also thank Seungho Lee for his EPS and jiwoon-ahn for his AffinityNet and IRN. Without them, we could not finish this work.

Citation

@inproceedings{du2021weakly,
  title={Weakly Supervised Semantic Segmentation by Pixel-to-Prototype Contrast},
  author={Du, Ye and Fu, Zehua and Liu, Qingjie and Wang, Yunhong},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  year={2022}
}
Owner
Ye Du
Ye Du
Differential Privacy for Heterogeneous Federated Learning : Utility & Privacy tradeoffs

Differential Privacy for Heterogeneous Federated Learning : Utility & Privacy tradeoffs In this work, we propose an algorithm DP-SCAFFOLD(-warm), whic

19 Nov 10, 2022
The first public PyTorch implementation of Attentive Recurrent Comparators

arc-pytorch PyTorch implementation of Attentive Recurrent Comparators by Shyam et al. A blog explaining Attentive Recurrent Comparators Visualizing At

Sanyam Agarwal 150 Oct 14, 2022
AugLy is a data augmentations library that currently supports four modalities (audio, image, text & video) and over 100 augmentations

AugLy is a data augmentations library that currently supports four modalities (audio, image, text & video) and over 100 augmentations. Each modality’s augmentations are contained within its own sub-l

Facebook Research 4.6k Jan 09, 2023
ICCV2021 Expert-Goal Trajectory Prediction

ICCV 2021: Where are you heading? Dynamic Trajectory Prediction with Expert Goal Examples This repository contains the code for the paper Where are yo

hz 21 Dec 12, 2022
HHP-Net: A light Heteroscedastic neural network for Head Pose estimation with uncertainty

HHP-Net: A light Heteroscedastic neural network for Head Pose estimation with uncertainty Giorgio Cantarini, Francesca Odone, Nicoletta Noceti, Federi

18 Aug 02, 2022
JDet is Object Detection Framework based on Jittor.

JDet is Object Detection Framework based on Jittor.

135 Dec 14, 2022
Predictive Modeling on Electronic Health Records(EHR) using Pytorch

Predictive Modeling on Electronic Health Records(EHR) using Pytorch Overview Although there are plenty of repos on vision and NLP models, there are ve

81 Jan 01, 2023
Official code for "Stereo Waterdrop Removal with Row-wise Dilated Attention (IROS2021)"

Stereo-Waterdrop-Removal-with-Row-wise-Dilated-Attention This repository includes official codes for "Stereo Waterdrop Removal with Row-wise Dilated A

29 Oct 01, 2022
Pytorch implementation of COIN, a framework for compression with implicit neural representations 🌸

COIN 🌟 This repo contains a Pytorch implementation of COIN: COmpression with Implicit Neural representations, including code to reproduce all experim

Emilien Dupont 104 Dec 14, 2022
Python periodic table module

elemenpy Hello! elements.py is a small Python periodic table module that is used for calling certain information about an element. Installation Instal

Eric Cheng 2 Dec 27, 2021
Zsseg.baseline - Zero-Shot Semantic Segmentation

This repo is for our paper A Simple Baseline for Zero-shot Semantic Segmentation

98 Dec 20, 2022
PESTO: Switching Point based Dynamic and Relative Positional Encoding for Code-Mixed Languages

PESTO: Switching Point based Dynamic and Relative Positional Encoding for Code-Mixed Languages Abstract NLP applications for code-mixed (CM) or mix-li

Mohsin Ali, Mohammed 1 Nov 12, 2021
An implementation of a sequence to sequence neural network using an encoder-decoder

Keras implementation of a sequence to sequence model for time series prediction using an encoder-decoder architecture. I created this post to share a

Luke Tonin 195 Dec 17, 2022
RuDOLPH: One Hyper-Modal Transformer can be creative as DALL-E and smart as CLIP

[Paper] [Хабр] [Model Card] [Colab] [Kaggle] RuDOLPH 🦌 🎄 ☃️ One Hyper-Modal Tr

Sber AI 230 Dec 31, 2022
A pytorch-based real-time segmentation model for autonomous driving

CFPNet: Channel-Wise Feature Pyramid for Real-Time Semantic Segmentation This project contains the Pytorch implementation for the proposed CFPNet: pap

342 Dec 22, 2022
The official implementation of VAENAR-TTS, a VAE based non-autoregressive TTS model.

VAENAR-TTS This repo contains code accompanying the paper "VAENAR-TTS: Variational Auto-Encoder based Non-AutoRegressive Text-to-Speech Synthesis". Sa

THUHCSI 138 Oct 28, 2022
This is the official source code of "BiCAT: Bi-Chronological Augmentation of Transformer for Sequential Recommendation".

BiCAT This is our TensorFlow implementation for the paper: "BiCAT: Sequential Recommendation with Bidirectional Chronological Augmentation of Transfor

John 15 Dec 06, 2022
A Pytorch implement of paper "Anomaly detection in dynamic graphs via transformer" (TADDY).

TADDY: Anomaly detection in dynamic graphs via transformer This repo covers an reference implementation for the paper "Anomaly detection in dynamic gr

Yue Tan 21 Nov 24, 2022
ICLR 2021: Pre-Training for Context Representation in Conversational Semantic Parsing

SCoRe: Pre-Training for Context Representation in Conversational Semantic Parsing This repository contains code for the ICLR 2021 paper "SCoRE: Pre-Tr

Microsoft 28 Oct 02, 2022
Official implementation of Sparse Transformer-based Action Recognition

STAR Official implementation of S parse T ransformer-based A ction R ecognition Dataset download NTU RGB+D 60 action recognition of 2D/3D skeleton fro

Chonghan_Lee 15 Nov 02, 2022