ICCV2021 - Mining Contextual Information Beyond Image for Semantic Segmentation

Introduction

The official repository for "Mining Contextual Information Beyond Image for Semantic Segmentation". Our full code has been merged into sssegmentation.

Abstract

This paper studies the context aggregation problem in semantic image segmentation. Existing research focuses on improving pixel representations by aggregating contextual information within individual images. Though impressive, these methods neglect the significance of the representations of pixels of the corresponding class beyond the input image. To address this, this paper proposes to mine contextual information beyond individual images to further augment the pixel representations. We first set up a feature memory module, which is updated dynamically during training, to store the dataset-level representations of various categories. Then, we learn the class probability distribution of each pixel representation under the supervision of the ground-truth segmentation. Finally, the representation of each pixel is augmented by aggregating the dataset-level representations based on the corresponding class probability distribution. Furthermore, by utilizing the stored dataset-level representations, we also propose a representation consistent learning strategy to make the classification head better address intra-class compactness and inter-class dispersion. The proposed method can be effortlessly incorporated into existing segmentation frameworks (e.g., FCN, PSPNet, OCRNet and DeepLabV3) and brings consistent performance improvements. Mining contextual information beyond the image allows us to report state-of-the-art performance on various benchmarks: ADE20K, LIP, Cityscapes and COCO-Stuff.
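The three steps described in the abstract (a dynamically updated feature memory, a per-pixel class probability distribution, and probability-weighted aggregation of the stored representations) can be sketched roughly as follows. This is a minimal NumPy illustration, not the repository's implementation: the function names, the moving-average update with `momentum=0.9`, the per-class mean, and the final concatenation are all assumptions made for the example.

```python
import numpy as np


def update_memory(memory, feats, labels, momentum=0.9):
    """Moving-average update of per-class dataset-level representations.

    memory: (num_classes, C) array, one stored representation per class.
    feats:  (N, C) pixel representations from the current batch.
    labels: (N,) ground-truth class index of each pixel.
    The moving-average rule and momentum value are assumptions.
    """
    updated = memory.copy()
    for cls in np.unique(labels):
        class_mean = feats[labels == cls].mean(axis=0)
        updated[cls] = momentum * memory[cls] + (1.0 - momentum) * class_mean
    return updated


def aggregate_from_memory(logits, memory):
    """Weight the stored class representations by each pixel's class probabilities.

    logits: (N, num_classes) per-pixel class scores.
    Returns an (N, C) array of dataset-level context, one row per pixel.
    """
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerically stable softmax
    probs = np.exp(shifted)
    probs /= probs.sum(axis=1, keepdims=True)
    return probs @ memory


def augment(feats, logits, memory):
    """Concatenate each pixel representation with its aggregated context (assumed fusion)."""
    return np.concatenate([feats, aggregate_from_memory(logits, memory)], axis=1)


# Toy usage with random data: 6 pixels, 4 channels, 3 classes.
rng = np.random.default_rng(0)
memory = np.zeros((3, 4))
feats = rng.normal(size=(6, 4))
labels = np.array([0, 0, 1, 1, 2, 2])
logits = rng.normal(size=(6, 3))

memory = update_memory(memory, feats, labels)
augmented = augment(feats, logits, memory)
print(augmented.shape)  # (6, 8)
```

In a real segmentation network the features would be 4-D tensors and the fusion would typically be learned; the flat `(N, C)` layout here only keeps the arithmetic easy to follow.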

Framework

[Figure: framework overview]

Performance

COCOStuff-10k

| Model | Backbone | Crop Size | Schedule (LR/Policy/BS/Epochs) | Train/Eval Set | mIoU / mIoU (ms+flip) | Download |
| --- | --- | --- | --- | --- | --- | --- |
| DeepLabV3 | R-50-D8 | 512x512 | 0.001/poly/16/110 | train/test | 38.84% / 39.68% | model, log |
| DeepLabV3 | R-101-D8 | 512x512 | 0.001/poly/16/110 | train/test | 39.84% / 41.49% | model, log |
| DeepLabV3 | S-101-D8 | 512x512 | 0.001/poly/32/150 | train/test | 41.18% / 42.15% | model, log |
| DeepLabV3 | HRNetV2p-W48 | 512x512 | 0.001/poly/16/110 | train/test | 39.77% / 41.35% | model, log |
| DeepLabV3 | ViT-Large | 512x512 | 0.001/poly/16/110 | train/test | 44.01% / 45.23% | model, log |
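The `poly` entry in the schedule values above usually refers to polynomial learning-rate decay. A minimal sketch of that rule is given here for orientation. The function name, the per-iteration stepping, and the exponent `power=0.9` are assumptions (0.9 is a common default in segmentation codebases); this README does not state the exponent actually used.

```python
def poly_lr(base_lr, cur_iter, max_iters, power=0.9):
    """Polynomial decay: base_lr * (1 - cur_iter / max_iters) ** power."""
    if not 0 <= cur_iter <= max_iters:
        raise ValueError("cur_iter must lie in [0, max_iters]")
    return base_lr * (1.0 - cur_iter / max_iters) ** power


# Learning rate at the start, middle and end of a hypothetical 1000-iteration run.
for it in (0, 500, 1000):
    print(it, poly_lr(0.01, it, 1000))
```

The rate starts at `base_lr` and falls to zero at the final iteration; with `power=1.0` the decay is linear.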

ADE20k

| Model | Backbone | Crop Size | Schedule (LR/Policy/BS/Epochs) | Train/Eval Set | mIoU / mIoU (ms+flip) | Download |
| --- | --- | --- | --- | --- | --- | --- |
| DeepLabV3 | R-50-D8 | 512x512 | 0.01/poly/16/130 | train/val | 44.39% / 45.95% | model, log |
| DeepLabV3 | R-101-D8 | 512x512 | 0.01/poly/16/130 | train/val | 45.66% / 47.22% | model, log |
| DeepLabV3 | S-101-D8 | 512x512 | 0.004/poly/16/180 | train/val | 46.63% / 47.36% | model, log |
| DeepLabV3 | HRNetV2p-W48 | 512x512 | 0.004/poly/16/180 | train/val | 45.79% / 47.34% | model, log |
| DeepLabV3 | ViT-Large | 512x512 | 0.01/poly/16/130 | train/val | 49.73% / 50.99% | model, log |

CityScapes

| Model | Backbone | Crop Size | Schedule (LR/Policy/BS/Epochs) | Train/Eval Set | mIoU (ms+flip) | Download |
| --- | --- | --- | --- | --- | --- | --- |
| DeepLabV3 | R-50-D8 | 512x1024 | 0.01/poly/16/440 | trainval/test | 79.90% | model, log |
| DeepLabV3 | R-101-D8 | 512x1024 | 0.01/poly/16/440 | trainval/test | 82.03% | model, log |
| DeepLabV3 | S-101-D8 | 512x1024 | 0.01/poly/16/500 | trainval/test | 81.59% | model, log |
| DeepLabV3 | HRNetV2p-W48 | 512x1024 | 0.01/poly/16/500 | trainval/test | 82.55% | model, log |

LIP

| Model | Backbone | Crop Size | Schedule (LR/Policy/BS/Epochs) | Train/Eval Set | mIoU / mIoU (flip) | Download |
| --- | --- | --- | --- | --- | --- | --- |
| DeepLabV3 | R-50-D8 | 473x473 | 0.01/poly/32/150 | train/val | 53.73% / 54.08% | model, log |
| DeepLabV3 | R-101-D8 | 473x473 | 0.01/poly/32/150 | train/val | 55.02% / 55.42% | model, log |
| DeepLabV3 | S-101-D8 | 473x473 | 0.007/poly/40/150 | train/val | 56.21% / 56.34% | model, log |
| DeepLabV3 | HRNetV2p-W48 | 473x473 | 0.007/poly/40/150 | train/val | 56.40% / 56.99% | model, log |
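All results above are reported as mIoU (mean intersection over union). For readers unfamiliar with the metric, here is a minimal NumPy sketch of how it can be computed from a confusion matrix. The function name and the `ignore_index=255` convention are assumptions for illustration, and this is not the evaluation code used to produce the numbers in the tables; benchmark evaluation normally accumulates the confusion matrix over the whole evaluation set rather than a single prediction.

```python
import numpy as np


def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over the classes that occur in the prediction or the ground truth.

    pred, target: integer class maps of the same shape; pred values must lie
    in [0, num_classes). Pixels whose target equals ignore_index are skipped.
    """
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    keep = target != ignore_index
    pred, target = pred[keep], target[keep]

    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    conf = np.bincount(target * num_classes + pred, minlength=num_classes ** 2)
    conf = conf.reshape(num_classes, num_classes)

    intersection = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - intersection
    valid = union > 0  # skip classes absent from both prediction and ground truth
    return float((intersection[valid] / union[valid]).mean())


# Toy example: class 0 has IoU 1/2 and class 1 has IoU 2/3.
print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```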

Citation

If this code is useful for your research, please consider citing:

@article{jin2021mining,
  title={Mining Contextual Information Beyond Image for Semantic Segmentation},
  author={Jin, Zhenchao and Gong, Tao and Yu, Dongdong and Chu, Qi and Wang, Jian and Wang, Changhu and Shao, Jie},
  journal={arXiv preprint arXiv:2108.11819},
  year={2021}
}