Self-supervised Augmentation Consistency for Adapting Semantic Segmentation (CVPR 2021)

Last update: Dec 21, 2022

Self-supervised Augmentation Consistency
for Adapting Semantic Segmentation

This repository contains the official implementation of our paper:

Self-supervised Augmentation Consistency for Adapting Semantic Segmentation
Nikita Araslanov and Stefan Roth
To appear at CVPR 2021. [arXiv preprint]


We obtain state-of-the-art accuracy in adapting semantic segmentation by enforcing consistency across photometric and similarity transformations. We use neither style transfer nor adversarial training.

Contact: Nikita Araslanov fname.lname (at) visinf.tu-darmstadt.de

Installation

Requirements. To reproduce our results, we recommend Python >=3.6, PyTorch >=1.4, CUDA >=10.0. At least two Titan X GPUs (12 GB) or equivalent are required for VGG-16; ResNet-101 and VGG-16/FCN need four.
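Once the environment described below is set up, the recommended versions can be checked with a few lines of shell. This is only a sketch: `version_ge` is a hypothetical helper that is not part of this repository, and it assumes GNU `sort` with the `-V` (version sort) option.

```shell
# Hypothetical helper (not part of the repository):
# succeeds if version $1 is greater than or equal to version $2.
# Assumes GNU `sort -V` for version-aware ordering.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example usage inside the activated environment (commented out so that
# sourcing this snippet has no side effects):
# py_ver=$(python -c 'import sys; print("%d.%d" % sys.version_info[:2])')
# torch_ver=$(python -c 'import torch; print(torch.__version__.split("+")[0])')
# version_ge "$py_ver" 3.6 || echo "Python >=3.6 is recommended" >&2
# version_ge "$torch_ver" 1.4 || echo "PyTorch >=1.4 is recommended" >&2
```

A version-aware comparison is used because a plain string comparison would rank 3.10 below 3.6.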

create conda environment:

conda create --name da-sac
source activate da-sac

install PyTorch >=1.4 (see PyTorch instructions). For example,

conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch

install the dependencies:

pip install -r requirements.txt

download data (Cityscapes, GTA5, SYNTHIA) and create symlinks in the ./data folder, as follows:

./data/cityscapes -> <symlink to Cityscapes>
./data/cityscapes/gtFine2/
./data/cityscapes/leftImg8bit/

./data/game -> <symlink to GTA>
./data/game/labels_cs
./data/game/images

./data/synthia  -> <symlink to SYNTHIA>
./data/synthia/labels_cs
./data/synthia/RGB

Note that all ground-truth label IDs (Cityscapes, GTA5 and SYNTHIA) should be converted to Cityscapes train IDs. The label directories in the above example (gtFine2, labels_cs) therefore refer not to the original labels, but to these converted semantic maps.
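The symlinks above could be created along the following lines. This is a minimal sketch: `link_dataset` is a hypothetical helper that is not part of this repository, and the source paths in the usage example are placeholders for your local copies of the datasets.

```shell
# Hypothetical helper (not part of the repository): link a dataset root into ./data.
# Usage: link_dataset <dataset root> <name under ./data>
link_dataset() {
  local src name="$2"
  # Resolve to an absolute path so that the link stays valid from any directory.
  src="$(cd "$1" 2>/dev/null && pwd)" || {
    echo "dataset directory not found: $1" >&2
    return 1
  }
  mkdir -p ./data
  # -s: symbolic link, -f: replace an existing link,
  # -n: treat an existing link to a directory as a file instead of following it
  ln -sfn "$src" "./data/$name"
}

# Example (run from the repository root; the source paths are placeholders):
# link_dataset /path/to/cityscapes cityscapes
# link_dataset /path/to/GTA5 game
# link_dataset /path/to/SYNTHIA synthia
```

The helper only creates the links. It does not convert label IDs, so the label directories (gtFine2, labels_cs) still need to contain the converted semantic maps.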

Training

Training from ImageNet initialisation proceeds in three steps:

Training the baseline (ABN)
Generating the weights for importance sampling
Training with augmentation consistency from the ABN baseline

1. Training the baseline (ABN)

Here the inputs are ImageNet models available from the official PyTorch repository. We provide the links to those models for convenience.

Backbone	Link
ResNet-101	resnet101-5d3b4d8f.pth (171M)
VGG-16	vgg16_bn-6c64b313.pth (528M)

By default, these models should be placed in ./models/pretrained/ (though configurable with MODEL.INIT_MODEL).

To run the training:

bash ./launch/train.sh [gta|synthia] [resnet101|vgg16|vgg16fcn] base

where the first argument specifies the source domain and the second determines the network architecture. The third argument, base, instructs the script to train the baseline.
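As an illustration, the accepted argument values could be checked before a long training run is launched. The function `check_train_args` below is a hypothetical pre-flight check, not something the repository provides. It assumes that the third argument is either base or omitted, matching the two ways train.sh is invoked in this README.

```shell
# Hypothetical pre-flight check (not part of the repository)
# for the arguments passed to launch/train.sh.
check_train_args() {
  case "${1:-}" in
    gta|synthia) ;;
    *) echo "unknown source domain: ${1:-}" >&2; return 1 ;;
  esac
  case "${2:-}" in
    resnet101|vgg16|vgg16fcn) ;;
    *) echo "unknown architecture: ${2:-}" >&2; return 1 ;;
  esac
  case "${3:-}" in
    ""|base) ;;
    *) echo "unexpected third argument: $3" >&2; return 1 ;;
  esac
}

# Example: train the ABN baseline with ResNet-101 on GTA5.
# check_train_args gta resnet101 base && bash ./launch/train.sh gta resnet101 base
```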

If you would like to skip this step, you can use our pre-trained models:

Source domain: GTA5

Backbone	Arch.	IoU (val)	Link	MD5
ResNet-101	DeepLabv2	40.8	baseline_abn_e040.pth (336M)	`9fe17[...]c11fc`
VGG-16	DeepLabv2	37.1	baseline_abn_e115.pth (226M)	`d4ffc[...]ef755`
VGG-16	FCN	36.7	baseline_abn_e040.pth (1.1G)	`aa2e9[...]bae53`

Source domain: SYNTHIA

Backbone	Arch.	IoU (val)	Link	MD5
ResNet-101	DeepLabv2	36.3	baseline_abn_e090.pth (336M)	`b3431[...]d1a83`
VGG-16	DeepLabv2	34.4	baseline_abn_e070.pth (226M)	`3af24[...]5b24e`
VGG-16	FCN	31.6	baseline_abn_e040.pth (1.1G)	`5f457[...]e4b3a`

Tip: You can download these files (as well as the final models below) with tools/download_baselines.sh:

cp tools/download_baselines.sh snapshots/cityscapes/baselines/
cd snapshots/cityscapes/baselines/
bash ./download_baselines.sh
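After downloading, a snapshot can be compared against the MD5 column of the tables. Since the checksums are listed here in truncated form, the sketch below only compares the leading and trailing characters. `md5_matches` is a hypothetical helper that is not part of this repository, and it assumes that `md5sum` from GNU coreutils is available.

```shell
# Hypothetical helper (not part of the repository): compare the MD5 checksum of
# a file against the leading and trailing characters listed in the tables.
# Usage: md5_matches <file> <checksum prefix> <checksum suffix>
md5_matches() {
  local file="$1" prefix="$2" suffix="$3" sum
  sum="$(md5sum "$file" | cut -d ' ' -f 1)"
  case "$sum" in
    "$prefix"*"$suffix") return 0 ;;
    *) echo "checksum mismatch for $file: $sum" >&2; return 1 ;;
  esac
}

# Example with the truncated checksum from the GTA5 / ResNet-101 row:
# md5_matches baseline_abn_e040.pth 9fe17 c11fc
```

A match on the truncated checksum is a weaker check than comparing the full hash, but it still catches incomplete or corrupted downloads.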

2. Generating weights for importance sampling

To generate the weights you need to

generate mask predictions with your baseline (see inference below);
run tools/compute_image_weights.py, which reads in those predictions and counts the predictions per class.

If you would like to skip this step, you can use the weights we computed for the ABN baselines above:

Backbone	Arch.	Source: GTA5	Source: SYNTHIA
ResNet-101	DeepLabv2	cs_weights_resnet101_gta.data	cs_weights_resnet101_synthia.data
VGG-16	DeepLabv2	cs_weights_vgg16_gta.data	cs_weights_vgg16_synthia.data
VGG-16	FCN	cs_weights_vgg16fcn_gta.data	cs_weights_vgg16fcn_synthia.data

Tip: The bash script data/download_weights.sh will download all these importance sampling weights into the current directory.

3. Training with augmentation consistency

To train the model with augmentation consistency, we use the same shell script as in step 1, but without the argument base:

bash ./launch/train.sh [gta|synthia] [resnet101|vgg16|vgg16fcn]

Make sure to specify your baseline snapshot with the RESUME bash variable, either set in the environment (export RESUME=...) or directly in the shell script (commented out by default).
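A small guard can catch a missing or mistyped snapshot path before training starts. The sketch below assumes that RESUME holds the path to a snapshot file. `check_resume` is a hypothetical helper that is not part of this repository, and the path in the usage example is a placeholder.

```shell
# Hypothetical guard (not part of the repository): make sure RESUME points to
# an existing snapshot file before launching the training.
check_resume() {
  if [ -z "${RESUME:-}" ]; then
    echo "RESUME is not set" >&2
    return 1
  fi
  if [ ! -f "$RESUME" ]; then
    echo "snapshot not found: $RESUME" >&2
    return 1
  fi
}

# Example (the snapshot path is a placeholder):
# export RESUME=/path/to/baseline_abn_e040.pth
# check_resume && bash ./launch/train.sh gta resnet101
```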

We provide our final models for download.

Source domain: GTA5

Backbone	Arch.	IoU (val)	IoU (test)	Link	MD5
ResNet-101	DeepLabv2	53.8	55.7	final_e136.pth (504M)	`59c16[...]5a32f`
VGG-16	DeepLabv2	49.8	51.0	final_e184.pth (339M)	`0accb[...]d5881`
VGG-16	FCN	49.9	50.4	final_e112.pth (1.6G)	`e69f8[...]f729b`

Source domain: SYNTHIA

Backbone	Arch.	IoU (val)	IoU (test)	Link	MD5
ResNet-101	DeepLabv2	52.6	52.7	final_e164.pth (504M)	`a7682[...]db742`
VGG-16	DeepLabv2	49.1	48.3	final_e164.pth (339M)	`c5b31[...]5fdb7`
VGG-16	FCN	46.8	45.8	final_e098.pth (1.6G)	`efb74[...]845cc`

Inference and evaluation

Inference

To run single-scale inference from your snapshot, use infer_val.py. The bash script launch/infer_val.sh provides an easy way to run the inference by specifying a few variables:

# validation/training set
FILELIST=[val_cityscapes|train_cityscapes] 
# configuration used for training
CONFIG=configs/[deeplabv2_vgg16|deeplab_resnet101|fcn_vgg16]_train.yaml
# the following 3 variables effectively specify the path to the snapshot
EXP=...
RUN_ID=...
SNAPSHOT=...
# the snapshot path is defined as
# SNAPSHOT_PATH=snapshots/cityscapes/${EXP}/${RUN_ID}/${SNAPSHOT}.pth
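To illustrate how the three variables combine, the sketch below assembles the snapshot path following the pattern above. `snapshot_path` is a hypothetical helper that is not part of this repository, and the values in the usage example are placeholders.

```shell
# Hypothetical helper (not part of the repository): assemble the snapshot path
# from EXP, RUN_ID and SNAPSHOT following the pattern used by launch/infer_val.sh.
# Usage: snapshot_path <EXP> <RUN_ID> <SNAPSHOT>
snapshot_path() {
  printf 'snapshots/cityscapes/%s/%s/%s.pth\n' "$1" "$2" "$3"
}

# Example (all three values are placeholders):
# path=$(snapshot_path my_experiment my_run final_e136)
# [ -f "$path" ] || echo "snapshot not found: $path" >&2
```

Note that SNAPSHOT is given without the .pth extension, since the pattern appends it.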

Evaluation

Please use the official Cityscapes evaluation tool evalPixelLevelSemanticLabeling from Cityscapes scripts for evaluating your results.

Citation

We hope you find our work useful. If you would like to acknowledge it in your project, please use the following citation:

@inproceedings{Araslanov:2021:DASAC,
  title     = {Self-supervised Augmentation Consistency for Adapting Semantic Segmentation},
  author    = {Araslanov, Nikita and Roth, Stefan},
  booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2021}
}

Owner

Visual Inference Lab @TU Darmstadt
