Official Implementation of DAFormer: Improving Network Architectures and Training Strategies for Domain-Adaptive Semantic Segmentation

Related tags

Deep LearningDAFormer
Overview

DAFormer: Improving Network Architectures and Training Strategies for Domain-Adaptive Semantic Segmentation

[Arxiv] [Paper]

As acquiring pixel-wise annotations of real-world images for semantic segmentation is a costly process, a model can instead be trained with more accessible synthetic data and adapted to real images without requiring their annotations. This process is studied in Unsupervised Domain Adaptation (UDA).

Even though a large number of methods propose new UDA strategies, they are mostly based on outdated network architectures. In this work, we particularly study the influence of the network architecture on UDA performance and propose DAFormer, a network architecture tailored for UDA. It consists of a Transformer encoder and a multi-level context-aware feature fusion decoder.

DAFormer is enabled by three simple but crucial training strategies to stabilize the training and to avoid overfitting the source domain: While the Rare Class Sampling on the source domain improves the quality of pseudo-labels by mitigating the confirmation bias of self-training towards common classes, the Thing-Class ImageNet Feature Distance and a Learning Rate Warmup promote feature transfer from ImageNet pretraining.

DAFormer significantly improves the state-of-the-art performance by 10.8 mIoU for GTA→Cityscapes and by 5.4 mIoU for Synthia→Cityscapes and enables learning even difficult classes such as train, bus, and truck well.

UDA over time

The strengths of DAFormer, compared to the previous state-of-the-art UDA method ProDA, can also be observed in qualitative examples from the Cityscapes validation set.

Demo Color Palette

For more information on DAFormer, please check our [Paper].

If you find this project useful in your research, please consider citing:

@article{hoyer2021daformer,
  title={DAFormer: Improving Network Architectures and Training Strategies for Domain-Adaptive Semantic Segmentation},
  author={Hoyer, Lukas and Dai, Dengxin and Van Gool, Luc},
  journal={arXiv preprint arXiv:2111.14887},
  year={2021}
}

Setup Environment

For this project, we used python 3.8.5. We recommend setting up a new virtual environment:

python -m venv ~/venv/daformer
source ~/venv/daformer/bin/activate

In that environment, the requirements can be installed with:

pip install -r requirements.txt -f https://download.pytorch.org/whl/torch_stable.html
pip install mmcv-full==1.3.7  # requires the other packages to be installed first

Further, please download the MiT weights and a pretrained DAFormer using the following script. If problems occur with the automatic download, please follow the instructions for a manual download within the script.

sh tools/download_checkpoints.sh

All experiments were executed on a NVIDIA RTX 2080 Ti.

Inference Demo

Already as this point, the provided DAFormer model (downloaded by tools/download_checkpoints.sh) can be applied to a demo image:

python -m demo.image_demo demo/demo.png work_dirs/211108_1622_gta2cs_daformer_s0_7f24c/211108_1622_gta2cs_daformer_s0_7f24c.json work_dirs/211108_1622_gta2cs_daformer_s0_7f24c/latest.pth

When judging the predictions, please keep in mind that DAFormer had no access to real-world labels during the training.

Setup Datasets

Cityscapes: Please, download leftImg8bit_trainvaltest.zip and gt_trainvaltest.zip from here and extract them to data/cityscapes.

GTA: Please, download all image and label packages from here and extract them to data/gta.

Synthia: Please, download SYNTHIA-RAND-CITYSCAPES from here and extract it to data/synthia.

The final folder structure should look like this:

DAFormer
├── ...
├── data
│   ├── cityscapes
│   │   ├── leftImg8bit
│   │   │   ├── train
│   │   │   ├── val
│   │   ├── gtFine
│   │   │   ├── train
│   │   │   ├── val
│   ├── gta
│   │   ├── images
│   │   ├── labels
│   ├── synthia
│   │   ├── RGB
│   │   ├── GT
│   │   │   ├── LABELS
├── ...

Data Preprocessing: Finally, please run the following scripts to convert the label IDs to the train IDs and to generate the class index for RCS:

python tools/convert_datasets/gta.py data/gta --nproc 8
python tools/convert_datasets/cityscapes.py data/cityscapes --nproc 8
python tools/convert_datasets/synthia.py data/synthia/ --nproc 8

Training

For convenience, we provide an annotated config file of the final DAFormer. A training job can be launched using:

python run_experiments.py --config configs/daformer/gta2cs_uda_warm_fdthings_rcs_croppl_a999_daformer_mitb5_s0.py

For the experiments in our paper (e.g. network architecture comparison, component ablations, ...), we use a system to automatically generate and train the configs:

python run_experimenty.py --exp <ID>

More information about the available experiments and their assigned IDs, can be found in experiments.py. The generated configs will be stored in configs/generated/.

Testing & Predictions

The provided DAFormer checkpoint trained on GTA->Cityscapes (already downloaded by tools/download_checkpoints.sh) can be tested on the Cityscapes validation set using:

sh test.sh work_dirs/211108_1622_gta2cs_daformer_s0_7f24c

The predictions are saved for inspection to work_dirs/211108_1622_gta2cs_daformer_s0_7f24c/preds and the mIoU of the model is printed to the console. The provided checkpoint should achieve 68.85 mIoU. Refer to the end of work_dirs/211108_1622_gta2cs_daformer_s0_7f24c/20211108_164105.log for more information such as the class-wise IoU.

Similarly, also other models can be tested after the training has finished:

sh test.sh path/to/checkpoint_directory

Framework Structure

This project is based on mmsegmentation version 0.16.0. For more information about the framework structure and the config system, please refer to the mmsegmentation documentation and the mmcv documentation.

The most relevant files for DAFormer are:

Acknowledgements

This project is based on the following open-source projects. We thank their authors for making the source code publically available.

Owner
Lukas Hoyer
Doctoral student at ETH Zurich
Lukas Hoyer
Python版OpenCVのTracking APIのサンプルです。DaSiamRPNアルゴリズムまで対応しています。

OpenCV-Object-Tracker-Sample Python版OpenCVのTracking APIのサンプルです。   Requirement opencv-contrib-python 4.5.3.56 or later Algorithm 2021/07/16時点でOpenCVには以

KazuhitoTakahashi 36 Jan 01, 2023
An open software package to develop BCI based brain and cognitive computing technology for recognizing user's intention using deep learning

An open software package to develop BCI based brain and cognitive computing technology for recognizing user's intention using deep learning

deepbci 272 Jan 08, 2023
The implement of papar "Enhanced Graph Learning for Collaborative Filtering via Mutual Information Maximization"

SIGIR2021-EGLN The implement of paper "Enhanced Graph Learning for Collaborative Filtering via Mutual Information Maximization" Neural graph based Col

15 Dec 27, 2022
Aesara is a Python library that allows one to define, optimize, and efficiently evaluate mathematical expressions involving multi-dimensional arrays.

Aesara is a Python library that allows one to define, optimize, and efficiently evaluate mathematical expressions involving multi-dimensional arrays.

Aesara 898 Jan 07, 2023
Medical Image Segmentation using Squeeze-and-Expansion Transformers

Medical Image Segmentation using Squeeze-and-Expansion Transformers Introduction This repository contains the code of the IJCAI'2021 paper 'Medical Im

askerlee 172 Dec 20, 2022
Text Summarization - WCN — Weighted Contextual N-gram method for evaluation of Text Summarization

Text Summarization WCN — Weighted Contextual N-gram method for evaluation of Text Summarization In this project, I fine tune T5 model on Extreme Summa

Aditya Shah 1 Jan 03, 2022
Learning-Augmented Dynamic Power Management

Learning-Augmented Dynamic Power Management This repository contains source code accompanying paper Learning-Augmented Dynamic Power Management with M

Adam 0 Feb 22, 2022
Implementation of H-Transformer-1D, Hierarchical Attention for Sequence Learning

H-Transformer-1D Implementation of H-Transformer-1D, Transformer using hierarchical Attention for sequence learning with subquadratic costs. For now,

Phil Wang 123 Nov 17, 2022
Official code for paper "ISNet: Costless and Implicit Image Segmentation for Deep Classifiers, with Application in COVID-19 Detection"

Official code for paper "ISNet: Costless and Implicit Image Segmentation for Deep Classifiers, with Application in COVID-19 Detection". LRPDenseNet.py

Pedro Ricardo Ariel Salvador Bassi 2 Sep 21, 2022
Multi-Agent Reinforcement Learning (MARL) method to learn scalable control polices for multi-agent target tracking.

scalableMARL Scalable Reinforcement Learning Policies for Multi-Agent Control CD. Hsu, H. Jeong, GJ. Pappas, P. Chaudhari. "Scalable Reinforcement Lea

Christopher Hsu 17 Nov 17, 2022
Scalable Attentive Sentence-Pair Modeling via Distilled Sentence Embedding (AAAI 2020) - PyTorch Implementation

Scalable Attentive Sentence-Pair Modeling via Distilled Sentence Embedding PyTorch implementation for the Scalable Attentive Sentence-Pair Modeling vi

Microsoft 25 Dec 02, 2022
Llvlir - Low Level Variable Length Intermediate Representation

Low Level Variable Length Intermediate Representation Low Level Variable Length

Michael Clark 2 Jan 24, 2022
Combining Reinforcement Learning and Constraint Programming for Combinatorial Optimization

Hybrid solving process for combinatorial optimization problems Combinatorial optimization has found applications in numerous fields, from aerospace to

117 Dec 13, 2022
[2021][ICCV][FSNet] Full-Duplex Strategy for Video Object Segmentation

Full-Duplex Strategy for Video Object Segmentation (ICCV, 2021) Authors: Ge-Peng Ji, Keren Fu, Zhe Wu, Deng-Ping Fan*, Jianbing Shen, & Ling Shao This

Daniel-Ji 55 Dec 22, 2022
The deployment framework aims to provide a simple, lightweight, fast integrated, pipelined deployment framework that ensures reliability, high concurrency and scalability of services.

savior是一个能够进行快速集成算法模块并支持高性能部署的轻量开发框架。能够帮助将团队进行快速想法验证(PoC),避免重复的去github上找模型然后复现模型;能够帮助团队将功能进行流程拆解,很方便的提高分布式执行效率;能够有效减少代码冗余,减少不必要负担。

Tao Luo 125 Dec 22, 2022
Implementation of paper "Graph Condensation for Graph Neural Networks"

GCond A PyTorch implementation of paper "Graph Condensation for Graph Neural Networks" Code will be released soon. Stay tuned :) Abstract We propose a

Wei Jin 66 Dec 04, 2022
Tiny Object Detection in Aerial Images.

AI-TOD AI-TOD is a dataset for tiny object detection in aerial images. [Paper] [Dataset] Description AI-TOD comes with 700,621 object instances for ei

jwwangchn 116 Dec 30, 2022
Resilience from Diversity: Population-based approach to harden models against adversarial attacks

Resilience from Diversity: Population-based approach to harden models against adversarial attacks Requirements To install requirements: pip install -r

0 Nov 23, 2021
Elegy is a framework-agnostic Trainer interface for the Jax ecosystem.

Elegy Elegy is a framework-agnostic Trainer interface for the Jax ecosystem. Main Features Easy-to-use: Elegy provides a Keras-like high-level API tha

435 Dec 30, 2022
Efficient and Scalable Physics-Informed Deep Learning and Scientific Machine Learning on top of Tensorflow for multi-worker distributed computing

Notice: Support for Python 3.6 will be dropped in v.0.2.1, please plan accordingly! Efficient and Scalable Physics-Informed Deep Learning Collocation-

tensordiffeq 74 Dec 09, 2022