An official implementation of the paper Exploring Sequence Feature Alignment for Domain Adaptive Detection Transformers

Related tags

Deep LearningSFA
Overview

Sequence Feature Alignment (SFA)

By Wen Wang, Yang Cao, Jing Zhang, Fengxiang He, Zheng-jun Zha, Yonggang Wen, and Dacheng Tao

This repository is an official implementation of the paper Exploring Sequence Feature Alignment for Domain Adaptive Detection Transformers, which is accepted to ACM MultiMedia 2021.

Introduction

TL; DR. We develop a domain adaptive object detection method SFA that is specialized for adaptive detection transformers. It contains a domain query-based feature alignment model and a token-wise feature alignment module for global and local feature alignment respectively, and a bipartite matching consistency loss for improving robustness.

SFA

Abstract. Detection transformers have recently shown promising object detection results and attracted increasing attention. However, how to develop effective domain adaptation techniques to improve its cross-domain performance remains unexplored and unclear. In this paper, we delve into this topic and empirically find that direct feature distribution alignment on the CNN backbone only brings limited improvements, as it does not guarantee domain-invariant sequence features in the transformer for prediction. To address this issue, we propose a novel Sequence Feature Alignment (SFA) method that is specially designed for the adaptation of detection transformers. Technically, SFA consists of a domain query-based feature alignment (DQFA) module and a token-wise feature alignment (TDA) module. In DQFA, a novel domain query is used to aggregate and align global context from the token sequence of both domains. DQFA reduces the domain discrepancy in global feature representations and object relations when deploying in the transformer encoder and decoder, respectively. Meanwhile, TDA aligns token features in the sequence from both domains, which reduces the domain gaps in local and instance-level feature representations in the transformer encoder and decoder, respectively. Besides, a novel bipartite matching consistency loss is proposed to enhance the feature discriminability for robust object detection. Experiments on three challenging benchmarks show that SFA outperforms state-of-the-art domain adaptive object detection methods.

Main Results

The experimental results and model weights for Cityscapes to Foggy Cityscapes are shown below.

Model mAP [email protected] [email protected] [email protected] [email protected] [email protected] Log & Model
SFA-DefDETR 21.5 41.1 20.0 3.9 20.9 43.0 Google Drive
SFA-DefDETR-BoxRefine 23.9 42.6 22.5 3.8 21.6 46.7 Google Drive
SFA-DefDETR-TwoStage 24.1 42.5 22.8 3.8 22.0 48.1 Google Drive

Note:

  1. All models of SFA are trained with total batch size of 4.
  2. "DefDETR" means Deformable DETR (with R50 backbone).
  3. "BoxRefine" means Deformable DETR with iterative box refinement.
  4. "TwoStage" indicates the two-stage Deformable DETR variant.
  5. The original implementation is based on our internal codebase. There are slight differences in the released code are slight differences. For example, we only use the middle features output by the first encoder and decoder layers for hierarchical feature alignment, to reduce computational costs during training.

Installation

Requirements

  • Linux, CUDA>=9.2, GCC>=5.4

  • Python>=3.7

    We recommend you to use Anaconda to create a conda environment:

    conda create -n sfa python=3.7 pip

    Then, activate the environment:

    conda activate sfa
  • PyTorch>=1.5.1, torchvision>=0.6.1 (following instructions here)

    For example, if your CUDA version is 9.2, you could install pytorch and torchvision as following:

    conda install pytorch=1.5.1 torchvision=0.6.1 cudatoolkit=9.2 -c pytorch
  • Other requirements

    pip install -r requirements/requirements.txt
  • Logging using wandb (optional)

    pip install -r requirements/optional.txt

Compiling CUDA operators

cd ./models/ops
sh ./make.sh
# unit test (should see all checking is True)
python test.py

Usage

Dataset preparation

We use the preparation of Cityscapes to Foggy Cityscapes adaptation as demonstration. Other domain adaptation benchmarks can be prepared in analog. Cityscapes and Foggy Cityscapes datasets can be downloaded from here. The annotations in COCO format can be obtained from here. Afterward, please organize the datasets and annotations as following:

[coco_path]
└─ cityscapes
   └─ leftImg8bit
      └─ train
      └─ val
└─ foggy_cityscapes
   └─ leftImg8bit_foggy
      └─ train
      └─ val
└─ CocoFormatAnnos
   └─ cityscapes_train_cocostyle.json
   └─ cityscapes_foggy_train_cocostyle.json
   └─ cityscapes_foggy_val_cocostyle.json

Training

As an example, we provide commands for training our SFA on a single node with 4 GPUs for weather adaptation.

Training SFA-DeformableDETR

GPUS_PER_NODE=4 ./tools/run_dist_launch.sh 4 ./configs_da/sfa_r50_deformable_detr.sh --wandb

Training SFA-DeformableDETR-BoxRefine

GPUS_PER_NODE=4 ./tools/run_dist_launch.sh 4 ./configs_da/sfa_r50_deformable_detr_plus_iterative_bbox_refinement.sh --wandb

Training SFA-DeformableDETR-TwoStage

GPUS_PER_NODE=4 ./tools/run_dist_launch.sh 4 ./configs_da/sfa_r50_deformable_detr_plus_iterative_bbox_refinement_plus_plus_two_stage.sh --wandb

Training Source-only DeformableDETR

Please refer to the source branch.

Evaluation

You can get the config file and pretrained model of SFA (the link is in "Main Results" session), then run following command to evaluate it on Foggy Cityscapes validation set:

<path to config file> --resume <path to pre-trained model> --eval

You can also run distributed evaluation by using ./tools/run_dist_launch.sh or ./tools/run_dist_slurm.sh.

Acknowledgement

This project is based on DETR and Deformable DETR. Thanks for their wonderful works. See LICENSE for more details.

Citing SFA

If you find SFA useful in your research, please consider citing:

@inproceedings{wang2021exploring ,
  title={Exploring Sequence Feature Alignment for Domain Adaptive Detection Transformers},
  author={Wen, Wang and Yang, Cao and Jing, Zhang and Fengxiang, He and Zheng-Jun, Zha and Yonggang, Wen and Dacheng, Tao},
  booktitle={Proceedings of the 29th ACM International Conference on Multimedia},
  year={2021}
}
Owner
WangWen
WangWen
League of Legends Reinforcement Learning Environment (LoLRLE) multiple training scenarios using PPO.

League of Legends Reinforcement Learning Environment (LoLRLE) About This repo contains code to train an agent to play league of legends in a distribut

2 Aug 19, 2022
Chinese Mandarin tts text-to-speech 中文 (普通话) 语音 合成 , by fastspeech 2 , implemented in pytorch, using waveglow as vocoder,

Chinese mandarin text to speech based on Fastspeech2 and Unet This is a modification and adpation of fastspeech2 to mandrin(普通话). Many modifications t

291 Jan 02, 2023
PyTorch Code for NeurIPS 2021 paper Anti-Backdoor Learning: Training Clean Models on Poisoned Data.

Anti-Backdoor Learning PyTorch Code for NeurIPS 2021 paper Anti-Backdoor Learning: Training Clean Models on Poisoned Data. The Anti-Backdoor Learning

Yige-Li 51 Dec 07, 2022
Repository to run object detection on a model trained on an autonomous driving dataset.

Autonomous Driving Object Detection on the Raspberry Pi 4 Description of Repository This repository contains code and instructions to configure the ne

Ethan 51 Nov 17, 2022
AFLNet: A Greybox Fuzzer for Network Protocols

AFLNet: A Greybox Fuzzer for Network Protocols AFLNet is a greybox fuzzer for protocol implementations. Unlike existing protocol fuzzers, it takes a m

626 Jan 06, 2023
H&M Fashion Image similarity search with Weaviate and DocArray

H&M Fashion Image similarity search with Weaviate and DocArray This example shows how to do image similarity search using DocArray and Weaviate as Doc

Laura Ham 18 Aug 11, 2022
Federated Deep Reinforcement Learning for the Distributed Control of NextG Wireless Networks.

FDRL-PC-Dyspan Federated Deep Reinforcement Learning for the Distributed Control of NextG Wireless Networks. This repository contains the entire code

Peyman Tehrani 17 Nov 18, 2022
Official repository for HOTR: End-to-End Human-Object Interaction Detection with Transformers (CVPR'21, Oral Presentation)

Official PyTorch Implementation for HOTR: End-to-End Human-Object Interaction Detection with Transformers (CVPR'2021, Oral Presentation) HOTR: End-to-

Kakao Brain 114 Nov 28, 2022
Official implementation of "Generating 3D Molecules for Target Protein Binding"

Generating 3D Molecules for Target Protein Binding This is the official implementation of the GraphBP method proposed in the following paper. Meng Liu

DIVE Lab, Texas A&M University 74 Dec 07, 2022
Pacman-AI - AI project designed by UC Berkeley. Designed reflex and minimax agents for the game Pacman.

Pacman AI Jussi Doherty CAP 4601 - Introduction to Artificial Intelligence - Fall 2020 Python version 3.0+ Source of this project This repo contains a

Jussi Doherty 1 Jan 03, 2022
Codebase for "ProtoAttend: Attention-Based Prototypical Learning."

Codebase for "ProtoAttend: Attention-Based Prototypical Learning." Authors: Sercan O. Arik and Tomas Pfister Paper: Sercan O. Arik and Tomas Pfister,

47 2 May 17, 2022
Forest R-CNN: Large-Vocabulary Long-Tailed Object Detection and Instance Segmentation (ACM MM 2020)

Forest R-CNN: Large-Vocabulary Long-Tailed Object Detection and Instance Segmentation (ACM MM 2020) Official implementation of: Forest R-CNN: Large-Vo

Jialian Wu 54 Jan 06, 2023
Official PyTorch implementation and pretrained models of the paper Self-Supervised Classification Network

Self-Classifier: Self-Supervised Classification Network Official PyTorch implementation and pretrained models of the paper Self-Supervised Classificat

Elad Amrani 24 Dec 21, 2022
Atif Hassan 103 Dec 14, 2022
Aydin is a user-friendly, feature-rich, and fast image denoising tool

Aydin is a user-friendly, feature-rich, and fast image denoising tool that provides a number of self-supervised, auto-tuned, and unsupervised image denoising algorithms.

Royer Lab 99 Dec 14, 2022
A generalist algorithm for cell and nucleus segmentation.

Cellpose | A generalist algorithm for cell and nucleus segmentation. Cellpose was written by Carsen Stringer and Marius Pachitariu. To learn about Cel

MouseLand 733 Dec 29, 2022
A PyTorch implementation of SlowFast based on ICCV 2019 paper "SlowFast Networks for Video Recognition"

SlowFast A PyTorch implementation of SlowFast based on ICCV 2019 paper SlowFast Networks for Video Recognition. Requirements Anaconda PyTorch conda in

Hao Ren 8 Dec 23, 2022
PyTorch implementation of the Pose Residual Network (PRN)

Pose Residual Network This repository contains a PyTorch implementation of the Pose Residual Network (PRN) presented in our ECCV 2018 paper: Muhammed

Salih Karagoz 289 Nov 28, 2022
Finding Donors for CharityML

Finding-Donors-for-CharityML - Investigated factors that affect the likelihood of charity donations being made based on real census data.

Moamen Abdelkawy 1 Dec 30, 2021
This is the pytorch re-implementation of the IterNorm

IterNorm-pytorch Pytorch reimplementation of the IterNorm methods, which is described in the following paper: Iterative Normalization: Beyond Standard

Lei Huang 32 Dec 27, 2022