Oriented Object Detection: Oriented RepPoints + Swin Transformer/ReResNet

Last update: Dec 13, 2022

Overview

Oriented RepPoints for Aerial Object Detection

This repository contains the implementation of “Oriented RepPoints + Swin Transformer/ReResNet”.

Introduction

Using the Oriented RepPoints detector with a Swin Transformer backbone, this method took 3rd place in Task 1 and 2nd place in Task 2 of the 2021 Learning to Understand Aerial Images (LUAI) challenge, held at ICCV 2021. Details are given in the paper "LUAI Challenge 2021 on Learning to Understand Aerial Images" (ICCVW 2021).

New Features

BackBone: add Swin-Transformer, ReResNet
DataAug: add Mosaic4or9, Mixup, HSV, RandomPerspective, RandomScaleCrop
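As a rough illustration of one of the augmentations listed above, here is a minimal Mixup sketch for detection data. It is not the code used in this repo: the function name `mixup`, the `alpha` default, and the (N, 8) polygon layout (four corner points per object, as in DOTA annotations) are assumptions made for the example.

```python
import numpy as np


def mixup(img_a, polys_a, img_b, polys_b, alpha=8.0, rng=None):
    """Blend two same-sized images and keep the annotations of both.

    img_a, img_b: uint8 arrays of identical shape (H, W, C).
    polys_a, polys_b: float arrays of shape (N, 8), one quadrilateral
        per row as x1, y1, ..., x4, y4 (assumed layout).
    alpha: Beta(alpha, alpha) parameter for the blend ratio (assumed default).
    """
    rng = np.random.default_rng() if rng is None else rng
    if img_a.shape != img_b.shape:
        raise ValueError("mixup expects images of identical shape")

    # Blend ratio in (0, 1); large alpha keeps it close to 0.5.
    lam = float(rng.beta(alpha, alpha))

    mixed = lam * img_a.astype(np.float32) + (1.0 - lam) * img_b.astype(np.float32)
    mixed = np.clip(np.rint(mixed), 0, 255).astype(np.uint8)

    # Both images occupy the same canvas, so the boxes are simply concatenated.
    polys = np.concatenate([polys_a, polys_b], axis=0)
    return mixed, polys, lam
```

The two images must already be resized or padded to a common shape before blending; class labels, if present, would be concatenated in the same order as the polygons.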

Installation

Please refer to for installation and dataset preparation.

Getting Started

This repo is based on . Please see for the basic usage.

Results and Models

The results on the DOTA test-dev set are shown in the table below (passwords: aabb/swin/ABCD). For more detailed results, please see the paper.

Model	Backbone	MS	DataAug	DOTAv1 mAP	DOTAv2 mAP	Download
OrientedReppoints	R-50	-	-	75.68	-	baidu(aabb)
OrientedReppoints	R-101	-	√	76.21	-	baidu(aabb)
OrientedReppoints	R-101	√	√	78.12	-	baidu(aabb)
OrientedReppoints	SwinT-tiny	-	√	-	-	-

ImageNet-1K and ImageNet-22K Pretrained Models

name	pretrain	resolution	acc@1	acc@5	#params	FLOPs	FPS	22K model	1K model	Need to turn read version
Swin-T	ImageNet-1K	224x224	81.2	95.5	28M	4.5G	755	-	github/baidu(swin)/config	✔
Swin-S	ImageNet-1K	224x224	83.2	96.2	50M	8.7G	437	-	github/baidu(swin)/config	✔
Swin-B	ImageNet-1K	224x224	83.5	96.5	88M	15.4G	278	-	github/baidu(swin)/config	✔
Swin-B	ImageNet-1K	384x384	84.5	97.0	88M	47.1G	85	-	github/baidu(swin)/test-config	✔
Swin-B	ImageNet-22K	224x224	85.2	97.5	88M	15.4G	278	github/baidu(swin)	github/baidu(swin)/test-config	✔
Swin-B	ImageNet-22K	384x384	86.4	98.0	88M	47.1G	85	github/baidu(swin)	github/baidu(swin)/test-config	✔
Swin-L	ImageNet-22K	224x224	86.3	97.9	197M	34.5G	141	github/baidu(swin)	github/baidu(swin)/test-config	✔
Swin-L	ImageNet-22K	384x384	87.3	98.2	197M	103.9G	42	github/baidu(swin)	github/baidu(swin)/test-config	✔
ReResNet50	ImageNet-1K	224x224	71.20	90.28	-	-	-	-	google/baidu(ABCD)/log	-

The mAOE results on the DOTAv1 val set are shown in the table below (password: aabb).

Model	Backbone	mAOE	Download
OrientedReppoints	R-50	5.93°	baidu(aabb)

Note:

Without ground truth for the test subset, the mAOE orientation metric is computed on the val subset (the original train subset is used for training).
The orientation (angle) of an aerial object is defined as shown below; for the details of mAOE, please see the paper. The mAOE code is in mAOE_evaluation.py.
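To give a feel for how an orientation-error metric of this kind can be computed, here is a minimal sketch. It is not the contents of mAOE_evaluation.py and it does not define the angle convention. It assumes that predictions have already been matched to ground-truth objects, that angles are in degrees, and that the error is averaged per class and then across classes. The function names and the 180° period are assumptions for illustration only.

```python
import numpy as np


def angle_error_deg(pred_deg, gt_deg, period=180.0):
    """Smallest absolute angular difference, treating angles as periodic.

    period=180 assumes a box looks the same after a half turn
    (an assumption, not taken from the paper).
    """
    diff = np.abs(np.asarray(pred_deg, dtype=float) - np.asarray(gt_deg, dtype=float)) % period
    return np.minimum(diff, period - diff)


def mean_orientation_error(errors_by_class):
    """Average the per-class mean errors; classes without matches are skipped."""
    per_class = [float(np.mean(e)) for e in errors_by_class.values() if len(e) > 0]
    return float(np.mean(per_class)) if per_class else float("nan")


# Toy usage with made-up angles for already-matched prediction/ground-truth pairs.
errors = {
    "plane": angle_error_deg([179.0, 14.0], [1.0, 10.0]),
    "ship": angle_error_deg([30.0], [36.0]),
}
score = mean_orientation_error(errors)
```

The periodic wrap matters near the ends of the angle range: a prediction of 179° against a ground truth of 1° counts as an error of 2°, not 178°.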

Visual results

The visual results of learning points and the oriented bounding boxes. The visualization code is .

Learning points

Oriented bounding box

Citation

@article{Li2021oriented,
  title={Oriented RepPoints for Aerial Object Detection},
  author={Wentong Li and Jianke Zhu},
  journal={arXiv preprint arXiv:2105.11111},
  year={2021}
}

Acknowledgements

I have used utility functions from other wonderful open-source projects. Special thanks to the authors of:

OrientedRepPoints

Swin-Transformer-Object-Detection

ReDet
