FCOSR: A Simple Anchor-free Rotated Detector for Aerial Object Detection

Last update: Nov 29, 2022

arXiv preprint (arXiv:2111.10780).

This implementation is modified from mmdetection. We also referred to the code of ReDet, PIoU, and ProbIoU.

During implementation, we found that doing the processing in Python code alone produces a huge memory overhead on Nvidia devices. We therefore wrote the label assignment module proposed in the paper directly as a CUDA extension for PyTorch. The program did not work properly when we migrated it to CUDA 11, so only CUDA 10 is supported. Using the CUDA extension improves memory utilization and removes many unnecessary calculations. We also tried training FCOSR-M on a 2080 Ti (4 images per device), which nearly fills the memory of the graphics card.
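Since the CUDA extension only supports CUDA 10, it can help to check the CUDA version your PyTorch build targets before compiling it. The following is a minimal sketch, not part of this repository: the helper name `is_supported_cuda` and the constant `SUPPORTED_CUDA_MAJOR` are hypothetical, and the check assumes the version is reported as a string such as "10.2" (as `torch.version.cuda` does, with `None` on CPU-only builds).

```python
from typing import Optional

# Assumption taken from the note above: only CUDA 10.x is supported.
SUPPORTED_CUDA_MAJOR = 10


def is_supported_cuda(version: Optional[str]) -> bool:
    """Return True if a CUDA version string such as "10.2" has major version 10."""
    if not version:
        # CPU-only PyTorch build or unknown version
        return False
    try:
        major = int(version.split(".")[0])
    except ValueError:
        # Not a parseable version string
        return False
    return major == SUPPORTED_CUDA_MAJOR


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed")
    else:
        cuda = torch.version.cuda  # None on CPU-only builds
        print(f"PyTorch built with CUDA {cuda}: supported={is_supported_cuda(cuda)}")
```

Running the script prints the detected CUDA version and whether it falls in the supported range. It does not verify that the extension itself builds.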

Install

Please refer to install.md for installation and dataset preparation.

Getting Started

Please see get_started.md for the basic usage.

Model Zoo

The password for the Baidu Pan links is ABCD.

FCOSR series results on DOTA 1.0. FPS is measured on a 2080 Ti. Detail

Model	backbone	MS	Sched.	Param.	Input	GFLOPs	FPS	mAP	download
FCOSR-S	Mobilenet v2	-	3x	7.32M	1024×1024	101.42	23.7	74.05	model/cfg
FCOSR-S	Mobilenet v2	✓	3x	7.32M	1024×1024	101.42	23.7	76.11	model/cfg
FCOSR-M	ResNext50-32x4	-	3x	31.4M	1024×1024	210.01	14.6	77.15	model/cfg
FCOSR-M	ResNext50-32x4	✓	3x	31.4M	1024×1024	210.01	14.6	79.25	model/cfg
FCOSR-L	ResNext101-64x4	-	3x	89.64M	1024×1024	445.75	7.9	77.39	model/cfg
FCOSR-L	ResNext101-64x4	✓	3x	89.64M	1024×1024	445.75	7.9	78.80	model/cfg

FCOSR series results on DOTA 1.5. FPS is measured on a 2080 Ti. Detail

Model	backbone	MS	Sched.	Param.	Input	GFLOPs	FPS	mAP	download
FCOSR-S	Mobilenet v2	-	3x	7.32M	1024×1024	101.42	23.7	66.37	model/cfg
FCOSR-S	Mobilenet v2	✓	3x	7.32M	1024×1024	101.42	23.7	73.14	model/cfg
FCOSR-M	ResNext50-32x4	-	3x	31.4M	1024×1024	210.01	14.6	68.74	model/cfg
FCOSR-M	ResNext50-32x4	✓	3x	31.4M	1024×1024	210.01	14.6	73.79	model/cfg
FCOSR-L	ResNext101-64x4	-	3x	89.64M	1024×1024	445.75	7.9	69.96	model/cfg
FCOSR-L	ResNext101-64x4	✓	3x	89.64M	1024×1024	445.75	7.9	75.41	model/cfg

FCOSR series results on HRSC2016. FPS is measured on a 2080 Ti.

Model	backbone	Rot.	Sched.	Param.	Input	GFLOPs	FPS	AP50(07)	AP75(07)	AP50(12)	AP75(12)	download
FCOSR-S	Mobilenet v2	✓	40k iters	7.29M	800×800	61.57	35.3	90.08	76.75	92.67	75.73	model/cfg
FCOSR-M	ResNext50-32x4	✓	40k iters	31.37M	800×800	127.87	26.9	90.15	78.58	94.84	81.38	model/cfg
FCOSR-L	ResNext101-64x4	✓	40k iters	89.61M	800×800	271.75	15.1	90.14	77.98	95.74	80.94	model/cfg

Lightweight FCOSR test results on Jetson Xavier NX (DOTA 1.0 single-scale). Detail

Model	backbone	Head channels	Sched.	Param	Size	Input	GFLOPs	FPS	mAP	onnx	TensorRT
FCOSR-lite	Mobilenet v2	256	3x	6.9M	51.63MB	1024×1024	101.25	7.64	74.30	Wait	rtr
FCOSR-tiny	Mobilenet v2	128	3x	3.52M	23.2MB	1024×1024	35.89	10.68	73.93	Wait	rtr
