FCOSR: A Simple Anchor-free Rotated Detector for Aerial Object Detection

Overview

arXiv preprint (arXiv:2111.10780).

This implementation is modified from mmdetection. We also referred to the code of ReDet, PIoU, and ProbIoU.

During implementation, we found that doing label assignment in pure Python code produces a huge memory overhead on NVIDIA devices. We therefore wrote the label assignment module proposed in the paper directly as a CUDA extension for PyTorch. The extension does not work properly when migrated to CUDA 11; only CUDA 10 is supported. Using the CUDA extension improves memory utilization and removes many unnecessary calculations. We also trained FCOSR-M on an RTX 2080 Ti (4 images per device), which nearly fills the memory of the graphics card.
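Because the extension is reported to work only with CUDA 10, a small guard before building or importing it can save a confusing failure later. The following is a minimal sketch, not part of this repository: the helper names `cuda_major` and `label_assign_extension_supported` are hypothetical, and the check only looks at the major version number.

```python
def cuda_major(version):
    """Return the major CUDA version from a string such as '10.2', or None."""
    if not version:
        return None
    return int(version.split(".")[0])


def label_assign_extension_supported(version):
    """True only for CUDA 10.x, the only version reported as supported."""
    return cuda_major(version) == 10
```

With PyTorch installed, the version string can be taken from `torch.version.cuda` (which is `None` for CPU-only builds), for example `label_assign_extension_supported(torch.version.cuda)`.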

FCOSR TensorRT inference code is available at: https://github.com/lzh420202/TensorRT_Inference

We added a multiprocess version of DOTA2COCO to the DOTA_devkit package. Set USE_MULTI_PROCESS in prepare_dota.py to switch it on or off.
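To illustrate what such a switch typically controls, here is a minimal sketch of a DOTA-to-COCO style conversion with an optional process pool. It is not the code in DOTA_devkit: the functions `parse_dota_line`, `convert_image`, and `convert_dataset` are hypothetical, and the sketch assumes the usual DOTA label line layout of eight polygon coordinates followed by a category name and a difficulty flag.

```python
from multiprocessing import Pool

# Hypothetical default, named after the USE_MULTI_PROCESS switch mentioned above.
USE_MULTI_PROCESS = False


def parse_dota_line(line):
    """Parse one DOTA label line: 8 polygon coordinates, category, difficulty."""
    parts = line.split()
    poly = [float(v) for v in parts[:8]]
    xs, ys = poly[0::2], poly[1::2]
    x_min, y_min = min(xs), min(ys)
    return {
        "category": parts[8],
        "difficult": int(parts[9]),
        "segmentation": [poly],
        # COCO boxes are [x, y, width, height] of the enclosing rectangle.
        "bbox": [x_min, y_min, max(xs) - x_min, max(ys) - y_min],
    }


def convert_image(label_lines):
    """Convert the label lines of one image, skipping header or blank lines."""
    return [parse_dota_line(l) for l in label_lines if len(l.split()) >= 10]


def convert_dataset(all_label_lines, use_multi_process=USE_MULTI_PROCESS, workers=4):
    """Convert every image, optionally spreading the work over processes."""
    if use_multi_process:
        with Pool(workers) as pool:
            return pool.map(convert_image, all_label_lines)
    return [convert_image(lines) for lines in all_label_lines]
```

When the pool is enabled, call `convert_dataset` from under an `if __name__ == "__main__":` guard so that worker processes can import the module safely.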

Install

Please refer to install.md for installation and dataset preparation.

Getting Started

Please see get_started.md for the basic usage.

Model Zoo

Speed vs Accuracy on DOTA 1.0 test set

[Figure: benchmark]

Details (test device: NVIDIA RTX 2080 Ti)

| Methods | backbone | FPS | mAP(%) |
|---|---|---|---|
| ReDet | ReR50 | 8.8 | 76.25 |
| S2ANet | Mobilenet v2 | 18.9 | 67.46 |
| S2ANet | R50 | 14.4 | 74.14 |
| R3Det | R50 | 9.2 | 71.9 |
| Oriented-RCNN | Mobilenet v2 | 21.2 | 72.72 |
| Oriented-RCNN | R50 | 13.8 | 75.87 |
| Oriented-RCNN | R101 | 11.3 | 76.28 |
| RetinaNet-O | Mobilenet v2 | 22.4 | 67.95 |
| RetinaNet-O | R50 | 16.5 | 72.7 |
| RetinaNet-O | R101 | 13.3 | 73.7 |
| Faster-RCNN-O | Mobilenet v2 | 23 | 67.41 |
| Faster-RCNN-O | R50 | 14.4 | 72.29 |
| Faster-RCNN-O | R101 | 11.4 | 72.65 |
| FCOSR-S | Mobilenet v2 | 23.7 | 74.05 |
| FCOSR-M | Rx50 | 14.6 | 77.15 |
| FCOSR-L | Rx101 | 7.9 | 77.39 |

The BaiduPan password is ABCD.

FCOSR series results on DOTA 1.0 (FPS measured on an RTX 2080 Ti)

| Model | backbone | MS | Sched. | Param. | Input | GFLOPs | FPS | mAP | download |
|---|---|---|---|---|---|---|---|---|---|
| FCOSR-S | Mobilenet v2 | - | 3x | 7.32M | 1024×1024 | 101.42 | 23.7 | 74.05 | model/cfg |
| FCOSR-S | Mobilenet v2 | ✓ | 3x | 7.32M | 1024×1024 | 101.42 | 23.7 | 76.11 | model/cfg |
| FCOSR-M | ResNext50-32x4 | - | 3x | 31.4M | 1024×1024 | 210.01 | 14.6 | 77.15 | model/cfg |
| FCOSR-M | ResNext50-32x4 | ✓ | 3x | 31.4M | 1024×1024 | 210.01 | 14.6 | 79.25 | model/cfg |
| FCOSR-L | ResNext101-64x4 | - | 3x | 89.64M | 1024×1024 | 445.75 | 7.9 | 77.39 | model/cfg |
| FCOSR-L | ResNext101-64x4 | ✓ | 3x | 89.64M | 1024×1024 | 445.75 | 7.9 | 78.80 | model/cfg |

FCOSR series results on DOTA 1.5 (FPS measured on an RTX 2080 Ti)

| Model | backbone | MS | Sched. | Param. | Input | GFLOPs | FPS | mAP | download |
|---|---|---|---|---|---|---|---|---|---|
| FCOSR-S | Mobilenet v2 | - | 3x | 7.32M | 1024×1024 | 101.42 | 23.7 | 66.37 | model/cfg |
| FCOSR-S | Mobilenet v2 | ✓ | 3x | 7.32M | 1024×1024 | 101.42 | 23.7 | 73.14 | model/cfg |
| FCOSR-M | ResNext50-32x4 | - | 3x | 31.4M | 1024×1024 | 210.01 | 14.6 | 68.74 | model/cfg |
| FCOSR-M | ResNext50-32x4 | ✓ | 3x | 31.4M | 1024×1024 | 210.01 | 14.6 | 73.79 | model/cfg |
| FCOSR-L | ResNext101-64x4 | - | 3x | 89.64M | 1024×1024 | 445.75 | 7.9 | 69.96 | model/cfg |
| FCOSR-L | ResNext101-64x4 | ✓ | 3x | 89.64M | 1024×1024 | 445.75 | 7.9 | 75.41 | model/cfg |

FCOSR series results on HRSC2016 (FPS measured on an RTX 2080 Ti)

| Model | backbone | Rot. | Sched. | Param. | Input | GFLOPs | FPS | AP50(07) | AP75(07) | AP50(12) | AP75(12) | download |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| FCOSR-S | Mobilenet v2 | ✓ | 40k iters | 7.29M | 800×800 | 61.57 | 35.3 | 90.08 | 76.75 | 92.67 | 75.73 | model/cfg |
| FCOSR-M | ResNext50-32x4 | ✓ | 40k iters | 31.37M | 800×800 | 127.87 | 26.9 | 90.15 | 78.58 | 94.84 | 81.38 | model/cfg |
| FCOSR-L | ResNext101-64x4 | ✓ | 40k iters | 89.61M | 800×800 | 271.75 | 15.1 | 90.14 | 77.98 | 95.74 | 80.94 | model/cfg |

Lightweight FCOSR test results on Jetson Xavier NX (DOTA 1.0, single-scale)

| Model | backbone | Head channels | Sched. | Param. | Size | Input | GFLOPs | FPS | mAP | onnx | TensorRT |
|---|---|---|---|---|---|---|---|---|---|---|---|
| FCOSR-lite | Mobilenet v2 | 256 | 3x | 6.9M | 51.63MB | 1024×1024 | 101.25 | 7.64 | 74.30 | onnx | trt |
| FCOSR-tiny | Mobilenet v2 | 128 | 3x | 3.52M | 23.2MB | 1024×1024 | 35.89 | 10.68 | 73.93 | onnx | trt |

Lightweight FCOSR test results on Jetson AGX Xavier (DOTA 1.0, single-scale)

Results on a part of the DOTA 1.0 dataset (whole-image mode)

| name | size | patch size | gap | patches | det objects | det time(s) |
|---|---|---|---|---|---|---|
| P0031.png | 5343×3795 | 1024 | 200 | 35 | 1197 | 2.75 |
| P0051.png | 4672×5430 | 1024 | 200 | 42 | 309 | 2.38 |
| P0112.png | 6989×4516 | 1024 | 200 | 54 | 184 | 3.02 |
| P0137.png | 5276×4308 | 1024 | 200 | 35 | 66 | 1.95 |
| P1004.png | 7001×3907 | 1024 | 200 | 45 | 183 | 2.52 |
| P1125.png | 7582×4333 | 1024 | 200 | 54 | 28 | 2.95 |
| P1129.png | 4093×6529 | 1024 | 200 | 40 | 70 | 2.23 |
| P1146.png | 5231×4616 | 1024 | 200 | 42 | 64 | 2.29 |
| P1157.png | 7278×5286 | 1024 | 200 | 63 | 184 | 3.47 |
| P1378.png | 5445×4561 | 1024 | 200 | 42 | 83 | 2.32 |
| P1379.png | 4426×4182 | 1024 | 200 | 30 | 686 | 1.78 |
| P1393.png | 6072×6540 | 1024 | 200 | 64 | 893 | 3.63 |
| P1400.png | 6471×4479 | 1024 | 200 | 48 | 348 | 2.63 |
| P1402.png | 4112×4793 | 1024 | 200 | 30 | 293 | 1.68 |
| P1406.png | 6531×4182 | 1024 | 200 | 40 | 19 | 2.19 |
| P1415.png | 4894×4898 | 1024 | 200 | 36 | 190 | 1.99 |
| P1436.png | 5136×5156 | 1024 | 200 | 42 | 39 | 2.31 |
| P1448.png | 7242×5678 | 1024 | 200 | 63 | 51 | 3.41 |
| P1457.png | 5193×4658 | 1024 | 200 | 42 | 382 | 2.33 |
| P1461.png | 6661×6308 | 1024 | 200 | 64 | 27 | 3.45 |
| P1494.png | 4782×6677 | 1024 | 200 | 48 | 70 | 2.61 |
| P1500.png | 4769×4386 | 1024 | 200 | 36 | 92 | 1.96 |
| P1772.png | 5963×5553 | 1024 | 200 | 49 | 28 | 2.70 |
| P1774.png | 5352×4281 | 1024 | 200 | 35 | 291 | 1.95 |
| P1796.png | 5870×5822 | 1024 | 200 | 49 | 308 | 2.74 |
| P1870.png | 5942×6059 | 1024 | 200 | 56 | 135 | 3.04 |
| P2043.png | 4165×3438 | 1024 | 200 | 20 | 1479 | 1.49 |
| P2329.png | 7950×4334 | 1024 | 200 | 60 | 83 | 3.26 |
| P2641.png | 7574×5625 | 1024 | 200 | 63 | 269 | 3.41 |
| P2642.png | 7039×5551 | 1024 | 200 | 63 | 451 | 3.50 |
| P2643.png | 7568×5619 | 1024 | 200 | 63 | 249 | 3.40 |
| P2645.png | 4605×3442 | 1024 | 200 | 24 | 357 | 1.42 |
| P2762.png | 8074×4359 | 1024 | 200 | 60 | 127 | 3.23 |
| P2795.png | 4495×3981 | 1024 | 200 | 30 | 65 | 1.64 |
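
The patch counts in the table above are consistent with a sliding window whose stride is the patch size minus the gap (1024 - 200 = 824 pixels), with one extra window to cover the remainder along each axis. The sketch below shows that arithmetic. It assumes the gap is the overlap between neighbouring patches, and the function name `count_patches` is hypothetical; it is not taken from the inference code.

```python
import math


def count_patches(width, height, patch_size=1024, gap=200):
    """Count sliding-window patches, treating `gap` as the overlap in pixels."""
    stride = patch_size - gap

    def along(length):
        # One patch covers the start; each extra stride (or part of one)
        # beyond the first patch needs another window.
        if length <= patch_size:
            return 1
        return math.ceil((length - patch_size) / stride) + 1

    return along(width) * along(height)


print(count_patches(5343, 3795))  # 35, matching P0031.png
print(count_patches(4165, 3438))  # 20, matching P2043.png
```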