R3Det: Refined Single-Stage Detector with Feature Refinement for Rotating Object
Installation
# install mmdetection first if you haven't installed it yet. (Refer to mmdetection for details.)
pip install mmdet==2.19.0
# install r3det (Compiling rotated ops is a little time-consuming.)
pip install -r requirements.txt
pip install -v -e .
- We recommend an opencv-python version later than 4.5.1, because its angle representation changed in 4.5.1. All experiments below were run with 4.5.3.
Quick Start
Please change the data paths in the configs to your own data path.
# train
CUDA_VISIBLE_DEVICES=0 PORT=29500 \
./tools/dist_train.sh configs/rretinanet/rretinanet_obb_r50_fpn_1x_dota_v3.py 1
# submission
CUDA_VISIBLE_DEVICES=0 PORT=29500 \
./tools/dist_test.sh configs/rretinanet/rretinanet_obb_r50_fpn_1x_dota_v3.py \
work_dirs/rretinanet_obb_r50_fpn_1x_dota_v3/epoch_12.pth 1 --format-only \
--eval-options submission_dir=work_dirs/rretinanet_obb_r50_fpn_1x_dota_v3/Task1_results
For the DOTA dataset, please crop the original images into 1024×1024 patches with an overlap of 200 by running:
python tools/split/img_split.py --base_json \
tools/split/split_configs/dota1_0/ss_trainval.json
python tools/split/img_split.py --base_json \
tools/split/split_configs/dota1_0/ss_test.json
Please change the paths in ss_trainval.json and ss_test.json to your own paths. (The split tool is forked from BboxToolkit, which is faster than DOTA_Devkit.)
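To make the patch arithmetic concrete: with a 1024-pixel window and an overlap of 200, consecutive windows start 824 pixels apart. The sketch below computes window start offsets along one image axis. The function name `window_starts` and the way the last window is shifted back to the image border are assumptions for illustration; this is not the logic of tools/split/img_split.py, which may treat borders differently.

```python
def window_starts(length, size=1024, overlap=200):
    """Start offsets of sliding windows along one image axis (illustrative)."""
    if length <= size:
        # The image is no larger than one window, so a single crop covers it.
        return [0]
    stride = size - overlap  # 1024 - 200 = 824
    starts = list(range(0, length - size, stride))
    # Assumption: the final window is shifted back so it ends at the border.
    starts.append(length - size)
    return starts


if __name__ == "__main__":
    for side in (1024, 2000, 4000):
        print(side, window_starts(side))
```

Under this assumption the last two windows can overlap by more than 200 pixels, which keeps every patch at the full 1024×1024 size.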
Angle Representations
Three angle representations are built in, and you can switch between them freely in the config.
- v1 (from R3Det): [-PI/2, 0)
- v2 (from S2ANet): [-PI/4, 3PI/4)
- v3 (from OBBDetection): [-PI/2, PI/2)
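As a rough illustration of what these ranges imply, the sketch below wraps an arbitrary angle into each of the three intervals. The names `ANGLE_RANGES` and `normalize_obb` are hypothetical and this is not the repository's poly2obb or obb2poly code. It only handles the angle interval, relying on the fact that a rectangle is unchanged by a rotation of PI, and by a rotation of PI/2 when width and height are swapped; any further width/height constraints a representation may impose are not modelled.

```python
import math

# (range start, period) for each angle version, taken from the intervals above.
ANGLE_RANGES = {
    "v1": (-math.pi / 2, math.pi / 2),  # [-PI/2, 0)
    "v2": (-math.pi / 4, math.pi),      # [-PI/4, 3PI/4)
    "v3": (-math.pi / 2, math.pi),      # [-PI/2, PI/2)
}


def normalize_obb(w, h, angle, version="v3"):
    """Wrap `angle` (radians) into the interval of the given version.

    Returns (w, h, angle). For v1 the period is PI/2, so an odd number
    of shifts swaps width and height to keep the same rectangle.
    """
    start, period = ANGLE_RANGES[version]
    shifts = math.floor((angle - start) / period)
    angle = angle - shifts * period
    if version == "v1" and shifts % 2 != 0:
        w, h = h, w
    return w, h, angle


if __name__ == "__main__":
    for version in ("v1", "v2", "v3"):
        print(version, normalize_obb(4.0, 2.0, 0.3, version))
```

For the same box, v1 may report swapped width and height while v2 and v3 keep them, which is one reason the box coders listed next differ between versions.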
The differences among the three angle representations are reflected in poly2obb, obb2poly, obb2xyxy, obb2hbb, hbb2obb, etc. According to the three papers above, their box coders also differ:
- DeltaXYWHAOBBoxCoder
  - v1: None
  - v2: Constrained angle + Projection of dx and dy + Normalized with PI
  - v3: Constrained angle and length&width + Projection of dx and dy
- DeltaXYWHAHBBoxCoder
  - v1: None
  - v2: Constrained angle + Normalized with PI
  - v3: Constrained angle and length&width + Normalized with 2PI
We believe that the different coders are the key reason for the different baselines reported in different papers. The good news is that all of the above coders can be freely switched in R3Det. In addition, R3Det provides 4 NMS ops and 3 IoU calculators for rotation detection, as follows:
- nms.type
  - v1: v1
  - v2: v2
  - v3: v3
  - mmcv: mmcv
- iou_calculator
  - v1: RBboxOverlaps2D_v1
  - v2: RBboxOverlaps2D_v2
  - v3: RBboxOverlaps2D_v3
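The fragment below sketches where such choices might sit in an mmdetection-style config. Only the type names (DeltaXYWHAOBBoxCoder, RBboxOverlaps2D_v1, and the nms type v1) come from the lists above. The surrounding keys (`bbox_head`, `train_cfg`, `assigner`, `test_cfg`) follow general mmdetection config conventions and are assumptions here, so please check the files under configs/ for the actual layout and any additional required arguments.

```python
# Illustrative fragment only, not copied from this repository's configs.
# The nesting below is an assumption based on common mmdetection layouts.
model = dict(
    bbox_head=dict(
        # Coder name taken from the list above; other arguments omitted.
        bbox_coder=dict(type='DeltaXYWHAOBBoxCoder'),
    ),
    train_cfg=dict(
        assigner=dict(
            # One of the three rotated IoU calculators listed above.
            iou_calculator=dict(type='RBboxOverlaps2D_v1'),
        ),
    ),
    test_cfg=dict(
        # nms.type accepts 'v1', 'v2', 'v3' or 'mmcv' per the list above.
        nms=dict(type='v1'),
    ),
)
```

Keeping the coder, IoU calculator and NMS on the same version is a natural starting point when reproducing a given paper's baseline, although the source does not state that mixing versions is disallowed.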
Performance
Model | Backbone | Lr schd | MS | RR | Angle | box AP | Official | Download |
---|---|---|---|---|---|---|---|---|
RRetinaNet HBB | R50-FPN | 1x | - | - | v1 | 65.19 | 65.73 | Baidu:0518/Google |
RRetinaNet OBB | R50-FPN | 1x | - | - | v3 | 68.20 | 69.40 | Baidu:0518/Google |
RRetinaNet OBB | R50-FPN | 1x | - | - | v2 | 68.64 | 68.40 | Baidu:0518/Google |
R3Det | R50-FPN | 1x | - | - | v1 | 70.41 | 70.66 | Baidu:0518/Google |
R3Det* | R50-FPN | 1x | - | - | v1 | 70.86 | - | Baidu:0518/Google |
- MS means multi-scale image split.
- RR means random rotation.
Citation
@inproceedings{yang2021r3det,
title={R3Det: Refined Single-Stage Detector with Feature Refinement for Rotating Object},
author={Yang, Xue and Yan, Junchi and Feng, Ziming and He, Tao},
booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
volume={35},
number={4},
pages={3163--3171},
year={2021}
}