This is the implementation of GGHL (A General Gaussian Heatmap Labeling for Arbitrary-Oriented Object Detection)

Deep Learning, pytorch

GGHL: A General Gaussian Heatmap Labeling for Arbitrary-Oriented Object Detection

👹 Abstract of the paper

Recently, many arbitrary-oriented object detection (AOOD) methods have been proposed and have attracted widespread attention in many fields. However, most of them are based on anchor boxes or standard Gaussian heatmaps. Such label assignment strategies may not only fail to reflect the shape and direction characteristics of arbitrary-oriented objects, but also require considerable parameter-tuning effort. In this paper, a novel AOOD method called General Gaussian Heatmap Labeling (GGHL) is proposed. Specifically, an anchor-free object-adaptation label assignment (OLA) strategy is presented to define the positive candidates based on two-dimensional (2-D) oriented Gaussian heatmaps, which reflect the shape and direction features of arbitrary-oriented objects. Based on OLA, an oriented-bounding-box (OBB) representation component (ORC) is developed to indicate OBBs and adjust the Gaussian center prior weights to fit the characteristics of different objects adaptively through neural network learning. Moreover, a joint-optimization loss (JOL) with area normalization and dynamic confidence weighting is designed to refine the misaligned optimal results of different subtasks. Extensive experiments on public datasets demonstrate that the proposed GGHL improves AOOD performance with low parameter-tuning and time costs. Furthermore, it is generally applicable to most AOOD methods and improves their performance, including lightweight models on embedded platforms.
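
The idea of an oriented Gaussian label can be illustrated with a short sketch. The code below is not the paper's exact formulation: the function names, the `sigma_scale` factor that ties the Gaussian spread to the box size, and the angle convention are all assumptions made for illustration. It only shows how a 2-D Gaussian can follow the shape and direction of an oriented box.

```python
import math


def oriented_gaussian(x, y, cx, cy, w, h, angle_deg, sigma_scale=0.25):
    """Value of a rotated 2-D Gaussian centred on an oriented box.

    (cx, cy) is the box centre, w and h are its side lengths, and
    angle_deg rotates the box's w-axis away from the image x-axis.
    sigma_scale is a hypothetical factor, not a value from the paper.
    """
    theta = math.radians(angle_deg)
    dx, dy = x - cx, y - cy
    # Express the offset in the box's own coordinate frame.
    u = dx * math.cos(theta) + dy * math.sin(theta)
    v = -dx * math.sin(theta) + dy * math.cos(theta)
    sigma_u, sigma_v = sigma_scale * w, sigma_scale * h
    return math.exp(-0.5 * ((u / sigma_u) ** 2 + (v / sigma_v) ** 2))


def gaussian_heatmap(height, width, cx, cy, w, h, angle_deg):
    """Dense heatmap as a list of rows, indexed as heatmap[y][x]."""
    return [
        [oriented_gaussian(x, y, cx, cy, w, h, angle_deg) for x in range(width)]
        for y in range(height)
    ]
```

For an elongated box, the heatmap decays slowly along the long side and quickly along the short side, so locations near the object's main axis receive higher weights than locations at the same distance across it.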

  12.15 The trained models for the DOTAv1.5 and DOTAv2.0 datasets are available from Google Drive or Baidu Disk (password: yn04).

  11.19 Note that label conversion expects the vertices in the order defined in the paper (see the paper for details).

Some content is still incomplete and is being updated continuously. Thank you for your feedback and suggestions.

  11.16 A script for generating datasets in the format required by GGHL has been added to ./datasets_tools/


  11.15 The models for the SKU dataset are available


  11.10 Add DCNv2 for automatic mixed precision (AMP) training.

🌈 1.Environments

Linux (Ubuntu 18.04, GCC>=5.4) & Windows (Win10)
CUDA >= 11.1, cuDNN > 8.0.4

First, install CUDA, cuDNN, and PyTorch. Second, install the dependencies listed in requirements.txt.

conda install pytorch==1.8.1 torchvision==0.9.1 torchaudio==0.8.1 cudatoolkit=11.1 -c pytorch -c conda-forge   
pip install -r requirements.txt  
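
After installation, a quick check that the pinned versions are actually present can save debugging time later. The sketch below is a hypothetical helper, not part of this repository. It assumes Python 3.8+ and that the conda `pytorch` package is registered as the `torch` distribution in the package metadata.

```python
from importlib import metadata

# Versions pinned by the conda command above. The distribution names are an
# assumption: conda's "pytorch" package is exposed to Python as "torch".
PINNED = {"torch": "1.8.1", "torchvision": "0.9.1", "torchaudio": "0.8.1"}


def check_versions(pinned=PINNED):
    """Return {name: (expected, installed_or_None, matches)} for each package."""
    report = {}
    for name, expected in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Local build tags such as "+cu111" are ignored in the comparison.
        matches = installed is not None and installed.split("+")[0] == expected
        report[name] = (expected, installed, matches)
    return report


if __name__ == "__main__":
    for name, (expected, installed, matches) in check_versions().items():
        status = "ok" if matches else "MISMATCH"
        print(f"{name}: expected {expected}, found {installed} [{status}]")
```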

🌟 2.Installation

  1. git clone this repository
  2. Install the libraries in the ./lib folder
    (1) DCNv2
cd ./GGHL/lib/DCNv2/  
    (2) Polygon NMS
    The poly_nms in this version is implemented with the shapely and numpy libraries so that it works across systems and environments without additional dependencies. The trade-off is slower detection in scenes with dense objects. For faster inference, compile and use the poly_iou library (a C++ implementation) in datasets_tools/DOTA_devkit. The compilation steps are described in detail in DOTA_devkit.
cd datasets_tools/DOTA_devkit
sudo apt-get install swig
swig -c++ -python polyiou.i
python setup.py build_ext --inplace
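
To make the polygon NMS step concrete, here is a minimal, dependency-free sketch. It is not the repository's poly_nms, which uses shapely and numpy. This version swaps in Sutherland-Hodgman clipping to compute the intersection, which is valid only for convex polygons such as oriented boxes. All function names and the default threshold are hypothetical.

```python
def polygon_area(pts):
    """Signed shoelace area; positive for counter-clockwise vertex order."""
    total = 0.0
    for i in range(len(pts)):
        x1, y1 = pts[i]
        x2, y2 = pts[(i + 1) % len(pts)]
        total += x1 * y2 - x2 * y1
    return total / 2.0


def to_ccw(pts):
    """Return the vertices in counter-clockwise order."""
    pts = [tuple(p) for p in pts]
    return pts if polygon_area(pts) >= 0 else pts[::-1]


def clip_convex(subject, clipper):
    """Sutherland-Hodgman: clip `subject` by a convex, CCW `clipper`."""
    output = list(subject)
    for i in range(len(clipper)):
        if not output:
            break
        a, b = clipper[i], clipper[(i + 1) % len(clipper)]

        def side(p):
            # >= 0 when p is on or to the left of the directed edge a -> b.
            return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

        current, output = output, []
        for j in range(len(current)):
            p, q = current[j], current[(j + 1) % len(current)]
            sp, sq = side(p), side(q)
            if sp >= 0:
                output.append(p)
            if sp * sq < 0:  # the segment p -> q crosses the clipping edge
                t = sp / (sp - sq)
                output.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
    return output


def poly_iou(poly_a, poly_b):
    """IoU of two convex polygons given as lists of (x, y) vertices."""
    a, b = to_ccw(poly_a), to_ccw(poly_b)
    clipped = clip_convex(a, b)
    inter = abs(polygon_area(clipped)) if len(clipped) >= 3 else 0.0
    union = polygon_area(a) + polygon_area(b) - inter
    return inter / union if union > 0 else 0.0


def poly_nms(polys, scores, iou_thresh=0.5):
    """Greedy NMS: keep the highest-scoring boxes, drop heavy overlaps."""
    order = sorted(range(len(polys)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(poly_iou(polys[i], polys[k]) <= iou_thresh for k in keep):
            keep.append(i)
    return keep
```

The greedy loop compares every candidate with every kept box, which is why dense scenes are slow in pure Python and why a compiled IoU routine helps.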

🎃 3.Datasets

  1. DOTA dataset and its devkit

(1) Training Format

Write a script to convert the DOTA annotations into the train.txt file required by this repository, and put that file in the ./dataR folder.
For the specific format of the train.txt file, see the example in the ./dataR folder.

image_path xmin,ymin,xmax,ymax,class_id,x1,y1,x2,y2,x3,y3,x4,y4,area_ratio,angle[0,180) xmin,ymin,xmax,ymax,class_id,x1,y1,x2,y2,x3,y3,x4,y4,area_ratio,angle[0,180)...

The angle calculation is explained in Issue #1 and in our paper.
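
The train.txt layout above can be read back with a small parser, which is also a convenient way to validate the output of a conversion script. This is a sketch under stated assumptions: the names `OrientedLabel` and `parse_train_line` are hypothetical, image paths are assumed to contain no spaces, and each object is assumed to carry exactly the 15 comma-separated fields listed above.

```python
from typing import List, NamedTuple, Tuple


class OrientedLabel(NamedTuple):
    hbb: Tuple[float, float, float, float]  # xmin, ymin, xmax, ymax
    class_id: int
    vertices: List[Tuple[float, float]]     # (x1, y1) ... (x4, y4)
    area_ratio: float
    angle: float                             # degrees, in [0, 180)


def parse_train_line(line: str):
    """Split one train.txt line into the image path and its labels."""
    parts = line.split()
    if not parts:
        raise ValueError("empty line")
    image_path, labels = parts[0], []
    for token in parts[1:]:
        values = [float(v) for v in token.split(",")]
        if len(values) != 15:
            raise ValueError(f"expected 15 fields, got {len(values)}: {token}")
        angle = values[14]
        if not 0 <= angle < 180:
            raise ValueError(f"angle out of [0, 180): {angle}")
        # Fields 5..12 hold the four vertices as x, y pairs.
        vertices = [(values[i], values[i + 1]) for i in range(5, 13, 2)]
        labels.append(
            OrientedLabel(tuple(values[0:4]), int(values[4]), vertices, values[13], angle)
        )
    return image_path, labels
```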

(2) Testing Format

The same as the Pascal VOC format.

(3) Dataset File Structure

cfg.DATA_PATH = "/opt/datasets/DOTA/"
├── ...
├── JPEGImages
|   ├── 000001.png
|   ├── 000002.png
|   └── ...
├── Annotations (DOTA Dataset Format)
|   ├── 000001.txt (class_idx x1 y1 x2 y2 x3 y3 x4 y4)
|   ├── 000002.txt
|   └── ...
└── ImageSets
    └── test.txt (testing filenames, one per line)
        ├── 000001
        ├── 000002
        └── ...
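
Before testing, it can help to confirm that the folder matches the layout above. The following sketch is a hypothetical helper, not part of this repository. It assumes .png images and that test.txt lists one file name per line without an extension, as in the tree.

```python
from pathlib import Path


def check_dataset_layout(data_path, image_ext=".png"):
    """Return (names, missing) for the layout rooted at data_path.

    names   : entries read from ImageSets/test.txt
    missing : expected image or annotation files that do not exist
    """
    root = Path(data_path)
    list_file = root / "ImageSets" / "test.txt"
    names = [ln.strip() for ln in list_file.read_text().splitlines() if ln.strip()]
    missing = []
    for name in names:
        expected = (
            root / "JPEGImages" / (name + image_ext),
            root / "Annotations" / (name + ".txt"),
        )
        for path in expected:
            if not path.is_file():
                missing.append(path)
    return names, missing
```

Calling `check_dataset_layout("/opt/datasets/DOTA/")` with the value of cfg.DATA_PATH should return an empty `missing` list when every listed name has both an image and an annotation file.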

There is a script in the datasets_tools folder that generates labels in the training and test formats. First, use DOTA_devkit, the official toolkit of the DOTA dataset, to split the images and labels. Then, run the script to convert them to the format required by GGHL. For DOTA_devkit usage, refer to the tutorial in its official repository.

🌠 🌠 🌠 4.Usage Example

(1) Training


(2) For Distributed Training


(3) Testing


☃️ ❄️ 5.Weights

1) The trained model for the DOTA dataset is available from Google Drive or Baidu Disk (password: 2dm8)
Put the downloaded weights in the ./weight folder.

2) The trained model for the SKU dataset is available from Google Drive or Baidu Disk (password: c3jv)

3) The trained model for the SKU dataset is available from Google Drive or Baidu Disk (password: vdf5)

4) The pre-trained weights of Darknet53 on ImageNet are available from Google Drive or Baidu Disk (password: 0blv)

5) The trained model for the DOTAv1.5 dataset is available from Google Drive or Baidu Disk (password: wxlj)

6) The trained model for the DOTAv2.0 dataset is available from Google Drive or Baidu Disk (password: dmu7)

💖 💖 💖 6.Reference

