This is the implementation of GGHL (A General Gaussian Heatmap Labeling for Arbitrary-Oriented Object Detection)

GGHL: A General Gaussian Heatmap Labeling for Arbitrary-Oriented Object Detection


This is the implementation of GGHL 👋 👋 👋

[Arxiv] [Google Drive][Baidu Disk (password: yn04)]

Give a ⭐️ if this project helped you. If you use it, please consider citing:

@article{huang2021general,
  title = {A General Gaussian Heatmap Labeling for Arbitrary-Oriented Object Detection},  
  author = {Huang, Zhanchao and Li, Wei and Xia, Xiang-Gen and Tao, Ran},  
  year = {2021},  
  journal = {arXiv preprint arXiv:2109.12848},  
  eprint = {2109.12848},  
  eprinttype = {arxiv},  
  archiveprefix = {arXiv}  
}

Cloning without starring is just plain rude 🤡 🤡 🤡

👹 Abstract of the paper

Recently, many arbitrary-oriented object detection (AOOD) methods have been proposed and have attracted widespread attention in many fields. However, most of them are based on anchor boxes or standard Gaussian heatmaps. Such label assignment strategies may not only fail to reflect the shape and direction characteristics of arbitrary-oriented objects, but also require considerable parameter-tuning effort. In this paper, a novel AOOD method called General Gaussian Heatmap Labeling (GGHL) is proposed. Specifically, an anchor-free object-adaptation label assignment (OLA) strategy is presented to define the positive candidates based on two-dimensional (2-D) oriented Gaussian heatmaps, which reflect the shape and direction features of arbitrary-oriented objects. Based on OLA, an oriented-bounding-box (OBB) representation component (ORC) is developed to indicate OBBs and adjust the Gaussian center prior weights to fit the characteristics of different objects adaptively through neural network learning. Moreover, a joint-optimization loss (JOL) with area normalization and dynamic confidence weighting is designed to refine the misaligned optimal results of different subtasks. Extensive experiments on public datasets demonstrate that the proposed GGHL improves AOOD performance with low parameter-tuning and time costs. Furthermore, it is generally applicable to most AOOD methods to improve their performance, including lightweight models on embedded platforms.

0.News 🦞 🦀 🦑

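  • 12.17 No update today. One reflection: for a deep learning task, having a mature benchmark is a blessing and also the greatest misfortune. Once everyone is happily absorbed in chasing it, the field is dead.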

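  • 12.15 🤪 The trained models for the DOTAv1.5 and DOTAv2.0 datasets are available from Google Drive or Baidu Disk (password: yn04).

🐾 🐾 🐾 These results were obtained without hyperparameter tuning, data augmentation, or multi-scale testing. I will fine-tune and add tricks when I have time, which should improve the numbers further.
😝 😝 😝 I am quite busy every day and squeeze research into whatever gaps I can find, so GitHub gets even less attention. Please forgive the slow updates to the tutorial and code comments; I will push to update them over the Chinese New Year holiday. Also, if you have questions, please post them in the issues. Why does everyone prefer email? Emails often end up in the spam folder for no apparent reason, so replies may be delayed. I am sorry about that, and I reply as soon as I fish them out.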

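  • 12.13 😭 Revising the paper has left my head spinning. I added a pile of experiments and explanations. Revising a paper is a far harder labor than writing one ~/(ㄒoㄒ)/~ Can I opt for a C-section...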

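  • 12.11 😁 Fixed two indexing bugs. Adjusted the learning rate and retrained; with conf_thresh set to 0.005, the accuracy on the DOTA dataset reaches 79+. In passing, since people keep asking how the area normalization formula was designed: it came to me in a dream.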

  • 12.9 😳 Finally received the first-round review comments. Thanks to the reviewers.

  • 11.22 👺 Notice. Because cv2.minAreaRect() behaves differently across OpenCV versions, I updated datasets_obb.py, datasets_obb_pro.py, augmentations.py, and DOTA2Train.py. OpenCV 4.5.3 or above is required. Please update. Thanks to @Fly-dream12 for the feedback.

In detail: cv2.minAreaRect() is inconsistent across OpenCV versions and has some angle-conversion bugs. The older version I used returns angles in (0, -90], while newer versions return [0, 90], so there may have been some bugs; everything is now unified to the new version. cv2.minAreaRect() also has some bugs of its own, which many blog posts already cover, so I will not repeat them here. The checks in my original code that worked around those bugs did not match the output of the new cv2.minAreaRect(), which caused further problems that I have also fixed. All four files now compute labels with the long-side representation (angle range [0, 180)).
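To make the long-side representation concrete, here is a minimal sketch that derives a long-side angle in [0, 180) directly from four ordered vertices, without calling cv2.minAreaRect(). The function name `long_side_angle` and the exact angle convention (angle of the longer edge measured from the +x axis) are assumptions for illustration; the repository's own convention is the one explained in issue #1 and the paper, and may differ.

```python
import math


def long_side_angle(vertices):
    """Return (long_side, short_side, angle_deg) for a rectangle given as
    four (x, y) vertices listed in order around its boundary.

    Assumed convention: the angle is that of the longer edge measured from
    the +x axis and folded into [0, 180), so it does not depend on which
    vertex comes first or on the traversal direction.
    """
    (x1, y1), (x2, y2), (x3, y3), _ = vertices
    edge_a = (x2 - x1, y2 - y1)  # vertex 1 -> vertex 2
    edge_b = (x3 - x2, y3 - y2)  # vertex 2 -> vertex 3
    len_a = math.hypot(*edge_a)
    len_b = math.hypot(*edge_b)
    dx, dy = edge_a if len_a >= len_b else edge_b
    angle = math.degrees(math.atan2(dy, dx)) % 180.0
    if angle >= 180.0:  # guard against floating-point round-up
        angle = 0.0
    return max(len_a, len_b), min(len_a, len_b), angle


if __name__ == "__main__":
    # A 4x2 box rotated so that its long edge runs along the diagonal.
    print(long_side_angle([(0, 0), (3, 3), (2, 4), (-1, 1)]))
```

Because image coordinates have the y axis pointing down, an angle computed this way increases clockwise on screen.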

  • 11.21 😸 😸 Thanks @trungpham2606 for the suggestions and feedback.

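  • 11.20 ❤️ Fixed some bugs. Thanks for the suggestions. If you run into problems, please describe them in detail in the issues; I will reply promptly, and your question may help others too.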

  • 11.19 😶 During label conversion, it should be noted that the vertices in the paper are in order (see the paper for details).

The 11.19-11.20 updates fixed some bugs in the label conversion script (the vertex order of custom data may differ from that of DOTA).

  • 11.18 😺 Fixed some bugs. Please update the code.

  • 🙏 🙏 🙏 11.17 Release Notes

Some content is still incomplete and is being updated continuously. Thank you for your feedback and suggestions.

  • 🐟 🐡 11.16 Added the script for generating datasets in the format required by GGHL: ./datasets_tools/DOTA2Train.py

  • 👾 11.15 The models for the SKU dataset are available

Weights for the other datasets will be uploaded and updated soon.

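  • 🤖 11.14 Update preview

More backbones and models, as well as mosaic data augmentation, are coming and will be finished within a week. Next week I will release the first version of the code comments and tutorial, covering dataloadR/datasets_obb.py, which implements the label assignment strategy, the most important part of GGHL. GGHLv2.0 is also in preparation and under experiment; I am committing publicly to finishing that update this year.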

  • 🎅 11.10 Add DCNv2 for automatic mixed precision (AMP) training.

Added mixed-precision training and ONNX conversion for DCNv2 (remember to cast the offsets to FP16 at inference).

  • 🐣 🐤 🐥 11.9: The model weights have been released. Download them, put them in the ./weight folder, and then modify the weight path in test.py to test and obtain the results reported in the paper. The download link is given in the Weights section below.

The weights corresponding to the results in the paper can now be downloaded (payday finally came, so I could renew the cloud drive subscription~).

  • 🐞 11.8: I plan to write a tutorial on data preprocessing, together with an explanation of the algorithms and code. It is expected to launch in December.

  • 🦄 11.7: All updates for GGHL version 1.0 have been completed. You are welcome to use it. If you have any questions, leave a message in the issues. Thank you. The code will continue to be updated and improved.

🌈 1.Environments

Linux (Ubuntu 18.04, GCC>=5.4) & Windows (Win10)
CUDA >= 11.1, cuDNN >= 8.0.4

First, install CUDA, Cudnn, and Pytorch. Second, install the dependent libraries in requirements.txt.

conda install pytorch==1.8.1 torchvision==0.9.1 torchaudio==0.8.1 cudatoolkit=11.1 -c pytorch -c conda-forge   
pip install -r requirements.txt  

🌟 2.Installation

  1. git clone this repository
  2. Install the libraries in the ./lib folder
    (1) DCNv2
cd ./GGHL/lib/DCNv2/  
sh make.sh  
    (2) Polygon NMS
    The poly_nms in this version is implemented with the shapely and numpy libraries so that it works across different systems and environments without other dependencies. However, this slows down detection in dense object scenes. For faster speed, you can compile and use the poly_iou library (C++ implementation) in datasets_tools/DOTA_devkit. The compilation method is described in detail in DOTA_devkit.
cd datasets_tools/DOTA_devkit
sudo apt-get install swig
swig -c++ -python polyiou.i
python setup.py build_ext --inplace 
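For readers who want to see what polygon NMS does before choosing between the two backends, here is a dependency-free sketch. It is not the repository's poly_nms: instead of shapely it uses Sutherland-Hodgman clipping, which is only valid for convex polygons (oriented bounding boxes are convex). The function names and the default `iou_thresh` are assumptions for illustration.

```python
def _signed_area(poly):
    """Shoelace formula; positive for counter-clockwise vertex order."""
    return sum(
        x1 * y2 - x2 * y1
        for (x1, y1), (x2, y2) in zip(poly, poly[1:] + poly[:1])
    ) / 2.0


def _ccw(poly):
    """Return the polygon as a list in counter-clockwise order."""
    poly = list(poly)
    return poly if _signed_area(poly) >= 0 else poly[::-1]


def clip_convex(subject, clip):
    """Intersect two convex polygons with Sutherland-Hodgman clipping."""
    output = _ccw(subject)
    clip = _ccw(clip)
    for i in range(len(clip)):
        if not output:
            break
        (ax, ay), (bx, by) = clip[i], clip[(i + 1) % len(clip)]

        def side(p):
            # >= 0 when p is on or to the left of the directed edge a -> b
            return (bx - ax) * (p[1] - ay) - (by - ay) * (p[0] - ax)

        inp, output = output, []
        for j, cur in enumerate(inp):
            prev = inp[j - 1]
            s_cur, s_prev = side(cur), side(prev)
            if (s_cur >= 0) != (s_prev >= 0):
                # The segment prev -> cur crosses the clip edge.
                t = s_prev / (s_prev - s_cur)
                output.append((prev[0] + t * (cur[0] - prev[0]),
                               prev[1] + t * (cur[1] - prev[1])))
            if s_cur >= 0:
                output.append(cur)
    return output


def poly_iou(p, q):
    """IoU of two convex polygons given as lists of (x, y) vertices."""
    inter_pts = clip_convex(p, q)
    inter = abs(_signed_area(inter_pts)) if len(inter_pts) >= 3 else 0.0
    union = abs(_signed_area(list(p))) + abs(_signed_area(list(q))) - inter
    return inter / union if union > 0 else 0.0


def poly_nms(polys, scores, iou_thresh=0.3):
    """Greedy NMS: keep the highest-scoring polygons whose IoU with every
    already-kept polygon does not exceed iou_thresh. Returns kept indices."""
    order = sorted(range(len(polys)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(poly_iou(polys[i], polys[k]) <= iou_thresh for k in keep):
            keep.append(i)
    return keep
```

The greedy loop compares each candidate against every kept box, which is why dense scenes are slow in pure Python and why a compiled IoU routine helps.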

🎃 3.Datasets

  1. DOTA dataset and its devkit

(1) Training Format

You need to write a script to convert them into the train.txt file required by this repository and put them in the ./dataR folder.
For the specific format of the train.txt file, see the example in the ./dataR folder. Each line holds one image path followed by one space-separated entry per object:

image_path xmin,ymin,xmax,ymax,class_id,x1,y1,x2,y2,x3,y3,x4,y4,area_ratio,angle[0,180) xmin,ymin,xmax,ymax,class_id,x1,y1,x2,y2,x3,y3,x4,y4,area_ratio,angle[0,180)...

The calculation method of angle is explained in Issues #1 and our paper.
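As a quick way to sanity-check a converted train.txt, the sketch below parses one line of the format above into named fields. The names `ObbLabel` and `parse_train_line` are hypothetical and not part of the repository; the sketch also assumes image paths contain no spaces and that every object has exactly 15 comma-separated values.

```python
from typing import List, NamedTuple, Tuple

# 4 (xmin, ymin, xmax, ymax) + 1 (class_id) + 8 (x1..y4) + area_ratio + angle
FIELDS_PER_OBJECT = 15


class ObbLabel(NamedTuple):
    hbb: Tuple[float, ...]               # xmin, ymin, xmax, ymax
    class_id: int
    vertices: List[Tuple[float, float]]  # (x1, y1) ... (x4, y4)
    area_ratio: float
    angle: float                         # degrees, expected in [0, 180)


def parse_train_line(line: str) -> Tuple[str, List[ObbLabel]]:
    """Split one train.txt line into the image path and its object labels."""
    parts = line.split()
    if not parts:
        raise ValueError("empty line")
    image_path, labels = parts[0], []
    for obj in parts[1:]:
        values = [float(v) for v in obj.split(",")]
        if len(values) != FIELDS_PER_OBJECT:
            raise ValueError(
                f"expected {FIELDS_PER_OBJECT} values, got {len(values)}: {obj!r}"
            )
        if not 0.0 <= values[14] < 180.0:
            raise ValueError(f"angle outside [0, 180): {values[14]}")
        vertices = [(values[i], values[i + 1]) for i in range(5, 13, 2)]
        labels.append(
            ObbLabel(tuple(values[0:4]), int(values[4]), vertices,
                     values[13], values[14])
        )
    return image_path, labels


if __name__ == "__main__":
    sample = "img/P0001.png 10,20,110,70,3,10,20,110,20,110,70,10,70,1.0,0"
    print(parse_train_line(sample))
```

Running it over every line of a generated file catches truncated entries and out-of-range angles before training starts.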

(2) Testing Format

The same as the Pascal VOC Format

(3) DataSets Files Structure

cfg.DATA_PATH = "/opt/datasets/DOTA/"
├── ...
├── JPEGImages
|   ├── 000001.png
|   ├── 000002.png
|   └── ...
├── Annotations (DOTA Dataset Format)
|   ├── 000001.txt (class_idx x1 y1 x2 y2 x3 y3 x4 y4)
|   ├── 000002.txt
|   └── ...
├── ImageSets
    ├── test.txt (testing filename)
        ├── 000001
        ├── 000002
        └── ...

The datasets_tools folder contains DOTA2Train.py, which generates labels in the training and test formats. First, use DOTA_devkit, the official toolkit of the DOTA dataset, to split the images and labels. Then run DOTA2Train.py to convert them to the format required by GGHL. For the use of DOTA_devkit, please refer to the tutorial in the official repository.
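If you assemble the directory layout by hand, the file list in ImageSets/test.txt can be generated from the image folder. The helper below is a hypothetical sketch, not a repository script; it assumes .png images and writes one extension-less filename per line, matching the tree shown above.

```python
from pathlib import Path


def write_test_list(data_path, image_dir="JPEGImages",
                    list_file="ImageSets/test.txt"):
    """Write the stem of every .png in <data_path>/<image_dir> to
    <data_path>/<list_file>, one per line, and return the stems."""
    root = Path(data_path)
    stems = sorted(
        p.stem for p in (root / image_dir).iterdir()
        if p.suffix.lower() == ".png"
    )
    out = root / list_file
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text("\n".join(stems) + "\n")
    return stems
```

For example, `write_test_list("/opt/datasets/DOTA/")` would list the images under the cfg.DATA_PATH shown above.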

🌠 🌠 🌠 4.Usage Example

(1) Training

python train_GGHL.py

(2) For Distributed Training

sh train_GGHL_dist.sh

(3) Testing

python test.py

☃️ ❄️ 5.Weights

1) The trained model for the DOTA dataset is available from Google Drive or Baidu Disk (password: 2dm8).
Put the weights in the ./weight folder.

2) The trained model for the SKU dataset is available from Google Drive or Baidu Disk (password: c3jv).

3) The trained model for the SKU dataset is available from Google Drive or Baidu Disk (password: vdf5).

4) The pre-trained weights of Darknet53 on ImageNet are available from Google Drive or Baidu Disk (password: 0blv).

5) The trained model for the DOTAv1.5 dataset is available from Google Drive or Baidu Disk (password: wxlj).

6) The trained model for the DOTAv2.0 dataset is available from Google Drive or Baidu Disk (password: dmu7).

💖 💖 💖 6.Reference

https://github.com/Peterisfar/YOLOV3
https://github.com/argusswift/YOLOv4-pytorch
https://github.com/ultralytics/yolov5
https://github.com/jinfagang/DCNv2_latest

📝 License

Copyright © 2021 Shank2358.
This project is licensed under the GNU General Public License v3.0.

🤐 To be continued

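💣 11.6 Updated the label assignment and data loading. Added support for PyTorch 1.10. A heads-up: distributed training will be updated next week.

(Links to the pre-trained weights are in the README files of the NPMMR-Det and LO-Det repositories.)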

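🙈 The customary ramblings before the main text (feel free to skip straight to the usage instructions)

The submission queue is painfully slow: three months in, the paper is still in the formal check with no AE assigned, sob~ So I posted it on arXiv first.
I will do my best to help everyone get the code running and reproduce results close to those reported in the paper, because I have been burned many times myself: many remote sensing papers do not release code, cannot be reproduced at all, or use models so bewilderingly complex that they break as soon as the data or parameters change. It is really hard. The experiment code for NPMMR-Det and LO-Det from the paper will be updated in those two repositories. The NPMMR-Det baseline has already been updated, so you can try running it. LO-Det is being updated; see the notes there (also updated on 11.1). If an AE or reviewer happens to see this repository, I beg you, please do not forget to review the paper~ I hope to graduate on time 😭 😭 😭

😸 😸 10.24 An AE and reviewers have finally been assigned 🐌 🐌 🐌. That was not easy. The submission process is so slow that I am worried about making it in time for graduation, and I am truly trembling 😭 😭 😭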

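🙉 🙉 Some notes on the hyperparameters and experiments in the paper.

🐛 None of the training hyperparameters reported in the paper were fine-tuned; I used the same default parameters as the compared methods. I also did not pick the best epoch: the maximum number of epochs was fixed, and the reported result is the average over the last five epochs. Fine-tuning the learning rate, the training strategy, and the best epoch gives further gains, as I confirmed recently when a machine was free. But I do not think it is necessary to chase state-of-the-art (SOTA) for its own sake, as many papers do, so in the end I did not. If the reviewers suggest it I may revise this; if not, I will present more experimental results on GitHub and arXiv. Reflecting on my recent work, it does not match the innovative ideas of senior researchers, and I need to keep working on that. I stumbled my way into research on my own, fell into many pits, and was left breathless by fierce competition, so what I want is to do solid, simple, practical work and design a sturdy model that stands up to rough handling, rather than compete for SOTA (😭 😭 😭 to be honest, I could not win that race anyway...).
🐰 🐰 Here is my understanding of object detection; criticism and corrections are welcome. As I see it, object detection is only the entry task of a much larger vision system, not the final result. I think the goal of most detection tasks is to quickly and roughly locate candidate object regions in images or video, serving finer downstream tasks such as segmentation and tracking by simplifying their input. From this perspective, a gap of one or two points in mean average precision is not nearly as important as papers make it out to be. Improvements in detection efficiency (speed), model complexity and robustness, and ease of use (for engineers and for researchers new to the field alike) contribute more to the community in practice. Over the past few months I have kept asking myself what the original purpose of object detection is, and what comes after detection. What I took to be the finish line when writing papers was often only what I assumed it to be; I had it wrong. Deep learning has been popular for years, and many tasks are perhaps in the same position: SOTA in a paper's experiments is an exam with standard answers, whereas real-world impact is an open-ended question. This is where my effort goes next. I believe that however long and hard the road, those who keep walking will arrive, and with persistence the future is promising.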

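Also, please do not just ask for handouts. If you want to train on your own dataset, the GGHL data format and usage instructions are described in detail in this README, and conversion scripts are provided in the tools folder. I have also run experiments on many datasets beyond those in the paper and on datasets provided by users, and they all work. Please take some time to read the instructions and the explanations in issue #1. If you still have questions, leave me a message in the issues and you will get a reply. I am under no obligation to modify the code or train on your data for you.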

Comments
  • demo

    Thank you @Shank2358 for sharing this great work. I am trying to visualize the detections with your code, but when I ran test.py normally, it raised an error about e2cnn. Could you point me to the reference for e2cnn? Thanks in advance.

    bug good first issue datasets 
    opened by trungpham2606 66
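  • Question about reproducing the OLA module code in FCOS + GGHL

    Hello! Thank you for sharing such great work.

    The paper does not describe the code details of FCOS + GGHL very clearly, so I would like to confirm how the OLA module is reproduced.

    Does the OLA module of FCOS + GGHL correspond to dataloadR/datasets_FCOS_R.py? I noticed that positive and negative points there are not sampled by computing 2-D Gaussian probability values as described in the paper, but according to the computed centerness value. What is the rationale for this choice? Do the two sampling schemes differ much in how well they suit different detection methods?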

    label assignment hyperparameter to do 
    opened by zhuyh1223 16
  • Error in validation data when training model on HRSC2016 dataset

    File "/content/drive/MyDrive/GGHL/evalR/evaluatorGGHL.py", line 237, in __calc_APs
      use_07_metric) # call the function in voc_eval.py to do the calculation
    File "/content/drive/MyDrive/GGHL/evalR/voc_eval.py", line 122, in voc_eval
      recs[imagename] = parse_poly(annopath.format(imagename))
    File "/content/drive/MyDrive/GGHL/evalR/voc_eval.py", line 44, in parse_poly
      object_struct['name'] = classes[int(splitlines[0])]
    ValueError: invalid literal for int() with base 10: 'ship'

    opened by TerrafYassin 13
  • Poor accuracy on HRSC2016 dataset

    Hi, when I ran your model on HRSC2016 with the data you gave me, I got the following results: [image]

    Do I need to crop sub-images from the original images with an overlap, as for the DOTA dataset, or is there another problem?

    opened by TerrafYassin 4
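  • Question about the Joint-optimization Loss in the paper

    (I understand both the area normalization and the loss re-weighting mechanism mentioned for the joint-optimization loss.)

    The joint-optimization functions derived from the PDF via MLE are all among the most commonly used loss functions, so I do not understand what the point is of deriving the joint-optimization function through maximum likelihood estimation (MLE), as the paper does.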

    question loss function 
    opened by Oooqf 3
  • HRSC2016 dataset to GGHL format

    Hello, could you tell me which format I should convert the HRSC2016 dataset to? Do I need DOTA format or VOC format? DOTA2Train.py requires a labelTxt folder, which is DOTA format, while evaluatorGGHL.py requires VOC format. Could you explain the steps to prepare the HRSC2016 dataset?

    opened by TerrafYassin 2
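  • OLA question

    Hi, regarding label assignment in the OLA module of the paper, I have the following questions about the third part, on level assignment and boundaries:
    1) Algorithm 1 gives the way the Gaussian is generated, where the threshold thr is the value at the OBB boundary, but Eq. (7) then shrinks the boundary. Does thr therefore need to take the Gaussian value at the shrunken boundary? What is the purpose of doing so? Is it similar to center-region sampling in FCOS? And is Tiou=0.3 equivalent to a scaling ratio?
    2) If my understanding above is correct, then scaling both semi-axes of the ellipse by the same ratio might leave too few sampling points for very slender objects (see, for example, the illustration in FCOS-R, arXiv 2022: [image]).
    3) Regarding the level assignment strategy, are only three levels used for matching in the end, based on stride1, stride2, stride3, and the image diagonal length? Or, for an FPN, are only p4, p5, and p6 considered?

    Also, OWAM in the paper re-weights with a strategy similar to AutoAssign. Since OWAM directly uses the regression loss, could learning be unstable in the early stage because of the heavy noise?

    Looking forward to your reply. Thank you.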

    question label assignment to do 
    opened by aiboys 2
  • Transfer Learning

    Hello, I have a question regarding training on a custom dataset.

    How can I transfer what the pre-trained weights (e.g. DOTA) have learned for some specific classes to my custom training if my custom classes differ from the pre-trained ones?

    Best regards, and thank you.

    opened by wafa-bouzouita 1