Alpha-IoU: A Family of Power Intersection over Union Losses for Bounding Box Regression

Overview

Alpha-IoU: A Family of Power Intersection over Union Losses for Bounding Box Regression

YOLOv5 with alpha-IoU losses implemented in PyTorch.

Example results on the test set of PASCAL VOC 2007 using YOLOv5s trained by the vanilla IoU loss (top row) and the alpha-IoU loss with alpha=3 (bottom row). The alpha-IoU loss performs better than the vanilla IoU loss because it can localize objects more accurately (image 1 and 2), thus can detect more true positive objects (image 3 to 5) and fewer false positive objects (image 6 and 7).

Example results on the val set of MS COCO 2017 using YOLOv5s trained by the vanilla IoU loss (top row) and the alpha-IoU loss with alpha=3 (bottom row). The alpha-IoU loss performs better than the vanilla IoU loss because it can localize objects more accurately (image 1), thus can detect more true positive objects (image 2 to 5) and fewer false positive objects (image 4 to 7). Note that image 4 and 5 detect both more true positive and fewer false positive objects.

Citation

If you use our method, please consider citing:

@inproceedings{Jiabo_Alpha-IoU,
  author    = {He, Jiabo and Erfani, Sarah and Ma, Xingjun and Bailey, James and Chi, Ying and Hua, Xian-Sheng},
  title     = {Alpha-IoU: A Family of Power Intersection over Union Losses for Bounding Box Regression},
  booktitle = {NeurIPS},
  year      = {2021},
}

Modifications

This repository is a fork of ultralytics/yolov5, with an implementation of alpha-IoU losses while keeping the code as close to the original as possible.

Alpha-IoU Losses

Alpha-IoU losses can be configured in Line 131 of utils/loss.py, functionesd as 'bbox_alpha_iou'. The alpha values and types of losses (e.g., IoU, GIoU, DIoU, CIoU) can be selected in this function, which are defined in utils/general.py. Note that we should use a small constant epsilon to avoid torch.pow(0, alpha) or denominator=0.

Install

Python>=3.6.0 is required with all requirements.txt installed including PyTorch>=1.7:

$ git clone https://github.com/Jacobi93/Alpha-IoU
$ cd Alpha-IoU
$ pip install -r requirements.txt

Configurations

Configuration files can be found in data. We do not change either 'voc.yaml' or 'coco.yaml' used in the original repository. However, we could do more experiments. E.g.,

voc25.yaml # randomly use 25% PASCAL VOC as the training set
voc50.yaml # randomly use 50% PASCAL VOC as the training set

Code for generating different small training sets is in generate_small_sets.py. Code for generating different noisy labels is in generate_noisy_labels.py, and we should change the 'img2label_paths' function in utils/datasets.py accordingly.

Implementation Commands

For detailed installation instruction and network training options, please take a look at the README file or issue of ultralytics/yolov5. Following are sample commands we used for training and testing YOLOv5 with alpha-IoU, with more samples in instruction.txt.

python train.py --data voc.yaml --hyp hyp.scratch.yaml --cfg yolov5s.yaml --batch-size 64 --epochs 300 --device '0'
python test.py --data voc.yaml --img 640 --conf 0.001 --weights 'runs/train/voc_yolov5s_iou/weights/best.pt' --device '0'
python detect.py --source ../VOC/images/detect500 --weights 'runs/train/voc_yolov5s_iou/weights/best.pt' --conf 0.25

We can also randomly generate some images for detection and visualization results in generate_detect_images.py.

Pretrained Weights

Here are some pretrained models using the configurations in this repository, with alpha=3 in all experiments. Details of these pretrained models can be found in runs/train. All results are tested using 'weights/best.pt' for each experiment. It is a very simple yet effective method so that people is able to quickly apply our method to existing models following the 'bbox_alpha_iou' function in utils/general.py. Note that YOLOv5 has been updated for many versions and all pretrained models in this repository are obtained based on the YOLOv5 version 4.0, where details of all versions for YOLOv5 can be found. Researchers are also welcome to apply our method to other object detection models, e.g., Faster R-CNN, DETR, etc.

Owner
Jacobi(Jiabo He)
Jacobi(Jiabo He)
MOOSE (Multi-organ objective segmentation) a data-centric AI solution that generates multilabel organ segmentations to facilitate systemic TB whole-person research

MOOSE (Multi-organ objective segmentation) a data-centric AI solution that generates multilabel organ segmentations to facilitate systemic TB whole-person research.The pipeline is based on nn-UNet an

QIMP team 30 Jan 01, 2023
RTS3D: Real-time Stereo 3D Detection from 4D Feature-Consistency Embedding Space for Autonomous Driving

RTS3D: Real-time Stereo 3D Detection from 4D Feature-Consistency Embedding Space for Autonomous Driving (AAAI2021). RTS3D is efficiency and accuracy s

71 Nov 29, 2022
ProjectOxford-ClientSDK - This repo has moved :house: Visit our website for the latest SDKs & Samples

This project has moved 🏠 We heard your feedback! This repo has been deprecated and each project has moved to a new home in a repo scoped by API and p

Microsoft 970 Nov 28, 2022
Implementation of Memory-Efficient Neural Networks with Multi-Level Generation, ICCV 2021

Memory-Efficient Multi-Level In-Situ Generation (MLG) By Jiaqi Gu, Hanqing Zhu, Chenghao Feng, Mingjie Liu, Zixuan Jiang, Ray T. Chen and David Z. Pan

Jiaqi Gu 2 Jan 04, 2022
Pytorch ImageNet1k Loader with Bounding Boxes.

ImageNet 1K Bounding Boxes For some experiments, you might wanna pass only the background of imagenet images vs passing only the foreground. Here, I'v

Amin Ghiasi 11 Oct 15, 2022
Hierarchical Memory Matching Network for Video Object Segmentation (ICCV 2021)

Hierarchical Memory Matching Network for Video Object Segmentation Hongje Seong, Seoung Wug Oh, Joon-Young Lee, Seongwon Lee, Suhyeon Lee, Euntai Kim

Hongje Seong 72 Dec 14, 2022
EquiBind: Geometric Deep Learning for Drug Binding Structure Prediction

EquiBind: geometric deep learning for fast predictions of the 3D structure in which a small molecule binds to a protein

Hannes Stärk 355 Jan 03, 2023
Video2x - A lossless video/GIF/image upscaler achieved with waifu2x, Anime4K, SRMD and RealSR.

Official Discussion Group (Telegram): https://t.me/video2x A Discord server is also available. Please note that most developers are only on Telegram.

K4YT3X 5.9k Dec 31, 2022
My published benchmark for a Kaggle Simulations Competition

Lux AI Working Title Bot Please refer to the Kaggle notebook for the comment section. The comment section contains my explanation on my code structure

Tong Hui Kang 29 Aug 22, 2022
Localization Distillation for Object Detection

Localization Distillation for Object Detection This repo is based on mmDetection. This is the code for our paper: Localization Distillation

274 Dec 26, 2022
The code for the NeurIPS 2021 paper "A Unified View of cGANs with and without Classifiers".

Energy-based Conditional Generative Adversarial Network (ECGAN) This is the code for the NeurIPS 2021 paper "A Unified View of cGANs with and without

sianchen 22 May 28, 2022
GeDML is an easy-to-use generalized deep metric learning library

GeDML is an easy-to-use generalized deep metric learning library

Borui Zhang 32 Dec 05, 2022
Code for "Causal autoregressive flows" - AISTATS, 2021

Code for "Causal Autoregressive Flow" This repository contains code to run and reproduce experiments presented in Causal Autoregressive Flows, present

Ricardo Pio Monti 35 Dec 16, 2022
Tree Nested PyTorch Tensor Lib

DI-treetensor treetensor is a generalized tree-based tensor structure mainly developed by OpenDILab Contributors. Almost all the operation can be supp

OpenDILab 167 Dec 29, 2022
Temporal Knowledge Graph Reasoning Triggered by Memories

MTDM Temporal Knowledge Graph Reasoning Triggered by Memories To alleviate the time dependence, we propose a memory-triggered decision-making (MTDM) n

4 Sep 25, 2022
Assginment for UofT CSC420: Intro to Image Understanding

Run the code Open edge_detection.ipynb in google colab. Upload image1.jpg,image2.jpg and my_image.jpg to '/content/drive/My Drive'. chooose 'Run all'

Ziyi-Zhou 1 Feb 24, 2022
[NeurIPS 2021] Deceive D: Adaptive Pseudo Augmentation for GAN Training with Limited Data

Deceive D: Adaptive Pseudo Augmentation for GAN Training with Limited Data (NeurIPS 2021) This repository will provide the official PyTorch implementa

Liming Jiang 238 Nov 25, 2022
Code repository for Self-supervised Structure-sensitive Learning, CVPR'17

Self-supervised Structure-sensitive Learning (SSL) Ke Gong, Xiaodan Liang, Xiaohui Shen, Liang Lin, "Look into Person: Self-supervised Structure-sensi

Clay Gong 219 Dec 29, 2022
This is the source code of the solver used to compete in the International Timetabling Competition 2019.

ITC2019 Solver This is the source code of the solver used to compete in the International Timetabling Competition 2019. Building .NET Core (2.1 or hig

Edon Gashi 8 Jan 22, 2022
Ludwig is a toolbox that allows to train and evaluate deep learning models without the need to write code.

Translated in 🇰🇷 Korean/ Ludwig is a toolbox that allows users to train and test deep learning models without the need to write code. It is built on

Ludwig 8.7k Dec 31, 2022