A Robust Non-IoU Alternative to Non-Maxima Suppression in Object Detection

Overview

Confluence: A Robust Non-IoU Alternative to Non-Maxima Suppression in Object Detection

1. 介绍

image

用以替代 NMS,在所有 bbox 中挑选出最优的集合。 NMS 仅考虑了 bbox 的得分,然后根据 IOU 来去除重叠的 bbox。而 Confluence 则是利用曼哈顿距离作为 bbox 之间的重合度,并根据置信度加权的曼哈顿距离还作为最优 bbox 的选择依据。

2. 算法原理

2.1 曼哈顿距离

两点的曼哈顿距离就是坐标值插的 L1 范数:

image

推广到两个 bbox 对的哈曼顿距离则为:

image

该算法便是以曼哈顿距离作为两个 bbox 的重合度,曼哈顿距离小于一定值的的 bbox 则被认为是一个 cluster。

2.2 归一化

因为 bbox 有个各样的 size 和 position,所以直接计算曼哈顿距离就没有可比性,没有标准的度量。所以需要对其进行归一化:

image

2.3 置信度加权曼哈顿距离

NMS在去除重合 bbox 是仅考虑其置信度的高低,Condluence 则同时考虑了曼哈顿距离和置信度,构成一个置信度加权曼哈顿距离:

image

3. 算法实现

image

算法:

(1)针对每个类别挑出属于该类别的 bbox 集合 B

(2)遍历 B 中所有的 bbox bi,并计算 bi 和其他 boox的 曼哈顿距离 p,并归一化

2.1 选出 p < 2 的集合,作为一个 cluster,并计算加权曼哈顿距离 wp。 

2.2 在该 cluster 中挑选出最小的 wp 作为 bi 的 wp。 

(3)遍历完毕后,挑出 wp 最小的 bi 作为最优 bbox,添加进最终结果集合中,并将其从 B 去除

(4)把与最优 bbox 的曼哈顿距离小于阈值 MD 的的 bbox 从 B 中去除

(5)不断重复 (2) - (4),每次都选出一个最优 bbox,知道 B 为空

注意:

(1)原文伪代码第 5 行:optimalConfuence 初始化成一个比较大的值就可以,不一定要是 Ip

(2)原文伪代码第 12 行:应该是 Proximity / si

4. 实验结果

image

5. 代码解析

5.1 YOLOv3/4 的后处理

这个接口可以直接处理 YOLOv3/4 的 yolo 层的输出进行后处理

confluence_process(prediction, conf_thres=0.1, wp_thres=0.6)

支持多标签和单标签,并把数据重组后进行 confluence/NMS 处理

# Detections matrix nx6 (xyxy, conf, cls)
if multi_label:
    i, j = (x[:, 5:] > conf_thres).nonzero().t()
    x = torch.cat((box[i], x[i, j + 5, None], j[:, None].float()), 1)
else:  # best class only
    conf, j = x[:, 5:].max(1, keepdim=True)
    x = torch.cat((box, conf, j.float()), 1)[conf.view(-1) > conf_thres]

5.2 Confluence 算法

confluence(prediction, class_num, wp_thres=0.6)

给所有目标添加上序号

index = np.arange(0, len(prediction), 1).reshape(-1,1)
infos = np.concatenate((prediction, index), 1)

不同类别单独处理,并遍历所有的剩余目标集合 B,直到集合为空,对应上面伪代码的(1)-(2)

for c in range(class_num):       
    pcs = infos[infos[:, 5] == c]             
    while (len(pcs)):                      
        n = len(pcs)       
        xs = pcs[:, [0, 2]]
        ys = pcs[:, [1, 3]]             
        ps = []        
        # 遍历 pcs,计算每一个box 和其余 box 的 p 值,然后聚类成簇,再根据 wp 挑出 best
        confluence_min = 10000
        best = None
        for i, pc in enumerate(pcs):

计算所有目标与其他目标的曼和顿距离 p 和加权曼哈顿距离 wp,p < 2 的目标作为一个 cluster,其中最小的 wp 作为该 cluster 的 wp

index_other = [j for j in range(n) if j!= i]
x_t = xs[i]
x_t = np.tile(x_t, (n-1, 1))
x_other = xs[index_other]
x_all = np.concatenate((x_t, x_other), 1)
.
.
.
# wp
wp = p / pc[4]
wp = wp[p < 2]

if (len(wp) == 0):
    value = 0
else:
    value = wp.min()

选出最小的 wp,确定目标

# select the bbox which has the smallest wp as the best bbox
if (value < confluence_min):
   confluence_min = value
   best = i  

然后把与目标的曼哈顿距离小于阈值的目标和本身都从集合 B 中去除

keep.append(int(pcs[best][6])) 
if (len(ps) > 0):               
    p = ps[best]
    index_ = np.where(p < wp_thres)[0]
    index_ = [i if i < best else i +1 for i in index_]
else:
    index_ = []
    
# delect the bboxes whose Manhattan Distance is below the predefined MD
index_eff = [j for j in range(n) if (j != best and j not in index_)]            
pcs = pcs[index_eff]

最后继续重复遍历集合 B,直到集合为空。

仓库里我放了一张测试照片和原始检测结果,大家可以直接用来调试 confluence 函数。

Credits:

https://arxiv.org/pdf/2012.00257.pdf

VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech Jaehyeon Kim, Jungil Kong, and Juhee Son In our rece

Jaehyeon Kim 1.7k Jan 08, 2023
Object detection and instance segmentation toolkit based on PaddlePaddle.

Object detection and instance segmentation toolkit based on PaddlePaddle.

9.3k Jan 02, 2023
KIND: an Italian Multi-Domain Dataset for Named Entity Recognition

KIND (Kessler Italian Named-entities Dataset) KIND is an Italian dataset for Named-Entity Recognition. It contains more than one million tokens with t

Digital Humanities 5 Jun 21, 2022
GNNAdvisor: An Efficient Runtime System for GNN Acceleration on GPUs

GNNAdvisor: An Efficient Runtime System for GNN Acceleration on GPUs [Paper, Slides, Video Talk] at USENIX OSDI'21 @inproceedings{GNNAdvisor, title=

YUKE WANG 47 Jan 03, 2023
This is an official implementation for "DeciWatch: A Simple Baseline for 10x Efficient 2D and 3D Pose Estimation"

DeciWatch: A Simple Baseline for 10× Efficient 2D and 3D Pose Estimation This repo is the official implementation of "DeciWatch: A Simple Baseline for

117 Dec 24, 2022
Python wrapper to access the amazon selling partner API

PYTHON-AMAZON-SP-API Amazon Selling-Partner API If you have questions, please join on slack Contributions very welcome! Installation pip install pytho

Michael Primke 330 Jan 06, 2023
Python implementation of the multistate Bennett acceptance ratio (MBAR)

pymbar Python implementation of the multistate Bennett acceptance ratio (MBAR) method for estimating expectations and free energy differences from equ

Chodera lab // Memorial Sloan Kettering Cancer Center 169 Dec 02, 2022
Capsule endoscopy detection DACON challenge

capsule_endoscopy_detection (DACON Challenge) Overview Yolov5, Yolor, mmdetection기반의 모델을 사용 (총 11개 모델 앙상블) 모든 모델은 학습 시 Pretrained Weight을 yolov5, yolo

MAILAB 11 Nov 25, 2022
Implementation of "Debiasing Item-to-Item Recommendations With Small Annotated Datasets" (RecSys '20)

Debiasing Item-to-Item Recommendations With Small Annotated Datasets This is the code for our RecSys '20 paper. Other materials can be found here: Ful

Microsoft 34 Aug 10, 2022
Distributionally robust neural networks for group shifts

Distributionally Robust Neural Networks for Group Shifts: On the Importance of Regularization for Worst-Case Generalization This code implements the g

151 Dec 25, 2022
Deep Reinforcement Learning based Trading Agent for Bitcoin

Deep Trading Agent Deep Reinforcement Learning based Trading Agent for Bitcoin using DeepSense Network for Q function approximation. For complete deta

Kartikay Garg 669 Dec 29, 2022
Optical Character Recognition + Instance Segmentation for russian and english languages

Распознавание рукописного текста в школьных тетрадях Соревнование, проводимое в рамках олимпиады НТО, разработанное Сбером. Платформа ODS. Результаты

Gerasimov Maxim 21 Dec 19, 2022
Summary Explorer is a tool to visually explore the state-of-the-art in text summarization.

Summary Explorer Summary Explorer is a tool to visually inspect the summaries from several state-of-the-art neural summarization models across multipl

Webis 42 Aug 14, 2022
PyTorch implementation of the paper: "Preference-Adaptive Meta-Learning for Cold-Start Recommendation", IJCAI, 2021.

PAML PyTorch implementation of the paper: "Preference-Adaptive Meta-Learning for Cold-Start Recommendation", IJCAI, 2021. (Continuously updating ) Int

15 Nov 18, 2022
Learning from History: Modeling Temporal Knowledge Graphs with Sequential Copy-Generation Networks

CyGNet This repository reproduces the AAAI'21 paper “Learning from History: Modeling Temporal Knowledge Graphs with Sequential Copy-Generation Network

CunchaoZ 89 Jan 03, 2023
Code for paper " AdderNet: Do We Really Need Multiplications in Deep Learning?"

AdderNet: Do We Really Need Multiplications in Deep Learning? This code is a demo of CVPR 2020 paper AdderNet: Do We Really Need Multiplications in De

HUAWEI Noah's Ark Lab 915 Jan 01, 2023
Differentiable rasterization applied to 3D model simplification tasks

nvdiffmodeling Differentiable rasterization applied to 3D model simplification tasks, as described in the paper: Appearance-Driven Automatic 3D Model

NVIDIA Research Projects 336 Dec 30, 2022
https://sites.google.com/cornell.edu/recsys2021tutorial

Counterfactual Learning and Evaluation for Recommender Systems (RecSys'21 Tutorial) Materials for "Counterfactual Learning and Evaluation for Recommen

yuta-saito 45 Nov 10, 2022
Code for the paper "Query Embedding on Hyper-relational Knowledge Graphs"

Query Embedding on Hyper-Relational Knowledge Graphs This repository contains the code used for the experiments in the paper Query Embedding on Hyper-

DimitrisAlivas 19 Jul 26, 2022
Data manipulation and transformation for audio signal processing, powered by PyTorch

torchaudio: an audio library for PyTorch The aim of torchaudio is to apply PyTorch to the audio domain. By supporting PyTorch, torchaudio follows the

1.9k Dec 28, 2022