A Robust Non-IoU Alternative to Non-Maxima Suppression in Object Detection

Overview

Confluence: A Robust Non-IoU Alternative to Non-Maxima Suppression in Object Detection

1. 介绍

image

用以替代 NMS,在所有 bbox 中挑选出最优的集合。 NMS 仅考虑了 bbox 的得分,然后根据 IOU 来去除重叠的 bbox。而 Confluence 则是利用曼哈顿距离作为 bbox 之间的重合度,并根据置信度加权的曼哈顿距离还作为最优 bbox 的选择依据。

2. 算法原理

2.1 曼哈顿距离

两点的曼哈顿距离就是坐标值插的 L1 范数:

image

推广到两个 bbox 对的哈曼顿距离则为:

image

该算法便是以曼哈顿距离作为两个 bbox 的重合度,曼哈顿距离小于一定值的的 bbox 则被认为是一个 cluster。

2.2 归一化

因为 bbox 有个各样的 size 和 position,所以直接计算曼哈顿距离就没有可比性,没有标准的度量。所以需要对其进行归一化:

image

2.3 置信度加权曼哈顿距离

NMS在去除重合 bbox 是仅考虑其置信度的高低,Condluence 则同时考虑了曼哈顿距离和置信度,构成一个置信度加权曼哈顿距离:

image

3. 算法实现

image

算法:

(1)针对每个类别挑出属于该类别的 bbox 集合 B

(2)遍历 B 中所有的 bbox bi,并计算 bi 和其他 boox的 曼哈顿距离 p,并归一化

2.1 选出 p < 2 的集合,作为一个 cluster,并计算加权曼哈顿距离 wp。 

2.2 在该 cluster 中挑选出最小的 wp 作为 bi 的 wp。 

(3)遍历完毕后,挑出 wp 最小的 bi 作为最优 bbox,添加进最终结果集合中,并将其从 B 去除

(4)把与最优 bbox 的曼哈顿距离小于阈值 MD 的的 bbox 从 B 中去除

(5)不断重复 (2) - (4),每次都选出一个最优 bbox,知道 B 为空

注意:

(1)原文伪代码第 5 行:optimalConfuence 初始化成一个比较大的值就可以,不一定要是 Ip

(2)原文伪代码第 12 行:应该是 Proximity / si

4. 实验结果

image

5. 代码解析

5.1 YOLOv3/4 的后处理

这个接口可以直接处理 YOLOv3/4 的 yolo 层的输出进行后处理

confluence_process(prediction, conf_thres=0.1, wp_thres=0.6)

支持多标签和单标签,并把数据重组后进行 confluence/NMS 处理

# Detections matrix nx6 (xyxy, conf, cls)
if multi_label:
    i, j = (x[:, 5:] > conf_thres).nonzero().t()
    x = torch.cat((box[i], x[i, j + 5, None], j[:, None].float()), 1)
else:  # best class only
    conf, j = x[:, 5:].max(1, keepdim=True)
    x = torch.cat((box, conf, j.float()), 1)[conf.view(-1) > conf_thres]

5.2 Confluence 算法

confluence(prediction, class_num, wp_thres=0.6)

给所有目标添加上序号

index = np.arange(0, len(prediction), 1).reshape(-1,1)
infos = np.concatenate((prediction, index), 1)

不同类别单独处理,并遍历所有的剩余目标集合 B,直到集合为空,对应上面伪代码的(1)-(2)

for c in range(class_num):       
    pcs = infos[infos[:, 5] == c]             
    while (len(pcs)):                      
        n = len(pcs)       
        xs = pcs[:, [0, 2]]
        ys = pcs[:, [1, 3]]             
        ps = []        
        # 遍历 pcs,计算每一个box 和其余 box 的 p 值,然后聚类成簇,再根据 wp 挑出 best
        confluence_min = 10000
        best = None
        for i, pc in enumerate(pcs):

计算所有目标与其他目标的曼和顿距离 p 和加权曼哈顿距离 wp,p < 2 的目标作为一个 cluster,其中最小的 wp 作为该 cluster 的 wp

index_other = [j for j in range(n) if j!= i]
x_t = xs[i]
x_t = np.tile(x_t, (n-1, 1))
x_other = xs[index_other]
x_all = np.concatenate((x_t, x_other), 1)
.
.
.
# wp
wp = p / pc[4]
wp = wp[p < 2]

if (len(wp) == 0):
    value = 0
else:
    value = wp.min()

选出最小的 wp,确定目标

# select the bbox which has the smallest wp as the best bbox
if (value < confluence_min):
   confluence_min = value
   best = i  

然后把与目标的曼哈顿距离小于阈值的目标和本身都从集合 B 中去除

keep.append(int(pcs[best][6])) 
if (len(ps) > 0):               
    p = ps[best]
    index_ = np.where(p < wp_thres)[0]
    index_ = [i if i < best else i +1 for i in index_]
else:
    index_ = []
    
# delect the bboxes whose Manhattan Distance is below the predefined MD
index_eff = [j for j in range(n) if (j != best and j not in index_)]            
pcs = pcs[index_eff]

最后继续重复遍历集合 B,直到集合为空。

仓库里我放了一张测试照片和原始检测结果,大家可以直接用来调试 confluence 函数。

Credits:

https://arxiv.org/pdf/2012.00257.pdf

Neuron Merging: Compensating for Pruned Neurons (NeurIPS 2020)

Neuron Merging: Compensating for Pruned Neurons Pytorch implementation of Neuron Merging: Compensating for Pruned Neurons, accepted at 34th Conference

Woojeong Kim 33 Dec 30, 2022
🏎️ Accelerate training and inference of 🤗 Transformers with easy to use hardware optimization tools

Hugging Face Optimum 🤗 Optimum is an extension of 🤗 Transformers, providing a set of performance optimization tools enabling maximum efficiency to t

Hugging Face 842 Dec 30, 2022
This is the implementation of the paper "Self-supervised Outdoor Scene Relighting"

Self-supervised Outdoor Scene Relighting This is the implementation of the paper "Self-supervised Outdoor Scene Relighting". The model is implemented

Ye Yu 24 Dec 17, 2022
The code for the CVPR 2021 paper Neural Deformation Graphs, a novel approach for globally-consistent deformation tracking and 3D reconstruction of non-rigid objects.

Neural Deformation Graphs Project Page | Paper | Video Neural Deformation Graphs for Globally-consistent Non-rigid Reconstruction Aljaž Božič, Pablo P

Aljaz Bozic 134 Dec 16, 2022
Controlling Hill Climb Racing with Hand Tacking

Controlling Hill Climb Racing with Hand Tacking Opened Palm for Gas Closed Palm for Brake

Rohit Ingole 3 Jan 18, 2022
Multi-Stage Spatial-Temporal Convolutional Neural Network (MS-GCN)

Multi-Stage Spatial-Temporal Convolutional Neural Network (MS-GCN) This code implements the skeleton-based action segmentation MS-GCN model from Autom

Benjamin Filtjens 8 Nov 29, 2022
A general python framework for visual object tracking and video object segmentation, based on PyTorch

PyTracking A general python framework for visual object tracking and video object segmentation, based on PyTorch. 📣 Two tracking/VOS papers accepted

2.6k Jan 04, 2023
HIVE: Evaluating the Human Interpretability of Visual Explanations

HIVE: Evaluating the Human Interpretability of Visual Explanations Project Page | Paper This repo provides the code for HIVE, a human evaluation frame

Princeton Visual AI Lab 16 Dec 13, 2022
Predicting Tweet Sentiment Maching Learning and streamlit

Predicting-Tweet-Sentiment-Maching-Learning-and-streamlit (I prefere using Visual Studio Code ) Open the folder in VS Code Run the first cell in requi

1 Nov 20, 2021
Code for approximate graph reduction techniques for cardinality-based DSFM, from paper

SparseCard Code for approximate graph reduction techniques for cardinality-based DSFM, from paper "Approximate Decomposable Submodular Function Minimi

Nate Veldt 1 Nov 25, 2022
An implementation of "Optimal Textures: Fast and Robust Texture Synthesis and Style Transfer through Optimal Transport"

Optex An implementation of Optimal Textures: Fast and Robust Texture Synthesis and Style Transfer through Optimal Transport for TU Delft CS4240. You c

Hans Brouwer 33 Jan 05, 2023
Codes for CVPR2021 paper "PWCLO-Net: Deep LiDAR Odometry in 3D Point Clouds Using Hierarchical Embedding Mask Optimization"

PWCLO-Net: Deep LiDAR Odometry in 3D Point Clouds Using Hierarchical Embedding Mask Optimization (CVPR 2021) This is the official implementation of PW

Intelligent Robotics and Machine Vision Lab 42 Dec 18, 2022
[CVPRW 21] "BNN - BN = ? Training Binary Neural Networks without Batch Normalization", Tianlong Chen, Zhenyu Zhang, Xu Ouyang, Zechun Liu, Zhiqiang Shen, Zhangyang Wang

BNN - BN = ? Training Binary Neural Networks without Batch Normalization Codes for this paper BNN - BN = ? Training Binary Neural Networks without Bat

VITA 40 Dec 30, 2022
FIRM-AFL is the first high-throughput greybox fuzzer for IoT firmware.

FIRM-AFL FIRM-AFL is the first high-throughput greybox fuzzer for IoT firmware. FIRM-AFL addresses two fundamental problems in IoT fuzzing. First, it

356 Dec 23, 2022
Fully Convolutional Networks for Semantic Segmentation by Jonathan Long*, Evan Shelhamer*, and Trevor Darrell. CVPR 2015 and PAMI 2016.

Fully Convolutional Networks for Semantic Segmentation This is the reference implementation of the models and code for the fully convolutional network

Evan Shelhamer 3.2k Jan 08, 2023
PyTorch code for our ECCV 2020 paper "Single Image Super-Resolution via a Holistic Attention Network"

HAN PyTorch code for our ECCV 2020 paper "Single Image Super-Resolution via a Holistic Attention Network" This repository is for HAN introduced in the

五维空间 140 Nov 23, 2022
A sequence of Jupyter notebooks featuring the 12 Steps to Navier-Stokes

CFD Python Please cite as: Barba, Lorena A., and Forsyth, Gilbert F. (2018). CFD Python: the 12 steps to Navier-Stokes equations. Journal of Open Sour

Barba group 2.6k Dec 30, 2022
Efficient face emotion recognition in photos and videos

This repository contains code of face emotion recognition that was developed in the RSF (Russian Science Foundation) project no. 20-71-10010 (Efficien

Andrey Savchenko 239 Jan 04, 2023
Python program that works as a contact list

Lista de Contatos Programa em Python que funciona como uma lista de contatos. Features Adicionar novo contato Remover contato Atualizar contato Pesqui

Victor B. Lino 3 Dec 16, 2021
BossNAS: Exploring Hybrid CNN-transformers with Block-wisely Self-supervised Neural Architecture Search

BossNAS This repository contains PyTorch evaluation code, retraining code and pretrained models of our paper: BossNAS: Exploring Hybrid CNN-transforme

Changlin Li 127 Dec 26, 2022