null

Overview

CP-Cluster

Confidence Propagation Cluster aims to replace NMS-based methods as a better box fusion framework in 2D/3D Object detection, Instance Segmentation:

[Confidence Propagation Cluster: Unleash the Full Potential of Object Detectors](arxivlink to do),
Yichun Shen*, Wanli Jiang*, Zhen Xu, Rundong Li, Junghyun Kwon, Siyi Li,

Contact: [email protected]. Welcome for any questions and comments!

Abstract

It’s been a long history that most object detection methods obtain objects by using the non-maximum suppression(NMS) and its improved versions like Soft-NMS to remove redundant bounding boxes. We challenge those NMS-based methods from three aspects: 1) The bounding box with highest confidence value may not be the true positive having the biggest overlap with the ground-truth box. 2) Not only suppression is required for redundant boxes, but also confidence enhancement is needed for those true positives. 3) Sorting candidate boxes by confidence values is not necessary so that full parallelism is achievable.

Inspired by belief propagation (BP), we propose the Confidence Propagation Cluster (CP-Cluster) to replace NMS-based methods, which is fully parallelizable as well as better in accuracy. In CP-Cluster, we borrow the message passing mechanism from BP to penalize redundant boxes and enhance true positives simultaneously in an iterative way until convergence. We verified the effectiveness of CP-Cluster by applying it to various mainstream detectors such as FasterRCNN, SSD, FCOS, YOLOv3, YOLOv5, Centernet etc. Experiments on MS COCO show that our plug and play method, without retraining detectors, is able to steadily improve average mAP of all those state-of-the-art models with a clear margin from 0.2 to 1.9 respectively when compared with NMS-based methods.

Highlights

  • Better accuracy: Compared with all previous NMS-based methods, CP-Cluster manages to achieve better accuracy

  • Fully parallelizable: No box sorting is required, and each candidate box can be handled separately when propagating confidence messages

Main results

Detectors from MMDetection on COCO val/test-dev

Method NMS Soft-NMS CP-Cluster
FRcnn-fpn50 38.4 / 38.7 39.0 / 39.2 39.2 / 39.4
Yolov3 33.5 / 33.5 33.6 / 33.6 34.1 / 34.1
Retina-fpn50 37.4 / 37.7 37.5 / 37.9 38.1 / 38.4
FCOS-X101 42.7 / 42.8 42.7 / 42.8 42.9 / 43.1
AutoAssign-fpn50 40.4 / 40.6 40.5 / 40.7 41.0 / 41.2

Yolov5(v6 model) on COCO val

Model NMS Soft-NMS CP-Cluster
Yolov5s 37.2 37.4 37.5
Yolov5m 45.2 45.3 45.5
Yolov5l 48.8 48.8 49.1
Yolov5x 50.7 50.8 51.0
Yolov5s_1280 44.5 50.8 44.8
Yolov5m_1280 51.1 51.1 51.3
Yolov5l_1280 53.6 53.7 53.8
Yolov5x_1280 54.7 54.8 55.0

Replace maxpooling with CP-Cluster for Centernet(Evaluated on COCO test-dev), where "flip_scale" means flip and multi-scale augmentations

Model maxpool Soft-NMS CP-Cluster
dla34 37.3 38.1 39.2
dla34_flip_scale 41.7 40.6 43.3
hg_104 40.2 40.6 41.1
hg_104_flip_scale 45.2 44.3 46.6

Instance Segmentation(MASK-RCNN, 3X models) from MMDetection on COCO test-dev

Box/Mask AP NMS Soft-NMS CP-Cluster
MRCNN_R50 41.5/37.7 42.0/37.8 42.1/38.0
MRCNN_R101 43.1/38.8 43.6/39.0 43.6/39.1
MRCNN_X101 44.6/40.0 45.2/40.2 45.2/40.2

Integrate into MMCV

Clone the mmcv repo from https://github.com/shenyi0220/mmcv (Cut down by 9/28/2021 from main branch with no extra modifications)

Copy the implementation of "cp_cluster_cpu" in src/nms.cpp to the mmcv nms code("mmcv/ops/csrc/pytorch/nms.cpp")

Borrow the "soft_nms_cpu" API by calling "cp_cluster_cpu" rather than orignal Soft-NMS implementations, so that modify the code like below:

@@ -186,8 +186,8 @@ Tensor softnms(Tensor boxes, Tensor scores, Tensor dets, float iou_threshold,
   if (boxes.device().is_cuda()) {
     AT_ERROR("softnms is not implemented on GPU");
   } else {
-    return softnms_cpu(boxes, scores, dets, iou_threshold, sigma, min_score,
-                       method, offset);
+    return cp_cluster_cpu(boxes, scores, dets, iou_threshold, min_score,
+                          offset, 0.8, 3);
   }
 }

Compile mmcv with source code

MMCV_WITH_OPS=1 pip install -e .

Reproduce Object Detection and Instance Segmentation in MMDetection

Make sure that the MMCV with CP-Cluster has been successfully installed.

Download code from https://github.com/shenyi0220/mmdetection (Cut down by 9/26/2021 from main branch with some config file modifications to call Soft-NMS/CP-Cluster), and install all the dependancies accordingly.

Download models from model zoo

Run below command to reproduce Faster-RCNN-r50-fpn-2x:

python tools/test.py ./configs/faster_rcnn/faster_rcnn_r50_fpn_2x_coco.py ./checkpoints/faster_rcnn_r50_fpn_2x_coco_bbox_mAP-0.384_20200504_210434-a5d8aa15.pth --eval bbox

To check original metrics with NMS, you can switch the model config back to use default NMS.

To check Soft-NMS metrics, just re-compile with mmcv without CP-Cluster modifications.

Reproduce Yolov5

Make sure that the MMCV with CP-Cluster has been successfully installed.

Download code from https://github.com/shenyi0220/yolov5 (Cut down by 11/9/2021 from main branch, replacing the default torchvision.nms with CP-Cluster from mmcv), and install all the dependancies accordingly.

Run below command to reproduce the CP-Cluster exp with yolov5s-v6

python val.py --data coco.yaml --conf 0.001 --iou 0.6 --weights yolov5s.pt --batch-size 32

License

For the time being, this implementation is published with NVIDIA proprietary license, and the only usage of the source code is to reproduce the experiments of CP-Cluster. For any possible commercial use and redistribution of the code, pls contact [email protected]

Citation

If you find this project useful for your research, please use the following BibTeX entry.

Owner
Yichun Shen
Yichun Shen
This is a simple item2vec implementation using gensim for recbole

recbole-item2vec-model This is a simple item2vec implementation using gensim for recbole( https://recbole.io ) Usage When you want to run experiment f

Yusuke Fukasawa 2 Oct 06, 2022
State of the art faster Natural Language Processing in Tensorflow 2.0 .

tf-transformers: faster and easier state-of-the-art NLP in TensorFlow 2.0 ****************************************************************************

74 Dec 05, 2022
Easy to start. Use deep nerual network to predict the sentiment of movie review.

Easy to start. Use deep nerual network to predict the sentiment of movie review. Various methods, word2vec, tf-idf and df to generate text vectors. Various models including lstm and cov1d. Achieve f1

1 Nov 19, 2021
An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"

The implementation of paper CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval. CLIP4Clip is a video-text retrieval model based

ArrowLuo 456 Jan 06, 2023
This repository contains examples of Task-Informed Meta-Learning

Task-Informed Meta-Learning This repository contains examples of Task-Informed Meta-Learning (paper). We consider two tasks: Crop Type Classification

10 Dec 19, 2022
Code for ACL 2020 paper "Rigid Formats Controlled Text Generation"

SongNet SongNet: SongCi + Song (Lyrics) + Sonnet + etc. @inproceedings{li-etal-2020-rigid, title = "Rigid Formats Controlled Text Generation",

Piji Li 212 Dec 17, 2022
Vad-sli-asr - A Python scripts for a speech processing pipeline with Voice Activity Detection (VAD)

VAD-SLI-ASR Python scripts for a speech processing pipeline with Voice Activity

Dynamics of Language 14 Dec 09, 2022
Chinese Named Entity Recognization (BiLSTM with PyTorch)

BiLSTM-CRF for Name Entity Recognition PyTorch version A PyTorch implemention of Bi-LSTM-CRF model for Chinese Named Entity Recognition. 使用 PyTorch 实现

5 Jun 01, 2022
Converts text into a PDF of handwritten notes

Text To Handwritten Notes Converts text into a PDF of handwritten notes Explore the docs » · Report Bug · Request Feature · Steps: $ git clone https:/

UVSinghK 63 Oct 09, 2022
Simple translation demo showcasing our headliner package.

Headliner Demo This is a demo showcasing our Headliner package. In particular, we trained a simple seq2seq model on an English-German dataset. We didn

Axel Springer News Media & Tech GmbH & Co. KG - Ideas Engineering 16 Nov 24, 2022
Spert NLP Relation Extraction API deployed with torchserve for inference

URLMask Python program for Linux users to change a URL to ANY domain. A program than can take any url and mask it to any domain name you like. E.g. ne

Zichu Chen 1 Nov 24, 2021
Code for paper Multitask-Finetuning of Zero-shot Vision-Language Models

Code for paper Multitask-Finetuning of Zero-shot Vision-Language Models

Zhenhailong Wang 2 Jul 15, 2022
Auto_code_complete is a auto word-completetion program which allows you to customize it on your needs

auto_code_complete is a auto word-completetion program which allows you to customize it on your needs. the model for this program is one of the deep-learning NLP(Natural Language Process) model struc

RUO 2 Feb 22, 2022
DeepPavlov Tutorials

DeepPavlov tutorials DeepPavlov: Sentence Classification with Word Embeddings DeepPavlov: Transfer Learning with BERT. Classification, Tagging, QA, Ze

Neural Networks and Deep Learning lab, MIPT 28 Sep 13, 2022
Source code for the paper "TearingNet: Point Cloud Autoencoder to Learn Topology-Friendly Representations"

TearingNet: Point Cloud Autoencoder to Learn Topology-Friendly Representations Created by Jiahao Pang, Duanshun Li, and Dong Tian from InterDigital In

InterDigital 21 Dec 29, 2022
Label data using HuggingFace's transformers and automatically get a prediction service

Label Studio for Hugging Face's Transformers Website • Docs • Twitter • Join Slack Community Transfer learning for NLP models by annotating your textu

Heartex 135 Dec 29, 2022
Korean Sentence Embedding Repository

Korean-Sentence-Embedding 🍭 Korean sentence embedding repository. You can download the pre-trained models and inference right away, also it provides

80 Jan 02, 2023
Generating new names based on trends in data using GPT2 (Transformer network)

MLOpsNameGenerator Overall Goal The goal of the project is to develop a model that is capable of creating Pokémon names based on its description, usin

Gustav Lang Moesmand 2 Jan 10, 2022
HiFi DeepVariant + WhatsHap workflowHiFi DeepVariant + WhatsHap workflow

HiFi DeepVariant + WhatsHap workflow Workflow steps align HiFi reads to reference with pbmm2 call small variants with DeepVariant, using two-pass meth

William Rowell 2 May 14, 2022