MutualGuide is a compact object detector specially designed for embedded devices

Last update: Dec 13, 2022

Overview

Introduction

MutualGuide is a compact object detector specially designed for embedded devices. Compared to existing detectors, this repo offers two key features.

First, the Mutual Guidance mechanism assigns labels for the classification task based on the predictions of the localization task, and vice versa, which alleviates the misalignment between the two tasks. Second, teacher-student prediction disagreements guide the knowledge transfer in a feature-based detection distillation framework, which reduces the performance gap between the two models.
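The "localization guides classification" direction can be illustrated with a minimal pure-Python sketch. This is not the repository's implementation: the function names `iou` and `labels_from_localization`, the use of label 0 for background, and the 0.5 threshold are all assumptions made for illustration. The point it shows is that a prediction's class label depends on how well its predicted box (rather than its anchor) overlaps a ground-truth box.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def labels_from_localization(pred_boxes, gt_boxes, gt_classes, pos_thresh=0.5):
    """Assign a class label to each prediction from its predicted box.

    Label 0 stands for background; pos_thresh is an illustrative value,
    not a setting taken from the repository.
    """
    if not gt_boxes:
        return [0] * len(pred_boxes)
    labels = []
    for pred in pred_boxes:
        overlaps = [iou(pred, gt) for gt in gt_boxes]
        best = max(range(len(overlaps)), key=overlaps.__getitem__)
        # Positive only if the predicted box localizes the object well enough.
        labels.append(gt_classes[best] if overlaps[best] >= pos_thresh else 0)
    return labels


if __name__ == "__main__":
    preds = [(1, 1, 10, 10), (20, 20, 30, 30)]
    print(labels_from_localization(preds, [(0, 0, 10, 10)], [3]))
```

The opposite direction (classification guiding localization) would reuse the same overlap computation but modulate it with the classification score; the exact formulation is given in the ACCV paper.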

For more details, please refer to our ACCV paper and BMVC paper.

Planning

Add RepVGG backbone.
Add ShuffleNetV2 backbone.
Add TensorRT transform code for inference acceleration.
Add draw function to plot detection results.
Add custom dataset training (annotations in XML format).
Add Transformer backbone.
Add BiFPN neck.

Benchmark

Without knowledge distillation:

Backbone	Resolution	AP^val 0.5:0.95	AP^val 0.5	AP^val 0.75	AP^val small	AP^val medium	AP^val large	Speed V100 (ms)	Weights
ShuffleNet-1.0	512x512	35.8	52.9	38.6	19.8	40.1	48.3	8.3	Google
ResNet-34	512x512	44.1	62.3	47.6	26.5	50.2	58.3	6.9	Google
ResNet-18	512x512	42.0	60.0	45.3	25.4	47.1	56.0	4.4	Google
RepVGG-A2	512x512	44.2	62.5	47.5	27.2	50.3	57.2	5.3	Google
RepVGG-A1	512x512	43.1	61.3	46.6	26.6	49.3	55.9	4.4	Google

With knowledge distillation:

Backbone	Resolution	AP^val 0.5:0.95	AP^val 0.5	AP^val 0.75	AP^val small	AP^val medium	AP^val large	Speed V100 (ms)	Weights
ResNet-18	512x512	42.9	60.7	46.2	25.4	48.8	57.2	4.4	Google
RepVGG-A1	512x512	44.0	62.1	47.3	27.6	49.9	57.9	4.4	Google

Remarks:

The precision is measured on the COCO2017 Val dataset.
The inference runtime is measured with the PyTorch framework (without TensorRT acceleration) on a Tesla V100 GPU. Post-processing time (e.g., NMS) is not included, i.e., only the model inference time is measured.
To download from Baidu cloud, go to this link (password: dvz7).
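A timing measurement of the kind described in the remarks above might look like the following sketch. It is an assumption about methodology, not the script used to produce the table: `mean_latency_ms`, its `warmup` and `iters` defaults, and the `sync` hook are hypothetical names. For a PyTorch model on a GPU, `fn` would wrap only the forward pass (no NMS) and `sync` would be `torch.cuda.synchronize`, so that queued GPU work is finished before the clock is read.

```python
import time


def mean_latency_ms(fn, warmup=10, iters=100, sync=None):
    """Average wall-clock time of fn() in milliseconds.

    sync is an optional callable that blocks until pending device work
    is done (for example torch.cuda.synchronize on a GPU).
    """
    for _ in range(warmup):  # warm-up runs are excluded from the timing
        fn()
    if sync is not None:
        sync()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    if sync is not None:
        sync()
    return (time.perf_counter() - start) * 1000.0 / iters


if __name__ == "__main__":
    # Stand-in workload; a real measurement would call the detector here.
    print(f"{mean_latency_ms(lambda: sum(range(10_000))):.3f} ms")
```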

Datasets

First download the VOC and COCO datasets; the scripts in data/scripts/ may be helpful. Then create a folder named datasets and link the downloaded datasets inside it:

$ mkdir datasets
$ ln -s /path_to_your_voc_dataset datasets/VOCdevkit
$ ln -s /path_to_your_coco_dataset datasets/coco2017

Remarks:

To train on a custom dataset, first set the dataset path XMLroot and the category list XML_CLASSES in data/xml_dataset.py. Then pass --dataset XML.
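To make the role of the category list concrete, here is a small sketch of reading one annotation file. It assumes a Pascal VOC-style layout (`object`, `name`, `bndbox` with `xmin`/`ymin`/`xmax`/`ymax`); the README does not spell out the schema, so check the example in datasets/XML/ before relying on it. The function `parse_voc_xml` and the sample annotation are hypothetical, and `classes` plays the role that XML_CLASSES plays in data/xml_dataset.py.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation, assuming a Pascal VOC-style schema.
SAMPLE = """<annotation>
  <object>
    <name>cat</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>220</ymax></bndbox>
  </object>
  <object>
    <name>tree</name>
    <bndbox><xmin>5</xmin><ymin>5</ymin><xmax>50</xmax><ymax>60</ymax></bndbox>
  </object>
</annotation>"""


def parse_voc_xml(xml_text, classes):
    """Return a list of (class_index, xmin, ymin, xmax, ymax) tuples."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        name = obj.findtext("name", default="").strip()
        if name not in classes:
            continue  # ignore categories that are not in the class list
        bndbox = obj.find("bndbox")
        if bndbox is None:
            continue  # object without a bounding box
        coords = [int(float(bndbox.findtext(tag)))
                  for tag in ("xmin", "ymin", "xmax", "ymax")]
        boxes.append((classes.index(name), *coords))
    return boxes


if __name__ == "__main__":
    print(parse_voc_xml(SAMPLE, ["cat", "dog"]))
```

Objects whose name is missing from the class list are skipped here; whether the repository skips them or raises an error is not stated in the README.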

Training

For training with Mutual Guide:

$ python3 train.py --neck ssd --backbone vgg16    --dataset VOC --size 320 --multi_level --multi_anchor --mutual_guide --pretrained
                          fpn            resnet34           COCO       512
                          pafpn          repvgg-A2          XML
                                         shufflenet-1.0

For knowledge distillation using PDF-Distil:

$ python3 distil.py --neck ssd --backbone vgg11    --dataset VOC --size 320 --multi_level --multi_anchor --mutual_guide --pretrained --kd pdf
                           fpn            resnet18           COCO       512
                           pafpn          repvgg-A1          XML
                                          shufflenet-0.5

Remarks:

To train without MutualGuide, remove the --mutual_guide flag;
To train on a custom dataset, convert your annotations into XML format and use the parameter --dataset XML. An example is given in datasets/XML/;
For knowledge distillation with the traditional MSE loss, use the parameter --kd mse;
Trained models are saved to weights/ by default.
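The difference between the two distillation modes can be sketched as follows. This is a deliberately simplified illustration, not the loss used by distil.py: features are flattened to plain lists, one probability per location stands in for the detector's predictions, and the weighting by absolute probability difference and its normalisation are assumptions. The exact PDF-Distil formulation is given in the BMVC paper.

```python
def mse_distill_loss(student_feat, teacher_feat):
    """Plain mean squared error between two equally sized feature lists."""
    n = len(student_feat)
    return sum((s - t) ** 2 for s, t in zip(student_feat, teacher_feat)) / n


def disagreement_weighted_loss(student_feat, teacher_feat,
                               student_prob, teacher_prob):
    """Feature error weighted by teacher-student prediction disagreement.

    Locations where both models predict the same probability contribute
    nothing; locations where they disagree dominate the loss.
    """
    weights = [abs(t - s) for s, t in zip(student_prob, teacher_prob)]
    total = sum(weights)
    if total == 0:
        return 0.0  # the two models agree everywhere
    weighted = sum(w * (s - t) ** 2
                   for w, s, t in zip(weights, student_feat, teacher_feat))
    return weighted / total


if __name__ == "__main__":
    s_feat, t_feat = [1.0, 2.0], [1.0, 4.0]
    print(mse_distill_loss(s_feat, t_feat))
    print(disagreement_weighted_loss(s_feat, t_feat, [0.9, 0.2], [0.9, 0.7]))
```

In this toy input the two models only disagree at the second location, so the weighted loss focuses entirely on the feature error there, whereas the plain MSE averages it with the first location.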

Evaluation

To evaluate a trained network:

$ python3 test.py --neck ssd --backbone vgg11    --dataset VOC --size 320 --trained_model path_to_saved_weights --multi_level --multi_anchor --pretrained --draw
                         fpn            resnet18           COCO       512
                         pafpn          repvgg-A1          XML
                                        shufflenet-0.5

Remarks:

It directly prints the mAP, AP50 and AP75 results on VOC2007 Test or COCO2017 Val;
Add the parameter --draw to draw the detection results. They are saved in draw/VOC/, draw/COCO/ or draw/XML/;
Add --trt to activate TensorRT acceleration.
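The AP columns in the benchmark tables differ in the IoU threshold at which a detection counts as correct (for example 0.5 or 0.75). The sketch below shows that matching step only; it is a simplified greedy matcher written for illustration, not the evaluation code of test.py or the COCO toolkit, and the names `iou` and `true_positive_flags` are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def true_positive_flags(detections, gt_boxes, iou_thresh):
    """Mark each detection as a true or false positive.

    detections is a list of (score, box) pairs for one class in one image.
    Detections are visited by decreasing score and each ground-truth box
    can be matched at most once.
    """
    matched = set()
    flags = []
    for _, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_j, best_iou = -1, 0.0
        for j, gt in enumerate(gt_boxes):
            if j in matched:
                continue
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_j, best_iou = j, overlap
        if best_j >= 0 and best_iou >= iou_thresh:
            matched.add(best_j)
            flags.append(True)
        else:
            flags.append(False)
    return flags


if __name__ == "__main__":
    dets = [(0.9, (0, 0, 10, 6))]  # IoU with the ground truth is 0.6
    gts = [(0, 0, 10, 10)]
    print(true_positive_flags(dets, gts, 0.5))
    print(true_positive_flags(dets, gts, 0.75))
```

The same detection is accepted at the looser threshold and rejected at the stricter one, which is why the stricter AP columns are always lower.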

Citing us

Please cite our papers in your publications if they help your research:

@InProceedings{Zhang_2020_ACCV,
    author    = {Zhang, Heng and Fromont, Elisa and Lefevre, Sebastien and Avignon, Bruno},
    title     = {Localize to Classify and Classify to Localize: Mutual Guidance in Object Detection},
    booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)},
    month     = {November},
    year      = {2020}
}

@InProceedings{Zhang_2021_BMVC,
    author    = {Zhang, Heng and Fromont, Elisa and Lefevre, Sebastien and Avignon, Bruno},
    title     = {PDF-Distil: including Prediction Disagreements in Feature-based Distillation for object detection},
    booktitle = {Proceedings of the British Machine Vision Conference (BMVC)},
    month     = {November},
    year      = {2021}
}

Acknowledgement

This project contains pieces of code from the following projects: mmdetection, ssd.pytorch, rfbnet and yolox.

Owner

ZHANG Heng