MutualGuide is a compact object detector specially designed for embedded devices

Overview

Introduction

MutualGuide is a compact object detector specially designed for embedded devices. Comparing to existing detectors, this repo contains two key features.

Firstly, the Mutual Guidance mecanism assigns labels to the classification task based on the prediction on the localization task, and vice versa, alleviating the misalignment problem between both tasks; Secondly, the teacher-student prediction disagreements guides the knowledge transfer in a feature-based detection distillation framework, thereby reducing the performance gap between both models.

For more details, please refer to our ACCV paper and BMVC paper.

Planning

  • Add RepVGG backbone.
  • Add ShuffleNetV2 backbone.
  • Add TensorRT transform code for inference acceleration.
  • Add draw function to plot detection results.
  • Add custom dataset training (annotations in XML format).
  • Add Transformer backbone.
  • Add BiFPN neck.

Benchmark

  • Without knowledge distillation:
Backbone Resolution APval
0.5:0.95
APval
0.5
APval
0.75
APval
small
APval
medium
APval
large
Speed V100
(ms)
Weights
ShuffleNet-1.0 512x512 35.8 52.9 38.6 19.8 40.1 48.3 8.3 Google
ResNet-34 512x512 44.1 62.3 47.6 26.5 50.2 58.3 6.9 Google
ResNet-18 512x512 42.0 60.0 45.3 25.4 47.1 56.0 4.4 Google
RepVGG-A2 512x512 44.2 62.5 47.5 27.2 50.3 57.2 5.3 Google
RepVGG-A1 512x512 43.1 61.3 46.6 26.6 49.3 55.9 4.4 Google
  • With knowledge distillation:
Backbone Resolution APval
0.5:0.95
APval
0.5
APval
0.75
APval
small
APval
medium
APval
large
Speed V100
(ms)
Weights
ResNet-18 512x512 42.9 60.7 46.2 25.4 48.8 57.2 4.4 Google
RepVGG-A1 512x512 44.0 62.1 47.3 27.6 49.9 57.9 4.4 Google

Remarks:

  • The precision is measured on the COCO2017 Val dataset.
  • The inference runtime is measured by Pytorch framework (without TensorRT acceleration) on a Tesla V100 GPU, and the post-processing time (e.g., NMS) is not included (i.e., we measure the model inference time).
  • To dowload from Baidu cloud, go to this link (password: dvz7).

Datasets

First download the VOC and COCO dataset, you may find the sripts in data/scripts/ helpful. Then create a folder named datasets and link the downloaded datasets inside:

$ mkdir datasets
$ ln -s /path_to_your_voc_dataset datasets/VOCdevkit
$ ln -s /path_to_your_coco_dataset datasets/coco2017

Remarks:

  • For training on custom dataset, first modify the dataset path XMLroot and categories XML_CLASSES in data/xml_dataset.py. Then apply --dataset XML.

Training

For training with Mutual Guide:

$ python3 train.py --neck ssd --backbone vgg16    --dataset VOC --size 320 --multi_level --multi_anchor --mutual_guide --pretrained
                          fpn            resnet34           COCO       512
                          pafpn          repvgg-A2          XML
                                         shufflenet-1.0

For knowledge distillation using PDF-Distil:

$ python3 distil.py --neck ssd --backbone vgg11    --dataset VOC --size 320 --multi_level --multi_anchor --mutual_guide --pretrained --kd pdf
                           fpn            resnet18           COCO       512
                           pafpn          repvgg-A1          XML
                                          shufflenet-0.5

Remarks:

  • For training without MutualGuide, just remove the --mutual_guide;
  • For training on custom dataset, convert your annotations into XML format and use the parameter --dataset XML. An example is given in datasets/XML/;
  • For knowledge distillation with traditional MSE loss, just use parameter --kd mse;
  • The default folder to save trained model is weights/.

Evaluation

Every time you want to evaluate a trained network:

$ python3 test.py --neck ssd --backbone vgg11    --dataset VOC --size 320 --trained_model path_to_saved_weights --multi_level --multi_anchor --pretrained --draw
                         fpn            resnet18           COCO       512
                         pafpn          repvgg-A1          XML
                                        shufflenet-0.5

Remarks:

  • It will directly print the mAP, AP50 and AP50 results on VOC2007 Test or COCO2017 Val;
  • Add parameter --draw to draw detection results. They will be saved in draw/VOC/ or draw/COCO/ or draw/XML/;
  • Add --trt to activate TensorRT acceleration.

Citing us

Please cite our papers in your publications if they help your research:

@InProceedings{Zhang_2020_ACCV,
    author    = {Zhang, Heng and Fromont, Elisa and Lefevre, Sebastien and Avignon, Bruno},
    title     = {Localize to Classify and Classify to Localize: Mutual Guidance in Object Detection},
    booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)},
    month     = {November},
    year      = {2020}
}

@InProceedings{Zhang_2021_BMVC,
    author    = {Zhang, Heng and Fromont, Elisa and Lefevre, Sebastien and Avignon, Bruno},
    title     = {PDF-Distil: including Prediction Disagreements in Feature-based Distillation for object detection},
    booktitle = {Proceedings of the British Machine Vision Conference (BMVC)},
    month     = {November},
    year      = {2021}
}

Acknowledgement

This project contains pieces of code from the following projects: mmdetection, ssd.pytorch, rfbnet and yolox.

Implementation of the famous Image Manipulation\Forgery Detector "ManTraNet" in Pytorch

Who has never met a forged picture on the web ? No one ! Everyday we are constantly facing fake pictures touched up in Photoshop but it is not always

Rony Abecidan 77 Dec 16, 2022
Official code for ICCV2021 paper "M3D-VTON: A Monocular-to-3D Virtual Try-on Network"

M3D-VTON: A Monocular-to-3D Virtual Try-On Network Official code for ICCV2021 paper "M3D-VTON: A Monocular-to-3D Virtual Try-on Network" Paper | Suppl

109 Dec 29, 2022
Fermi Problems: A New Reasoning Challenge for AI

Fermi Problems: A New Reasoning Challenge for AI Fermi Problems are questions whose answer is a number that can only be reasonably estimated as a prec

AI2 15 May 28, 2022
Testability-Aware Low Power Controller Design with Evolutionary Learning, ITC2021

Testability-Aware Low Power Controller Design with Evolutionary Learning This repo contains the source code of Testability-Aware Low Power Controller

Lee Man 1 Dec 26, 2021
Yolact-keras实例分割模型在keras当中的实现

Yolact-keras实例分割模型在keras当中的实现 目录 性能情况 Performance 所需环境 Environment 文件下载 Download 训练步骤 How2train 预测步骤 How2predict 评估步骤 How2eval 参考资料 Reference 性能情况 训练数

Bubbliiiing 11 Dec 26, 2022
Official PyTorch implementation of "Synthesis of Screentone Patterns of Manga Characters"

Manga Character Screentone Synthesis Official PyTorch implementation of "Synthesis of Screentone Patterns of Manga Characters" presented in IEEE ISM 2

Tsubota 2 Nov 20, 2021
To build a regression model to predict the concrete compressive strength based on the different features in the training data.

Cement-Strength-Prediction Problem Statement To build a regression model to predict the concrete compressive strength based on the different features

Ashish Kumar 4 Jun 11, 2022
Pydantic models for pywttr and aiopywttr.

Pydantic models for pywttr and aiopywttr.

Almaz 2 Dec 08, 2022
Python Fanduel API (2021) - Lineup Automation

Southpaw is a python package that provides access to the Fanduel API. Optimize your DFS experience by programmatically updating your lineups, analyzin

Brandin Canfield 13 Jan 04, 2023
A certifiable defense against adversarial examples by training neural networks to be provably robust

DiffAI v3 DiffAI is a system for training neural networks to be provably robust and for proving that they are robust. The system was developed for the

SRI Lab, ETH Zurich 202 Dec 13, 2022
My Body is a Cage: the Role of Morphology in Graph-Based Incompatible Control

My Body is a Cage: the Role of Morphology in Graph-Based Incompatible Control

yobi byte 29 Oct 09, 2022
Implementation of Online Label Smoothing in PyTorch

Online Label Smoothing Pytorch implementation of Online Label Smoothing (OLS) presented in Delving Deep into Label Smoothing. Introduction As the abst

83 Dec 14, 2022
Voice of Pajlada with model and weights.

Pajlada TTS Stripped down version of ForwardTacotron (https://github.com/as-ideas/ForwardTacotron) with pretrained weights for Pajlada's (https://gith

6 Sep 03, 2021
Adaptive Dropblock Enhanced GenerativeAdversarial Networks for Hyperspectral Image Classification

This repo holds the codes of our paper: Adaptive Dropblock Enhanced GenerativeAdversarial Networks for Hyperspectral Image Classification, which is ac

Feng Gao 17 Dec 28, 2022
Codebase for Inducing Causal Structure for Interpretable Neural Networks

Interchange Intervention Training (IIT) Codebase for Inducing Causal Structure for Interpretable Neural Networks Release Notes 12/01/2021: Code and Pa

Zen 6 Oct 10, 2022
Official PyTorch code for Hierarchical Conditional Flow: A Unified Framework for Image Super-Resolution and Image Rescaling (HCFlow, ICCV2021)

Hierarchical Conditional Flow: A Unified Framework for Image Super-Resolution and Image Rescaling (HCFlow, ICCV2021) This repository is the official P

Jingyun Liang 159 Dec 30, 2022
Use evolutionary algorithms instead of gridsearch in scikit-learn

sklearn-deap Use evolutionary algorithms instead of gridsearch in scikit-learn. This allows you to reduce the time required to find the best parameter

rsteca 709 Jan 03, 2023
GMFlow: Learning Optical Flow via Global Matching

GMFlow GMFlow: Learning Optical Flow via Global Matching Authors: Haofei Xu, Jing Zhang, Jianfei Cai, Hamid Rezatofighi, Dacheng Tao We streamline the

Haofei Xu 298 Jan 04, 2023
9th place solution

AllDataAreExt-Galixir-Kaggle-HPA-2021-Solution Team Members Qishen Ha is Master of Engineering from the University of Tokyo. Machine Learning Engineer

daishu 5 Nov 18, 2021
It's a powerful version of linebot

CTPS-FINAL Linbot-sever.py 主程式 Algorithm.py 推薦演算法,媒合餐廳端資料與顧客端資料 config.ini 儲存 channel-access-token、channel-secret 資料 Preface 生活在成大將近4年,我們每天的午餐時間看著形形色色

1 Oct 17, 2022