Instance-conditional Knowledge Distillation for Object Detection

This is a MegEngine implementation of the paper "Instance-conditional Knowledge Distillation for Object Detection", based on MegEngine Models.

The PyTorch implementation based on Detectron2 will be released soon.

Instance-Conditional Knowledge Distillation for Object Detection,
Zijian Kang, Peizhen Zhang, Xiangyu Zhang, Jian Sun, Nanning Zheng
In: Proc. Advances in Neural Information Processing Systems (NeurIPS), 2021
[arXiv]

Requirements

Installation

In order to run the code, please prepare a CUDA environment, then:

  1. Install dependencies.
pip3 install --upgrade pip
pip3 install -r requirements.txt
  2. Prepare the MS-COCO 2017 dataset and put it in a directory with the following structure:
/path/to/
    |->coco
    |    |annotations
    |    |train2017
    |    |val2017
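
The directory structure above can be checked before training. The following is a minimal sketch, not part of the repository: the helper name missing_coco_dirs is hypothetical, and it only assumes the three sub-directories listed above.

```python
from pathlib import Path

# Sub-directories listed in the structure above.
EXPECTED_SUBDIRS = ("annotations", "train2017", "val2017")


def missing_coco_dirs(dataset_root):
    """Return the expected sub-directories that are absent under <dataset_root>/coco.

    Hypothetical helper for illustration; it is not part of this repository.
    """
    coco_dir = Path(dataset_root) / "coco"
    return [name for name in EXPECTED_SUBDIRS if not (coco_dir / name).is_dir()]


# Example call: an empty list means the layout is complete.
# missing_coco_dirs("/path/to")
```

An empty result means all three directories exist; it does not verify that the images or annotation files inside them are complete.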

Microsoft COCO: Common Objects in Context. Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C Lawrence Zitnick. European Conference on Computer Vision (ECCV), 2014.

Usage

Train baseline models

Following MegEngine Models:

python3 train.py -f distill_configs/retinanet_res50_coco_1x_800size.py -n 8 \
                       -d /data/Datasets

train.py arguments:

  • -f, config file for the network.
  • -n, number of devices (GPUs) to use.
  • -w, pretrained backbone weights.
  • -b, training batch size, default is 2.
  • -d, dataset root, default is /data/datasets.
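
For readers who want to see how the options listed above fit together, here is a minimal argparse sketch. It is an assumption-laden illustration, not the actual parser in train.py: the destination names (config_file, num_gpus and so on) are hypothetical, and only the flags and the defaults stated above are taken from this document.

```python
import argparse


def make_parser():
    """Build a parser mirroring the documented train.py options (illustrative only)."""
    parser = argparse.ArgumentParser(description="Sketch of the documented train.py options")
    # The dest names below are hypothetical; the real script may use different ones.
    parser.add_argument("-f", dest="config_file", type=str, help="config file for the network")
    parser.add_argument("-n", dest="num_gpus", type=int, help="number of devices (GPUs) to use")
    parser.add_argument("-w", dest="weights", type=str, default=None, help="pretrained backbone weights")
    parser.add_argument("-b", dest="batch_size", type=int, default=2, help="training batch size")
    parser.add_argument("-d", dest="dataset_dir", type=str, default="/data/datasets", help="dataset root")
    return parser


# Parse the same arguments as the baseline training command above.
args = make_parser().parse_args(
    ["-f", "distill_configs/retinanet_res50_coco_1x_800size.py", "-n", "8", "-d", "/data/Datasets"]
)
print(args.config_file, args.num_gpus, args.batch_size, args.dataset_dir)
```

Note that the example commands pass /data/Datasets while the documented default is /data/datasets; on a case-sensitive file system these are different paths, so pass -d explicitly if your dataset does not live at the default location.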

Train with distillation

python3 train_distill_icd.py -f distill_configs/retinanet_res50_coco_1x_800size.py \
    -n 8 -l -d /data/Datasets -tf configs/retinanet_res101_coco_3x_800size.py \
    -df distill_configs/ICD.py \
    -tw _model_zoo/retinanet_res101_coco_3x_800size_41dot4_73b01887.pkl

train_distill_icd.py arguments:

  • -f, config file for the student network.
  • -w, pretrained backbone weights.
  • -tf, config file for the teacher network.
  • -tw, pretrained weights for the teacher.
  • -df, config file for the distillation module, distill_configs/ICD.py by default.
  • -l, use the inheriting strategy, which loads pretrained parameters.
  • -n, number of devices (GPUs) to use.
  • -b, training batch size, default is 2.
  • -d, dataset root, default is /data/datasets.

Note that backbone_pretrained is set in the distill configs, so backbone weights are loaded automatically and -w can be omitted. Checkpoints are saved to a log-xxx directory.
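
When launching several distillation runs, it can help to assemble the command programmatically. The sketch below only rearranges the flags documented above into an argument list; the function build_distill_command and its parameter names are hypothetical and not part of the repository.

```python
import shlex


def build_distill_command(student_cfg, teacher_cfg, teacher_weights,
                          num_gpus, dataset_dir,
                          distill_cfg="distill_configs/ICD.py", inherit=True):
    """Assemble a train_distill_icd.py invocation from the documented flags.

    Hypothetical helper for illustration; -w is left out because the distill
    configs already set backbone_pretrained.
    """
    cmd = [
        "python3", "train_distill_icd.py",
        "-f", student_cfg,
        "-n", str(num_gpus),
        "-d", dataset_dir,
        "-tf", teacher_cfg,
        "-df", distill_cfg,
        "-tw", teacher_weights,
    ]
    if inherit:
        cmd.append("-l")  # enable the inheriting strategy
    return cmd


cmd = build_distill_command(
    student_cfg="distill_configs/retinanet_res50_coco_1x_800size.py",
    teacher_cfg="configs/retinanet_res101_coco_3x_800size.py",
    teacher_weights="_model_zoo/retinanet_res101_coco_3x_800size_41dot4_73b01887.pkl",
    num_gpus=8,
    dataset_dir="/data/Datasets",
)
# Print a shell-safe version of the command; pass `cmd` to subprocess.run to execute it.
print(" ".join(shlex.quote(part) for part in cmd))
```

Keeping the command as a list avoids shell quoting problems, and swapping the three path arguments is enough to cover other student and teacher pairs.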

Evaluate

python3 test.py -f distill_configs/retinanet_res50_coco_3x_800size.py -n 8 \
     -w log-of-xxx/epoch_17.pkl -d /data/Datasets/

test.py arguments:

  • -f, config file for the network.
  • -n, number of devices (GPUs) to use.
  • -w, pretrained weights.
  • -d, dataset root, default is /data/datasets.

Examples and Results

Steps

  1. Download the pretrained teacher model to the _model_zoo directory.
  2. Train the baseline or distill with ICD.
  3. Evaluate checkpoints (use the last checkpoint by default).
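
Step 3 uses the last checkpoint. As a rough sketch, the newest checkpoint in a log directory could be located as follows; this assumes checkpoints are named epoch_<N>.pkl, as in the epoch_17.pkl example of the Evaluate section, and the helper latest_checkpoint is hypothetical rather than part of the repository.

```python
import re
from pathlib import Path

# Assumed naming scheme, inferred from the epoch_17.pkl example.
_EPOCH_PATTERN = re.compile(r"^epoch_(\d+)\.pkl$")


def latest_checkpoint(log_dir):
    """Return the epoch_<N>.pkl path with the largest N in log_dir, or None if none exist."""
    best_epoch, best_path = -1, None
    for path in Path(log_dir).iterdir():
        match = _EPOCH_PATTERN.match(path.name)
        if match and int(match.group(1)) > best_epoch:
            best_epoch, best_path = int(match.group(1)), path
    return best_path


# Example call: pass the result to test.py via -w.
# latest_checkpoint("log-of-xxx")
```

Comparing the epoch numbers as integers matters here: a plain string sort would rank epoch_9.pkl after epoch_17.pkl.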

Example of Common Detectors

RetinaNet

Command:

python3 train_distill_icd.py -f distill_configs/retinanet_res50_coco_1x_800size.py \
    -n 8 -l -d /data/Datasets -tf configs/retinanet_res101_coco_3x_800size.py \
    -df distill_configs/ICD.py \
    -tw _model_zoo/retinanet_res101_coco_3x_800size_41dot4_73b01887.pkl

FCOS

Command:

python3 train_distill_icd.py -f distill_configs/fcos_res50_coco_1x_800size.py \
    -n 8 -l -d /data/Datasets -tf configs/fcos_res101_coco_3x_800size.py \
    -df distill_configs/ICD.py \
    -tw _model_zoo/fcos_res101_coco_3x_800size_44dot3_f38e8df1.pkl

ATSS

Command:

python3 train_distill_icd.py -f distill_configs/atss_res50_coco_1x_800size.py \
    -n 8 -l -d /data/Datasets -tf configs/atss_res101_coco_3x_800size.py \
    -df distill_configs/ICD.py \
    -tw _model_zoo/atss_res101_coco_3x_800size_44dot7_9181687e.pkl

AP results on MS-COCO:

Model      Baseline  +ICD
RetinaNet  36.8      40.3
FCOS       40.0      43.3
ATSS       39.6      43.0
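
The improvement from ICD can be read off the table as the difference between the two AP columns. A small sketch of that arithmetic, with the values copied from the table above and the names RESULTS and ap_gains chosen only for illustration:

```python
# AP values copied from the table above: (baseline, with ICD).
RESULTS = {
    "RetinaNet": (36.8, 40.3),
    "FCOS": (40.0, 43.3),
    "ATSS": (39.6, 43.0),
}


def ap_gains(results):
    """Return the AP gain (ICD minus baseline) per model, rounded to one decimal."""
    return {model: round(icd - base, 1) for model, (base, icd) in results.items()}


GAINS = ap_gains(RESULTS)
for model, gain in GAINS.items():
    print("{}: +{:.1f} AP".format(model, gain))
```

This gives gains of 3.5, 3.3 and 3.4 AP for RetinaNet, FCOS and ATSS respectively.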

Notice

  • Results of this implementation are mainly for demonstration; please refer to the Detectron2 version for reproduction.

  • We simply adopt the hyperparameters from the Detectron2 version; further tuning could be helpful.

  • There is a known CUDA memory issue related to MegEngine: due to memory fragmentation, the actual memory consumption will be much larger than the theoretical value. This is expected to be fixed in a future version of MegEngine.

Acknowledgement

This repo is modified from MegEngine Models. We also refer to PyTorch, DETR and Detectron2 for some implementations.

License

This repo is licensed under the Apache License, Version 2.0 (the "License").

Citation

@inproceedings{kang2021icd,
    title={Instance-Conditional Knowledge Distillation for Object Detection},
    author={Zijian Kang and Peizhen Zhang and Xiangyu Zhang and Jian Sun and Nanning Zheng},
    year={2021},
    booktitle={NeurIPS},
}