Instance-conditional Knowledge Distillation for Object Detection

Related tags

Deep LearningICD
Overview

Instance-conditional Knowledge Distillation for Object Detection

This is a MegEngine implementation of the paper "Instance-conditional Knowledge Distillation for Object Detection", based on MegEngine Models.

The pytorch implementation based on detectron2 will be released soon.

Instance-Conditional Knowledge Distillation for Object Detection,
Zijian Kang, Peizhen Zhang, Xiangyu Zhang, Jian Sun, Nanning Zheng
In: Proc. Advances in Neural Information Processing Systems (NeurIPS), 2021
[arXiv]

Requirements

Installation

In order to run the code, please prepare a CUDA environment with:

  1. Install dependancies.
pip3 install --upgrade pip
pip3 install -r requirements.txt
  1. Prepare MS-COCO 2017 dataset,put it to a proper directory with the following structures:
/path/to/
    |->coco
    |    |annotations
    |    |train2017
    |    |val2017

Microsoft COCO: Common Objects in Context Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C Lawrence Zitnick. European Conference on Computer Vision (ECCV), 2014.

Usage

Train baseline models

Following MegEngine Models:

python3 train.py -f distill_configs/retinanet_res50_coco_1x_800size.py -n 8 \
                       -d /data/Datasets

train.py arguments:

  • -f, config file for the network.
  • -n, required devices(gpu).
  • -w, pretrained backbone weights.
  • -b, training batch size, default is 2.
  • -d, dataset root,default is /data/datasets.

Train with distillation

python3 train_distill_icd.py -f distill_configs/retinanet_res50_coco_1x_800size.py \ 
    -n 8 -l -d /data/Datasets -tf configs/retinanet_res101_coco_3x_800size.py \
    -df distill_configs/ICD.py \
    -tw _model_zoo/retinanet_res101_coco_3x_800size_41dot4_73b01887.pkl

train_distill_icd.py arguments:

  • -f, config file for the student network.
  • -w, pretrained backbone weights.
  • -tf, config file for the teacher network.
  • -tw, pretrained weights for the teacher.
  • -df, config file for the distillation module, distill_configs/ICD.py by default.
  • -l, use the inheriting strategy, load pretrained parameters.
  • -n, required devices(gpu).
  • -b, training batch size, default is 2.
  • -d, dataset root,default is /data/datasets.

Note that we set backbone_pretrained in distill configs, where backbone weights will be loaded automatically, that -w can be omitted. Checkpoints will be saved to a log-xxx directory.

Evaluate

python3 test.py -f distill_configs/retinanet_res50_coco_3x_800size.py -n 8 \
     -w log-of-xxx/epoch_17.pkl -d /data/Datasets/

test.py arguments:

  • -f, config file for the network.
  • -n, required devices(gpu).
  • -w, pretrained weights.
  • -d, dataset root,default is /data/datasets.

Examples and Results

Steps

  1. Download the pretrained teacher model to _model_zoo directory.
  2. Train baseline or distill with ICD.
  3. Evaluate checkpoints (use the last checkpoint by default).

Example of Common Detectors

RetinaNet

Command:

python3 train_distill_icd.py -f distill_configs/retinanet_res50_coco_1x_800size.py \
    -n 8 -l -d /data/Datasets -tf configs/retinanet_res101_coco_3x_800size.py \
    -df distill_configs/ICD.py \
    -tw _model_zoo/retinanet_res101_coco_3x_800size_41dot4_73b01887.pkl

FCOS

Command:

python3 train_distill_icd.py -f distill_configs/fcos_res50_coco_1x_800size.py \
    -n 8 -l -d /data/Datasets -tf configs/fcos_res101_coco_3x_800size.py \
    -df distill_configs/ICD.py \
    -tw _model_zoo/fcos_res101_coco_3x_800size_44dot3_f38e8df1.pkl

ATSS

Command:

python3 train_distill_icd.py -f distill_configs/atss_res50_coco_1x_800size.py \
    -n 8 -l -d /data/Datasets -tf configs/atss_res101_coco_3x_800size.py \
    -df distill_configs/ICD.py \
    -tw _model_zoo/atss_res101_coco_3x_800size_44dot7_9181687e.pkl

Results of AP in MS-COCO:

Model Baseline +ICD
Retinanet 36.8 40.3
FCOS 40.0 43.3
ATSS 39.6 43.0

Notice

  • Results of this implementation are mainly for demonstration, please refer to the Detectron2 version for reproduction.

  • We simply adopt the hyperparameter from Detectron2 version, further tunning could be helpful.

  • There is a known CUDA memory issue related to MegEngine: the actual memory consumption will be much larger than the theoretical value, due to the memory fragmentation. This is expected to be fixed in a future version of MegEngine.

Acknowledgement

This repo is modified from MegEngine Models. We also refer to Pytorch, DETR and Detectron2 for some implementations.

License

This repo is licensed under the Apache License, Version 2.0 (the "License").

Citation

@inproceedings{kang2021icd,
    title={Instance-conditional Distillation for Object Detection},
    author={Zijian Kang, Peizhen Zhang, Xiangyu Zhang, Jian Sun, Nanning Zheng},
    year={2021},
    booktitle={NeurIPS},
}
Owner
MEGVII Research
Power Human with AI. 持续创新拓展认知边界 非凡科技成就产品价值
MEGVII Research
A large-scale video dataset for the training and evaluation of 3D human pose estimation models

ASPset-510 (Australian Sports Pose Dataset) is a large-scale video dataset for the training and evaluation of 3D human pose estimation models. It contains 17 different amateur subjects performing 30

Aiden Nibali 25 Jun 20, 2021
For AILAB: Cross Lingual Retrieval on Yelp Search Engine

Cross-lingual Information Retrieval Model for Document Search Train Phase CUDA_VISIBLE_DEVICES="0,1,2,3" \ python -m torch.distributed.launch --nproc_

Chilia Waterhouse 104 Nov 12, 2022
PyTorch code for our paper "Attention in Attention Network for Image Super-Resolution"

Under construction... Attention in Attention Network for Image Super-Resolution (A2N) This repository is an PyTorch implementation of the paper "Atten

Haoyu Chen 71 Dec 30, 2022
MagFace: A Universal Representation for Face Recognition and Quality Assessment

MagFace MagFace: A Universal Representation for Face Recognition and Quality Assessment in IEEE Conference on Computer Vision and Pattern Recognition

Qiang Meng 523 Jan 05, 2023
The Fundamental Clustering Problems Suite (FCPS) summaries 54 state-of-the-art clustering algorithms, common cluster challenges and estimations of the number of clusters as well as the testing for cluster tendency.

FCPS Fundamental Clustering Problems Suite The package provides over sixty state-of-the-art clustering algorithms for unsupervised machine learning pu

9 Nov 27, 2022
[ICML 2021] Towards Understanding and Mitigating Social Biases in Language Models

Towards Understanding and Mitigating Social Biases in Language Models This repo contains code and data for evaluating and mitigating bias from generat

Paul Liang 42 Jan 03, 2023
Reduce end to end training time from days to hours (or hours to minutes), and energy requirements/costs by an order of magnitude using coresets and data selection.

COResets and Data Subset selection Reduce end to end training time from days to hours (or hours to minutes), and energy requirements/costs by an order

decile-team 244 Jan 09, 2023
General purpose Slater-Koster tight-binding code for electronic structure calculations

tight-binder Introduction General purpose tight-binding code for electronic structure calculations based on the Slater-Koster approximation. The code

9 Dec 15, 2022
On-device speech-to-intent engine powered by deep learning

Rhino Made in Vancouver, Canada by Picovoice Rhino is Picovoice's Speech-to-Intent engine. It directly infers intent from spoken commands within a giv

Picovoice 510 Dec 30, 2022
Unified Instance and Knowledge Alignment Pretraining for Aspect-based Sentiment Analysis

Unified Instance and Knowledge Alignment Pretraining for Aspect-based Sentiment Analysis Requirements python 3.7 pytorch-gpu 1.7 numpy 1.19.4 pytorch_

12 Oct 29, 2022
Python implementation of NARS (Non-Axiomatic-Reasoning-System)

Python implementation of NARS (Non-Axiomatic-Reasoning-System)

Bowen XU 11 Dec 20, 2022
Codes for “A Deeply Supervised Attention Metric-Based Network and an Open Aerial Image Dataset for Remote Sensing Change Detection”

DSAMNet The pytorch implementation for "A Deeply-supervised Attention Metric-based Network and an Open Aerial Image Dataset for Remote Sensing Change

Mengxi Liu 41 Dec 14, 2022
Simple image captioning model - CLIP prefix captioning.

Simple image captioning model - CLIP prefix captioning.

688 Jan 04, 2023
Hyperopt for solving CIFAR-100 with a convolutional neural network (CNN) built with Keras and TensorFlow, GPU backend

Hyperopt for solving CIFAR-100 with a convolutional neural network (CNN) built with Keras and TensorFlow, GPU backend This project acts as both a tuto

Guillaume Chevalier 103 Jul 22, 2022
Modeling Category-Selective Cortical Regions with Topographic Variational Autoencoders

Modeling Category-Selective Cortical Regions with Topographic Variational Autoencoders

1 Oct 11, 2021
ZeroVL - The official implementation of ZeroVL

This repository contains source code necessary to reproduce the results presente

31 Nov 04, 2022
This library provides an abstraction to perform Model Versioning using Weight & Biases.

Description This library provides an abstraction to perform Model Versioning using Weight & Biases. Features Version a new trained model Promote a mod

Hector Lopez Almazan 2 Jan 28, 2022
Code repository for the work "Multi-Domain Incremental Learning for Semantic Segmentation", accepted at WACV 2022

Multi-Domain Incremental Learning for Semantic Segmentation This is the Pytorch implementation of our work "Multi-Domain Incremental Learning for Sema

Pgxo20 24 Jan 02, 2023
Convolutional Neural Network to detect deforestation in the Amazon Rainforest

Convolutional Neural Network to detect deforestation in the Amazon Rainforest This project is part of my final work as an Aerospace Engineering studen

5 Feb 17, 2022
Rendering Point Clouds with Compute Shaders

Compute Shader Based Point Cloud Rendering This repository contains the source code to our techreport: Rendering Point Clouds with Compute Shaders and

Markus Schütz 460 Jan 05, 2023