Improving Object Detection by Label Assignment Distillation

Related tags

Deep LearningCoLAD
Overview

Improving Object Detection by Label Assignment Distillation

This is the official implementation of the WACV 2022 paper Improving Object Detection by Label Assignment Distillation. We provide the code for Label Assignement Distillation (LAD), training logs and several model checkpoints.

Table of Contents

  1. Introduction
  2. Installation
  3. Usage
  4. Experiments
  5. Citation

Introduction

This is the official repository for the paper Improving Object Detection by Label Assignment Distillation.

Soft Label Distillation concept (a) Label Assignment Distillation concept (b)
  • Distillation in Object Detection is typically achived by mimicking the teacher's output directly, such soft-label distillation or feature mimicking (Fig 1.a).
  • We propose the concept of Label Assignment Distillation (LAD), which solves the label assignment problems from distillation perspective, thus allowing the student learns from the teacher's knowledge without direct mimicking (Fig 1.b). LAD is very general, and applied to many dynamic label assignment methods. Following figure shows a concrete example of how to adopt Probabilistic Anchor Assignment (PAA) to LAD.
Probabilistic Anchor Assignment (PAA) Label Assignment Distillation (LAD) based on PAA
  • We demonstrate a number of advantages of LAD, notably that it is very simple and effective, flexible to use with most of detectors, and complementary to other distillation techniques.
  • Later, we introduced the Co-learning dynamic Label Assignment Distillation (CoLAD) to allow two networks to be trained mutually based on a dynamic switching criterion. We show that two networks trained with CoLAD are significantly better than if each was trained individually, given the same initialization.


Installation

  • Create environment:
conda create -n lad python=3.7 -y
conda activate lad
  • Install dependencies:
conda install pytorch=1.7.0 torchvision cudatoolkit=10.2 -c pytorch -y
pip install openmim future tensorboard sklearn timm==0.3.4
mim install mmcv-full==1.2.5
mim install mmdet==2.10.0
pip install -e ./

Usage

Train the model

#!/usr/bin/env bash
set -e
export GPUS=2
export CUDA_VISIBLE_DEVICES=0,2

CFG="configs/lad/paa_lad_r50_r101p1x_1x_coco.py"
WORKDIR="/checkpoints/lad/paa_lad_r50_r101p1x_1x_coco"

mim train mmdet $CFG --work-dir $WORKDIR \
    --gpus=$GPUS --launcher pytorch --seed 0 --deterministic

Test the model

#!/usr/bin/env bash
set -e
export GPUS=2
export CUDA_VISIBLE_DEVICES=0,2

CFG="configs/paa/paa_lad_r50_r101p1x_1x_coco.py"
CKPT="/checkpoints/lad/paa_lad_r50_r101p1x_1x_coco/epoch_12.pth"

mim test mmdet $CFG --checkpoint $CKPT --gpus $GPUS --launcher pytorch --eval bbox

Experiments

1. A Pilot Study - Preliminary Comparison.

Table 2: Compare the performance of the student PAA-R50 using Soft-Label, Label Assignment Distillation (LAD) and their combination (SoLAD) on COCO validation set.

Method Teacher Student gamma mAP Improve Config Download
Baseline None PAA-R50 (baseline) 2 40.4 - config model | log
Soft-Label-KL loss PAA-R101 PAA-R50 0.5 41.3 +0.9 config model | log
LAD (ours) PAA-R101 PAA-R50 2 41.6 +1.2 config model | log
SoLAD(ours) PAA-R101 PAA-R50 0.5 42.4 +2.0 config model | log

2. A Pilot Study - Does LAD need a bigger teacher network?

Table 3: Compare Soft-Label and Label Assignment Distillation (LAD) on COCO validation set. Teacher and student use ResNet50 and ResNet101 backbone, respectively. 2× denotes the 2× training schedule.

Method Teacher Student mAP Improve Config Download
Baseline None PAA-R50 40.4 - config model | log
Baseline (1x) None PAA-R101 42.6 - config model | log
Baseline (2x) None PAA-R101 43.5 +0.9 config model | log
Soft-Label PAA-R50 PAA-R101 40.4 -2.2 config model | log
LAD (our) PAA-R50 PAA-R101 43.3 +0.7 config model | log

3. Compare with State-ot-the-art Label Assignment methods

We use the PAA-R50 3x pretrained with multi-scale on COCO as the initial teacher. The teacher was evaluated with 43:3AP on the minval set. We train our network with the COP branch, and the post-processing steps are similar to PAA.

  • Table 7.1 Backbone ResNet-101, train 2x schedule, Multi-scale training. Results are evaluated on the COCO testset.
Method AP AP50 AP75 APs APm APl Download
FCOS 41.5 60.7 45.0 24.4 44.8 51.6
NoisyAnchor 41.8 61.1 44.9 23.4 44.9 52.9
FreeAnchor 43.1 62.2 46.4 24.5 46.1 54.8
SAPD 43.5 63.6 46.5 24.9 46.8 54.6
MAL 43.6 61.8 47.1 25.0 46.9 55.8
ATSS 43.6 62.1 47.4 26.1 47.0 53.6
AutoAssign 44.5 64.3 48.4 25.9 47.4 55.0
PAA 44.8 63.3 48.7 26.5 48.8 56.3
OTA 45.3 63.5 49.3 26.9 48.8 56.1
IQDet 45.1 63.4 49.3 26.7 48.5 56.6
CoLAD (ours) 46.0 64.4 50.6 27.9 49.9 57.3 config | model | log

4. Appendix - Ablation Study of Conditional Objectness Prediction (COP)

Table 1 - Appendix: Compare different auxiliary predictions: IoU, Implicit Object prediction (IOP), and Conditional Objectness prediction (COP), with ResNet-18 and ResNet-50 backbones. Experiments on COCO validate set.

IoU IOP COP ResNet-18 ResNet-50
✔️ 35.8 (config | model | log) 40.4 (config | model | log)
✔️ ✔️ 36.7 (config | model | log) 41.6 (config | model | log)
✔️ ✔️ 36.9 (config | model | log) 41.6 (config | model | log)
✔️ 36.6 (config | model | log) 41.1 (config | model | log)
✔️ 36.9 (config | model | log) 41.2 (config | model | log)

Citation

Please cite the paper in your publications if it helps your research:

@misc{nguyen2021improving,
      title={Improving Object Detection by Label Assignment Distillation}, 
      author={Chuong H. Nguyen and Thuy C. Nguyen and Tuan N. Tang and Nam L. H. Phan},
      year={2021},
      eprint={2108.10520},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
  • On Sep 25 2021, we found that there is a concurrent (unpublished) work from Jianfend Wang, that shares the key idea about Label Assignment Distillation. However, both of our works are independent and original. We would like to acknowledge his work and thank for his help to clarify the issue.
Owner
Cybercore Co. Ltd
Cybercore Co. Ltd
A very impractical 3D rendering engine that runs in the python terminal.

Terminal-3D-Render A very impractical 3D rendering engine that runs in the python terminal. do NOT try to run this program using the standard python I

23 Dec 31, 2022
TransPrompt - Towards an Automatic Transferable Prompting Framework for Few-shot Text Classification

TransPrompt This code is implement for our EMNLP 2021's paper 《TransPrompt:Towards an Automatic Transferable Prompting Framework for Few-shot Text Cla

WangJianing 23 Dec 21, 2022
BRNet - code for Automated assessment of BI-RADS categories for ultrasound images using multi-scale neural networks with an order-constrained loss function

BRNet code for "Automated assessment of BI-RADS categories for ultrasound images using multi-scale neural networks with an order-constrained loss func

Yong Pi 2 Mar 09, 2022
Prososdy Morph: A python library for manipulating pitch and duration in an algorithmic way, for resynthesizing speech.

ProMo (Prosody Morph) Questions? Comments? Feedback? Chat with us on gitter! A library for manipulating pitch and duration in an algorithmic way, for

Tim 71 Jan 02, 2023
Unsupervised Image-to-Image Translation

UNIT: UNsupervised Image-to-image Translation Networks Imaginaire Repository We have a reimplementation of the UNIT method that is more performant. It

Ming-Yu Liu 劉洺堉 1.9k Dec 26, 2022
AgeGuesser: deep learning based age estimation system. Powered by EfficientNet and Yolov5

AgeGuesser AgeGuesser is an end-to-end, deep-learning based Age Estimation system, presented at the CAIP 2021 conference. You can find the related pap

5 Nov 10, 2022
Pytorch code for paper "Image Compressed Sensing Using Non-local Neural Network" TMM 2021.

NL-CSNet-Pytorch Pytorch code for paper "Image Compressed Sensing Using Non-local Neural Network" TMM 2021. Note: this repo only shows the strategy of

WenxueCui 7 Nov 07, 2022
Conservative and Adaptive Penalty for Model-Based Safe Reinforcement Learning

Conservative and Adaptive Penalty for Model-Based Safe Reinforcement Learning This is the official repository for Conservative and Adaptive Penalty fo

7 Nov 22, 2022
CoCosNet v2: Full-Resolution Correspondence Learning for Image Translation

CoCosNet v2: Full-Resolution Correspondence Learning for Image Translation (CVPR 2021, oral presentation) CoCosNet v2: Full-Resolution Correspondence

Microsoft 308 Dec 07, 2022
Unofficial implementation of MLP-Mixer: An all-MLP Architecture for Vision

MLP-Mixer: An all-MLP Architecture for Vision This repo contains PyTorch implementation of MLP-Mixer: An all-MLP Architecture for Vision. Usage : impo

Rishikesh (ऋषिकेश) 175 Dec 23, 2022
Code for the AAAI-2022 paper: Imagine by Reasoning: A Reasoning-Based Implicit Semantic Data Augmentation for Long-Tailed Classification

Imagine by Reasoning: A Reasoning-Based Implicit Semantic Data Augmentation for Long-Tailed Classification (AAAI 2022) Prerequisite PyTorch = 1.2.0 P

16 Dec 14, 2022
Multi-label Co-regularization for Semi-supervised Facial Action Unit Recognition (NeurIPS 2019)

MLCR This is the source code for paper Multi-label Co-regularization for Semi-supervised Facial Action Unit Recognition. Xuesong Niu, Hu Han, Shiguang

Edson-Niu 60 Nov 29, 2022
Implementation of Kronecker Attention in Pytorch

Kronecker Attention Pytorch Implementation of Kronecker Attention in Pytorch. Results look less than stellar, but if someone found some context where

Phil Wang 16 May 06, 2022
Official Pytorch implementation of "Beyond Static Features for Temporally Consistent 3D Human Pose and Shape from a Video", CVPR 2021

TCMR: Beyond Static Features for Temporally Consistent 3D Human Pose and Shape from a Video Qualtitative result Paper teaser video Introduction This r

Hongsuk Choi 215 Jan 06, 2023
GAN Image Generator and Characterwise Image Recognizer with python

MODEL SUMMARY 모델의 구조는 크게 6단계로 나뉩니다. STEP 0: Input Image Predict 할 이미지를 모델에 입력합니다. STEP 1: Make Black and White Image STEP 1 은 입력받은 이미지의 글자를 흑색으로, 배경을

Juwan HAN 1 Feb 09, 2022
LeafSnap replicated using deep neural networks to test accuracy compared to traditional computer vision methods.

Deep-Leafsnap Convolutional Neural Networks have become largely popular in image tasks such as image classification recently largely due to to Krizhev

Sujith Vishwajith 48 Nov 27, 2022
Pytorch Implementation of Residual Vision Transformers(ResViT)

ResViT Official Pytorch Implementation of Residual Vision Transformers(ResViT) which is described in the following paper: Onat Dalmaz and Mahmut Yurt

ICON Lab 41 Dec 08, 2022
Hand Gesture Volume Control | Open CV | Computer Vision

Gesture Volume Control Hand Gesture Volume Control | Open CV | Computer Vision Use gesture control to change the volume of a computer. First we look i

Jhenil Parihar 3 Jun 15, 2022
This repository contains all code and data for the Inside Out Visual Place Recognition task

Inside Out Visual Place Recognition This repository contains code and instructions to reproduce the results for the Inside Out Visual Place Recognitio

15 May 21, 2022
Rapid experimentation and scaling of deep learning models on molecular and crystal graphs.

LitMatter A template for rapid experimentation and scaling deep learning models on molecular and crystal graphs. How to use Clone this repository and

Nathan Frey 32 Dec 06, 2022