Improving Object Detection by Label Assignment Distillation

Related tags

Deep LearningCoLAD
Overview

Improving Object Detection by Label Assignment Distillation

This is the official implementation of the WACV 2022 paper Improving Object Detection by Label Assignment Distillation. We provide the code for Label Assignement Distillation (LAD), training logs and several model checkpoints.

Table of Contents

  1. Introduction
  2. Installation
  3. Usage
  4. Experiments
  5. Citation

Introduction

This is the official repository for the paper Improving Object Detection by Label Assignment Distillation.

Soft Label Distillation concept (a) Label Assignment Distillation concept (b)
  • Distillation in Object Detection is typically achived by mimicking the teacher's output directly, such soft-label distillation or feature mimicking (Fig 1.a).
  • We propose the concept of Label Assignment Distillation (LAD), which solves the label assignment problems from distillation perspective, thus allowing the student learns from the teacher's knowledge without direct mimicking (Fig 1.b). LAD is very general, and applied to many dynamic label assignment methods. Following figure shows a concrete example of how to adopt Probabilistic Anchor Assignment (PAA) to LAD.
Probabilistic Anchor Assignment (PAA) Label Assignment Distillation (LAD) based on PAA
  • We demonstrate a number of advantages of LAD, notably that it is very simple and effective, flexible to use with most of detectors, and complementary to other distillation techniques.
  • Later, we introduced the Co-learning dynamic Label Assignment Distillation (CoLAD) to allow two networks to be trained mutually based on a dynamic switching criterion. We show that two networks trained with CoLAD are significantly better than if each was trained individually, given the same initialization.


Installation

  • Create environment:
conda create -n lad python=3.7 -y
conda activate lad
  • Install dependencies:
conda install pytorch=1.7.0 torchvision cudatoolkit=10.2 -c pytorch -y
pip install openmim future tensorboard sklearn timm==0.3.4
mim install mmcv-full==1.2.5
mim install mmdet==2.10.0
pip install -e ./

Usage

Train the model

#!/usr/bin/env bash
set -e
export GPUS=2
export CUDA_VISIBLE_DEVICES=0,2

CFG="configs/lad/paa_lad_r50_r101p1x_1x_coco.py"
WORKDIR="/checkpoints/lad/paa_lad_r50_r101p1x_1x_coco"

mim train mmdet $CFG --work-dir $WORKDIR \
    --gpus=$GPUS --launcher pytorch --seed 0 --deterministic

Test the model

#!/usr/bin/env bash
set -e
export GPUS=2
export CUDA_VISIBLE_DEVICES=0,2

CFG="configs/paa/paa_lad_r50_r101p1x_1x_coco.py"
CKPT="/checkpoints/lad/paa_lad_r50_r101p1x_1x_coco/epoch_12.pth"

mim test mmdet $CFG --checkpoint $CKPT --gpus $GPUS --launcher pytorch --eval bbox

Experiments

1. A Pilot Study - Preliminary Comparison.

Table 2: Compare the performance of the student PAA-R50 using Soft-Label, Label Assignment Distillation (LAD) and their combination (SoLAD) on COCO validation set.

Method Teacher Student gamma mAP Improve Config Download
Baseline None PAA-R50 (baseline) 2 40.4 - config model | log
Soft-Label-KL loss PAA-R101 PAA-R50 0.5 41.3 +0.9 config model | log
LAD (ours) PAA-R101 PAA-R50 2 41.6 +1.2 config model | log
SoLAD(ours) PAA-R101 PAA-R50 0.5 42.4 +2.0 config model | log

2. A Pilot Study - Does LAD need a bigger teacher network?

Table 3: Compare Soft-Label and Label Assignment Distillation (LAD) on COCO validation set. Teacher and student use ResNet50 and ResNet101 backbone, respectively. 2× denotes the 2× training schedule.

Method Teacher Student mAP Improve Config Download
Baseline None PAA-R50 40.4 - config model | log
Baseline (1x) None PAA-R101 42.6 - config model | log
Baseline (2x) None PAA-R101 43.5 +0.9 config model | log
Soft-Label PAA-R50 PAA-R101 40.4 -2.2 config model | log
LAD (our) PAA-R50 PAA-R101 43.3 +0.7 config model | log

3. Compare with State-ot-the-art Label Assignment methods

We use the PAA-R50 3x pretrained with multi-scale on COCO as the initial teacher. The teacher was evaluated with 43:3AP on the minval set. We train our network with the COP branch, and the post-processing steps are similar to PAA.

  • Table 7.1 Backbone ResNet-101, train 2x schedule, Multi-scale training. Results are evaluated on the COCO testset.
Method AP AP50 AP75 APs APm APl Download
FCOS 41.5 60.7 45.0 24.4 44.8 51.6
NoisyAnchor 41.8 61.1 44.9 23.4 44.9 52.9
FreeAnchor 43.1 62.2 46.4 24.5 46.1 54.8
SAPD 43.5 63.6 46.5 24.9 46.8 54.6
MAL 43.6 61.8 47.1 25.0 46.9 55.8
ATSS 43.6 62.1 47.4 26.1 47.0 53.6
AutoAssign 44.5 64.3 48.4 25.9 47.4 55.0
PAA 44.8 63.3 48.7 26.5 48.8 56.3
OTA 45.3 63.5 49.3 26.9 48.8 56.1
IQDet 45.1 63.4 49.3 26.7 48.5 56.6
CoLAD (ours) 46.0 64.4 50.6 27.9 49.9 57.3 config | model | log

4. Appendix - Ablation Study of Conditional Objectness Prediction (COP)

Table 1 - Appendix: Compare different auxiliary predictions: IoU, Implicit Object prediction (IOP), and Conditional Objectness prediction (COP), with ResNet-18 and ResNet-50 backbones. Experiments on COCO validate set.

IoU IOP COP ResNet-18 ResNet-50
✔️ 35.8 (config | model | log) 40.4 (config | model | log)
✔️ ✔️ 36.7 (config | model | log) 41.6 (config | model | log)
✔️ ✔️ 36.9 (config | model | log) 41.6 (config | model | log)
✔️ 36.6 (config | model | log) 41.1 (config | model | log)
✔️ 36.9 (config | model | log) 41.2 (config | model | log)

Citation

Please cite the paper in your publications if it helps your research:

@misc{nguyen2021improving,
      title={Improving Object Detection by Label Assignment Distillation}, 
      author={Chuong H. Nguyen and Thuy C. Nguyen and Tuan N. Tang and Nam L. H. Phan},
      year={2021},
      eprint={2108.10520},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
  • On Sep 25 2021, we found that there is a concurrent (unpublished) work from Jianfend Wang, that shares the key idea about Label Assignment Distillation. However, both of our works are independent and original. We would like to acknowledge his work and thank for his help to clarify the issue.
Owner
Cybercore Co. Ltd
Cybercore Co. Ltd
Repositório da disciplina de APC, no segundo semestre de 2021

NOTAS FINAIS: https://github.com/fabiommendes/apc2018/blob/master/nota-final.pdf Algoritmos e Programação de Computadores Este é o Git da disciplina A

16 Dec 16, 2022
Official PyTorch implementation for Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers, a novel method to visualize any Transformer-based network. Including examples for DETR, VQA.

PyTorch Implementation of Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers 1 Using Colab Please notic

Hila Chefer 489 Jan 07, 2023
Official Pytorch Implementation of 3DV2021 paper: SAFA: Structure Aware Face Animation.

SAFA: Structure Aware Face Animation (3DV2021) Official Pytorch Implementation of 3DV2021 paper: SAFA: Structure Aware Face Animation. Getting Started

QiulinW 122 Dec 23, 2022
A python implementation of Physics-informed Spline Learning for nonlinear dynamics discovery

PiSL A python implementation of Physics-informed Spline Learning for nonlinear dynamics discovery. Sun, F., Liu, Y. and Sun, H., 2021. Physics-informe

Fangzheng (Andy) Sun 8 Jul 13, 2022
TransferNet: Learning Transferrable Knowledge for Semantic Segmentation with Deep Convolutional Neural Network

TransferNet: Learning Transferrable Knowledge for Semantic Segmentation with Deep Convolutional Neural Network Created by Seunghoon Hong, Junhyuk Oh,

42 Jun 29, 2022
Implementation of ToeplitzLDA for spatiotemporal stationary time series data.

Code for the ToeplitzLDA classifier proposed in here. The classifier conforms sklearn and can be used as a drop-in replacement for other LDA classifiers. For in-depth usage refer to the learning from

Jan Sosulski 5 Nov 07, 2022
This repo is customed for VisDrone.

Object Detection for VisDrone(无人机航拍图像目标检测) My environment 1、Windows10 (Linux available) 2、tensorflow = 1.12.0 3、python3.6 (anaconda) 4、cv2 5、ensemble

53 Jul 17, 2022
Randstad Artificial Intelligence Challenge (powered by VGEN). Soluzione proposta da Stefano Fiorucci (anakin87) - primo classificato

Randstad Artificial Intelligence Challenge (powered by VGEN) Soluzione proposta da Stefano Fiorucci (anakin87) - primo classificato Struttura director

Stefano Fiorucci 1 Nov 13, 2021
Painting app using Python machine learning and vision technology.

AI Painting App We are making an app that will track our hand and helps us to draw from that. We will be using the advance knowledge of Machine Learni

Badsha Laskar 3 Oct 03, 2022
Implementation of Uniformer, a simple attention and 3d convolutional net that achieved SOTA in a number of video classification tasks

Uniformer - Pytorch Implementation of Uniformer, a simple attention and 3d convolutional net that achieved SOTA in a number of video classification ta

Phil Wang 90 Nov 24, 2022
A naive ROS interface for visualDet3D.

YOLO3D ROS Node This repo contains a Monocular 3D detection Ros node. Base on https://github.com/Owen-Liuyuxuan/visualDet3D All parameters are exposed

Yuxuan Liu 19 Oct 08, 2022
Brain Tumor Detection with Tensorflow Neural Networks.

Brain-Tumor-Detection A convolutional neural network model built with Tensorflow & Keras to detect brain tumor and its different variants. Data of the

404ErrorNotFound 5 Aug 23, 2022
Rethinking Space-Time Networks with Improved Memory Coverage for Efficient Video Object Segmentation

STCN Rethinking Space-Time Networks with Improved Memory Coverage for Efficient Video Object Segmentation Ho Kei Cheng, Yu-Wing Tai, Chi-Keung Tang [a

Rex Cheng 456 Dec 12, 2022
RefineNet: Multi-Path Refinement Networks for High-Resolution Semantic Segmentation

Multipath RefineNet A MATLAB based framework for semantic image segmentation and general dense prediction tasks on images. This is the source code for

Guosheng Lin 575 Dec 06, 2022
Selfplay In MultiPlayer Environments

This project allows you to train AI agents on custom-built multiplayer environments, through self-play reinforcement learning.

200 Jan 08, 2023
PyExplainer: A Local Rule-Based Model-Agnostic Technique (Explainable AI)

PyExplainer PyExplainer is a local rule-based model-agnostic technique for generating explanations (i.e., why a commit is predicted as defective) of J

AI Wizards for Software Management (AWSM) Research Group 14 Nov 13, 2022
An extremely simple, intuitive, hardware-friendly, and well-performing network structure for LiDAR semantic segmentation on 2D range image. IROS21

FIDNet_SemanticKITTI Motivation Implementing complicated network modules with only one or two points improvement on hardware is tedious. So here we pr

YimingZhao 54 Dec 12, 2022
Authors implementation of LieTransformer: Equivariant Self-Attention for Lie Groups

LieTransformer This repository contains the implementation of the LieTransformer used for experiments in the paper LieTransformer: Equivariant self-at

35 Oct 18, 2022
Official repository for MixFaceNets: Extremely Efficient Face Recognition Networks

MixFaceNets This is the official repository of the paper: MixFaceNets: Extremely Efficient Face Recognition Networks. (Accepted in IJCB2021) https://i

Fadi Boutros 51 Dec 13, 2022
Ensembling Off-the-shelf Models for GAN Training

Data-Efficient GANs with DiffAugment project | paper | datasets | video | slides Generated using only 100 images of Obama, grumpy cats, pandas, the Br

MIT HAN Lab 1.2k Dec 26, 2022