A Fast Knowledge Distillation Framework for Visual Recognition

Overview

FKD: A Fast Knowledge Distillation Framework for Visual Recognition

Official PyTorch implementation of paper A Fast Knowledge Distillation Framework for Visual Recognition. Zhiqiang Shen and Eric Xing from CMU and MUZUAI.

Abstract

Knowledge Distillation (KD) has been recognized as a useful tool in many visual tasks, such as the supervised classification and self-supervised representation learning, while the main drawback of a vanilla KD framework lies in its mechanism that most of the computational overhead is consumed on forwarding through the giant teacher networks, which makes the whole learning procedure in a low-efficient and costly manner. In this work, we propose a Fast Knowledge Distillation (FKD) framework that simulates the distillation training phase and generates soft labels following the multi-crop KD procedure, meanwhile enjoying the faster training speed than ReLabel as we have no post-processes like RoI align and softmax operations. Our FKD is even more efficient than the conventional classification framework when employing multi-crop in the same image for data loading. We achieve 79.8% using ResNet-50 on ImageNet-1K, outperforming ReLabel by ~1.0% while being faster. We also demonstrate the efficiency advantage of FKD on the self-supervised learning task.

Supervised Training

Preparation

FKD Training on CNNs

To train a model, run train_FKD.py with the desired model architecture and the path to the soft label and ImageNet dataset:

python train_FKD.py -a resnet50 --lr 0.1 --num_crops 4 -b 1024 --cos --softlabel_path [soft label path] [imagenet-folder with train and val folders]

For --softlabel_path, simply use format as ./FKD_soft_label_500_crops_marginal_smoothing_k_5

Multi-processing distributed training is supported, please refer to official PyTorch ImageNet training code for details.

Evaluation

python train_FKD.py -a resnet50 -e --resume [model path] [imagenet-folder with train and val folders]

Trained Models

Model accuracy (Top-1) weights configurations
ReLabel ResNet-50 78.9 -- --
FKD ResNet-50 79.8 link Table 10 in paper
ReLabel ResNet-101 80.7 -- --
FKD ResNet-101 81.7 link Table 10 in paper

FKD Training on ViT/DeiT and SReT

To train a ViT model, run train_ViT_FKD.py with the desired model architecture and the path to the soft label and ImageNet dataset:

cd train_ViT
python train_ViT_FKD.py -a SReT_LT --lr 0.002 --wd 0.05 --num_crops 4 -b 1024 --cos --softlabel_path [soft label path] [imagenet-folder with train and val folders]

For the instructions of SReT_LT model, please refer to SReT for details.

Evaluation

python train_ViT_FKD.py -a SReT_LT -e --resume [model path] [imagenet-folder with train and val folders]

Trained Models

Model FLOPs #params accuracy (Top-1) weights configurations
DeiT-T-distill 1.3B 5.7M 74.5 -- --
FKD ViT/DeiT-T 1.3B 5.7M 75.2 link Table 11 in paper
SReT-LT-distill 1.2B 5.0M 77.7 -- --
FKD SReT-LT 1.2B 5.0M 78.7 link Table 11 in paper

Fast MEAL V2

Please see MEAL V2 for the instructions to run FKD with MEAL V2.

Self-supervised Representation Learning Using FKD

Please see FKD-SSL for the instructions to run FKD code for SSL task.

Citation

@article{shen2021afast,
      title={A Fast Knowledge Distillation Framework for Visual Recognition}, 
      author={Zhiqiang Shen and Eric Xing},
      year={2021},
      journal={arXiv preprint arXiv:2112.01528}
}

Contact

Zhiqiang Shen (zhiqians at andrew.cmu.edu or zhiqiangshen0214 at gmail.com)

Owner
Zhiqiang Shen
Zhiqiang Shen
Dynamic Neural Representational Decoders for High-Resolution Semantic Segmentation

Dynamic Neural Representational Decoders for High-Resolution Semantic Segmentation Requirements This repository needs mmsegmentation Training To train

Adelaide Intelligent Machines (AIM) Group 7 Sep 12, 2022
Collection of generative models in Pytorch version.

pytorch-generative-model-collections Original : [Tensorflow version] Pytorch implementation of various GANs. This repository was re-implemented with r

Hyeonwoo Kang 2.4k Dec 31, 2022
Python library for computer vision labeling tasks. The core functionality is to translate bounding box annotations between different formats-for example, from coco to yolo.

PyLabel pip install pylabel PyLabel is a Python package to help you prepare image datasets for computer vision models including PyTorch and YOLOv5. I

PyLabel Project 176 Jan 01, 2023
Implementation of UNET architecture for Image Segmentation.

Semantic Segmentation using UNET This is the implementation of UNET on Carvana Image Masking Kaggle Challenge About the Dataset This dataset contains

Anushka agarwal 4 Dec 21, 2021
Code associated with the paper "Deep Optics for Single-shot High-dynamic-range Imaging"

Deep Optics for Single-shot High-dynamic-range Imaging Code associated with the paper "Deep Optics for Single-shot High-dynamic-range Imaging" CVPR, 2

Stanford Computational Imaging Lab 40 Dec 12, 2022
DeepMind's software stack for physics-based simulation and Reinforcement Learning environments, using MuJoCo.

dm_control: DeepMind Infrastructure for Physics-Based Simulation. DeepMind's software stack for physics-based simulation and Reinforcement Learning en

DeepMind 3k Dec 31, 2022
source code for https://arxiv.org/abs/2005.11248 "Accelerating Antimicrobial Discovery with Controllable Deep Generative Models and Molecular Dynamics"

Accelerating Antimicrobial Discovery with Controllable Deep Generative Models and Molecular Dynamics This work will be published in Nature Biomedical

International Business Machines 71 Nov 15, 2022
一个多模态内容理解算法框架,其中包含数据处理、预训练模型、常见模型以及模型加速等模块。

Overview 架构设计 插件介绍 安装使用 框架简介 方便使用,支持多模态,多任务的统一训练框架 能力列表: bert + 分类任务 自定义任务训练(插件注册) 框架设计 框架采用分层的思想组织模型训练流程。 DATA 层负责读取用户数据,根据 field 管理数据。 Parser 层负责转换原

Tencent 265 Dec 22, 2022
A script that trains a model to recognize handwritten digits using the MNIST data set.

handwritten-digits-recognition A script that trains a model to recognize handwritten digits using the MNIST data set. Then it loads external files and

Hamza Sayih 1 Oct 30, 2021
It is an open dataset for object detection in remote sensing images.

RSOD-Dataset It is an open dataset for object detection in remote sensing images. The dataset includes aircraft, oiltank, playground and overpass. The

136 Dec 08, 2022
Polynomial-time Meta-Interpretive Learning

Louise - polynomial-time Program Learning Getting help with Louise Louise's author can be reached by email at Stassa Patsantzis 64 Dec 26, 2022

A Blender python script for getting asset browser custom preview images for objects and collections.

asset_snapshot A Blender python script for getting asset browser custom preview images for objects and collections. Installation: Click the code butto

Johnny Matthews 44 Nov 29, 2022
Wanli Li and Tieyun Qian: Exploit a Multi-head Reference Graph for Semi-supervised Relation Extraction, IJCNN 2021

MRefG Wanli Li and Tieyun Qian: "Exploit a Multi-head Reference Graph for Semi-supervised Relation Extraction", IJCNN 2021 1. Requirements To reproduc

万理 5 Jul 26, 2022
A code implementation of AC-GC: Activation Compression with Guaranteed Convergence, in NeurIPS 2021.

Code For AC-GC: Lossy Activation Compression with Guaranteed Convergence This code is intended to be used as a supplemental material for submission to

Dave Evans 2 Nov 01, 2022
PyTorch implementation of Towards Accurate Alignment in Real-time 3D Hand-Mesh Reconstruction (ICCV 2021).

Towards Accurate Alignment in Real-time 3D Hand-Mesh Reconstruction Introduction This is official PyTorch implementation of Towards Accurate Alignment

TANG Xiao 96 Dec 27, 2022
Official repository of Semantic Image Matting

Semantic Image Matting This is the official repository of Semantic Image Matting (CVPR2021). Overview Natural image matting separates the foreground f

192 Dec 29, 2022
Easy genetic ancestry predictions in Python

ezancestry Easily visualize your direct-to-consumer genetics next to 2500+ samples from the 1000 genomes project. Evaluate the performance of a custom

Kevin Arvai 38 Jan 02, 2023
Self-Supervised Monocular DepthEstimation with Internal Feature Fusion(arXiv), BMVC2021

DIFFNet This repo is for Self-Supervised Monocular DepthEstimation with Internal Feature Fusion(arXiv), BMVC2021 A new backbone for self-supervised de

Hang 94 Dec 25, 2022
Wordplay, an artificial Intelligence based crossword puzzle solver.

Wordplay, AI based crossword puzzle solver A crossword is a word puzzle that usually takes the form of a square or a rectangular grid of white- and bl

Vaibhaw 4 Nov 16, 2022
A PyTorch implementation of "Pathfinder Discovery Networks for Neural Message Passing"

A PyTorch implementation of "Pathfinder Discovery Networks for Neural Message Passing" (WebConf 2021). Abstract In this work we propose Pathfind

Benedek Rozemberczki 49 Dec 01, 2022