Accepted at the IEEE Winter Conference on Applications of Computer Vision (WACV) 2022

Last update: Nov 17, 2022

Overview

SSKT (Accepted at WACV 2022)

Concept map

Dataset

Image dataset
- CIFAR10 (torchvision)
- CIFAR100 (torchvision)
- STL10 (torchvision)
- Pascal VOC (torchvision)
- ImageNet (I) (torchvision)
- Places365 (P)
Video dataset
- UCF101
- HMDB51

Pre-trained models

ImageNet
- We used the pre-trained models provided by torchvision.
- ResNet18 and ResNet50
Places365
- ResNet50

Option

isSource
- Single Source Transfer Module
- Without a transfer module (X), only the auxiliary layer is used
transfer_module
- Single Source Transfer Module
multi_source
- Multiple-task transfer learning
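
The training commands below pass these three options without values, which is consistent with boolean switches. A minimal sketch of how such flags could be declared with argparse follows; only the flag names come from this README, while the function name build_parser, the help strings, and the store_true behavior are assumptions rather than the repository's actual main.py.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser: only the three flag names are taken from the README."""
    parser = argparse.ArgumentParser(description="SSKT option flags (illustrative sketch)")
    # Assumption: each option is a boolean switch that defaults to False.
    parser.add_argument("--isSource", action="store_true",
                        help="single source transfer (see Option section)")
    parser.add_argument("--transfer_module", action="store_true",
                        help="enable the transfer module")
    parser.add_argument("--multi_source", action="store_true",
                        help="multiple task transfer learning")
    return parser


if __name__ == "__main__":
    # Parse an explicit list instead of sys.argv so the sketch runs standalone.
    args = build_parser().parse_args(["--isSource", "--multi_source"])
    print(args.isSource, args.transfer_module, args.multi_source)
```

With switches like these, omitting a flag simply leaves the corresponding feature off, which matches how the flags are appended at the end of the commands in the Training section.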

Training

2D PreLeKT

 python main.py --model resnet20 --source_arch resnet50 --sourceKind places365 --result /raid/video_data/output/PreLeKT --dataset stl10 --lr 0.1 --wd 5e-4 --epochs 200 --classifier_loss_method ce --auxiliary_loss_method kd --isSource --multi_source --transfer_module
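
The command above selects ce for the classifier loss and kd for the auxiliary loss. As a rough illustration of what those two terms usually mean, here is a pure-Python sketch of cross-entropy plus a temperature-scaled distillation loss in the style of [1]. The function names, the temperature of 4.0, the aux_weight factor, and the assumption that the auxiliary output is matched against a source network's output are all illustrative choices, not the repository's implementation.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Standard cross-entropy for a single example with an integer label."""
    return -math.log(softmax(logits)[label])


def kd_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on temperature-softened outputs, scaled by T^2."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(pt * (math.log(pt) - math.log(ps))
             for pt, ps in zip(p_teacher, p_student) if pt > 0.0)
    return kl * temperature ** 2


def total_loss(target_logits, label, aux_logits, source_logits,
               aux_weight=1.0, temperature=4.0):
    """Assumed combination: classifier CE plus a weighted auxiliary KD term."""
    return (cross_entropy(target_logits, label)
            + aux_weight * kd_loss(aux_logits, source_logits, temperature))


if __name__ == "__main__":
    # Toy logits chosen only to exercise the functions.
    print(total_loss([2.0, 0.5, 0.1], 0, [1.0, 2.0, 3.0], [1.5, 2.5, 2.0]))
```

The T^2 factor keeps the gradient magnitude of the softened term comparable to that of the hard-label term, which is the convention suggested in [1].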

3D PreLeKT

 python main.py --root_path /raid/video_data/ucf101/ --video_path frames --annotation_path ucf101_01.json --result_path /raid/video_data/output/PreLeKT --n_classes 400 --n_finetune_classes 101 --model resnet --model_depth 18 --resnet_shortcut A --batch_size 128 --n_threads 4 --pretrain_path /nvadmin/Pretrained_model/resnet-18-kinetics.pth --ft_begin_index 4 --dataset ucf101 --isSource --transfer_module --multi_source

Experiment

Comparison with other knowledge transfer methods.

For further analysis of SSKT, we compared its performance with that of typical knowledge transfer methods, namely KD [1] and DML [3].
For KD, the training details were the same as in [1]; for DML, training was performed in the same way as in [3].
For 3D-CNN-based action classification [2], results for both training from scratch and fine-tuning are included.
In the table, T_t denotes the target task and T_s the source task(s) used by SSKT (P: Places365, I: ImageNet).

T_t	Model	KD	DML	SSKT (T_s)
CIFAR10	ResNet20	91.75±0.24	92.37±0.15	92.46±0.15 (P+I)
CIFAR10	ResNet32	92.61±0.31	93.26±0.21	93.38±0.02 (P+I)
CIFAR100	ResNet20	68.66±0.24	69.48±0.05	68.63±0.12 (I)
CIFAR100	ResNet32	70.5±0.05	71.9±0.03	70.94±0.36 (P+I)
STL10	ResNet20	77.67±1.41	78.23±1.23	84.56±0.35 (P+I)
STL10	ResNet32	76.07±0.67	77.14±1.64	83.68±0.28 (I)
VOC	ResNet18	64.11±0.18	39.89±0.07	76.42±0.06 (P+I)
VOC	ResNet34	64.57±0.12	39.97±0.16	77.02±0.02 (P+I)
VOC	ResNet50	62.39±0.6	39.65±0.03	77.1±0.14 (P+I)
UCF101	3D ResNet18(scratch)	-	13.8	52.19 (P+I)
UCF101	3D ResNet18(fine-tuning)	-	83.95	84.58 (P)
HMDB51	3D ResNet18(scratch)	-	3.01	17.91 (P+I)
HMDB51	3D ResNet18(fine-tuning)	-	56.44	57.82 (P)

The performance comparison with MAXL[4], another auxiliary learning-based transfer learning method

The training setups of MAXL and of our experiments differ in whether a cosine annealing scheduler and focal loss are used. In the Loss column, F denotes focal loss and CE denotes cross-entropy.
With VGG16, SSKT outperformed MAXL in both settings. With ResNet20, SSKT under our setting (P+I, CE) also outperformed MAXL.
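
For readers unfamiliar with the two ingredients named above, the following sketch writes out the standard textbook forms of a cosine annealing learning-rate schedule and the focal loss for a single example. The function names and the defaults are assumptions (lr_max=0.1 and 200 steps echo the 2D training command; gamma=2.0 is a common choice); this README does not state the values used in the MAXL comparison.

```python
import math


def cosine_annealing_lr(step, total_steps, lr_max=0.1, lr_min=0.0):
    """Learning rate that decays from lr_max to lr_min along half a cosine wave."""
    cosine = math.cos(math.pi * step / total_steps)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + cosine)


def focal_loss(prob_true_class, gamma=2.0):
    """Focal loss for one example, given the predicted probability of its true class.

    With gamma = 0 this reduces to ordinary cross-entropy.
    """
    return -((1.0 - prob_true_class) ** gamma) * math.log(prob_true_class)


if __name__ == "__main__":
    # Learning rate at the start, middle, and end of a 200-epoch run.
    for epoch in (0, 100, 200):
        print(epoch, cosine_annealing_lr(epoch, 200))
    # A confident prediction is down-weighted much more than an uncertain one.
    print(focal_loss(0.9), focal_loss(0.3))
```

The (1 - p)^gamma factor shrinks the loss on examples the model already classifies confidently, so training focuses on harder examples.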

T_t	Model	MAXL (ψ[i])	SSKT (T_s, Loss)	T_s Model
CIFAR10	VGG16	93.49±0.05 (5)	94.1±0.1 (I, F)	VGG16
CIFAR10	VGG16	-	94.22±0.02 (I, CE)	VGG16
CIFAR10	ResNet20	91.56±0.16 (10)	91.48±0.03 (I, F)	VGG16
CIFAR10	ResNet20	-	92.46±0.15 (P+I, CE)	ResNet50, ResNet50

Citation

If you use SSKT in your research, please consider citing:

@InProceedings{SSKD_2022_WACV,
author = {Seungbum Hong and Jihun Yoon and Min-Kook Choi},
title = {Self-Supervised Knowledge Transfer via Loosely Supervised Auxiliary Tasks},
booktitle = {IEEE Winter Conference on Applications of Computer Vision (WACV)},
month = {January},
year = {2022}
}

References

[1] Hinton et al. - "Distilling the knowledge in a neural network" (NIPSW, 2014)
[2] Hara et al. - "Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet?" (CVPR, 2018)
[3] Zhang et al. - "Deep mutual learning" (CVPR, 2018)
[4] Liu et al. - "Self-Supervised Generalisation with Meta Auxiliary Learning" (NeurIPS, 2019)
