Accepted at the IEEE Winter Conference on Applications of Computer Vision (WACV) 2022

Overview

SSKT (Accepted at WACV 2022)

Concept map

[Figure: concept map]

Dataset

  • Image dataset
    • CIFAR10 (torchvision)
    • CIFAR100 (torchvision)
    • STL10 (torchvision)
    • Pascal VOC (torchvision)
    • ImageNet(I) (torchvision)
    • Places365(P)
  • Video dataset

Pre-trained models

  • ImageNet
    • We used the pre-trained models provided by torchvision.
    • Architectures: ResNet18 and ResNet50
  • Places365

Option

  • isSource
    • Single Source Transfer Module
    • No transfer module (X); only the auxiliary layer is used
  • transfer_module
    • Single Source Transfer Module
  • multi_source
    • multiple task transfer learning
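
The three options above appear as bare flags in the training commands, which suggests boolean switches. Below is a minimal sketch of how such flags might be declared with `argparse`, assuming `store_true` semantics. The `build_parser` and `describe_mode` helpers, the help strings, and the way the flag combinations are interpreted are hypothetical and are not taken from the repository's `main.py`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser; the real main.py may define these flags differently."""
    parser = argparse.ArgumentParser(description="SSKT option sketch")
    # Assumption: each option is a boolean switch that defaults to off.
    parser.add_argument("--isSource", action="store_true",
                        help="use a source network")
    parser.add_argument("--transfer_module", action="store_true",
                        help="single source transfer module")
    parser.add_argument("--multi_source", action="store_true",
                        help="multiple task transfer learning")
    return parser


def describe_mode(args: argparse.Namespace) -> str:
    """Summarise the flag combination in words (one possible reading only)."""
    if not args.isSource:
        return "no source network"
    module = "transfer module" if args.transfer_module else "auxiliary layer only"
    sources = "multiple sources" if args.multi_source else "single source"
    return f"{sources}, {module}"


if __name__ == "__main__":
    args = build_parser().parse_args(["--isSource", "--transfer_module"])
    print(describe_mode(args))  # single source, transfer module
```

Using switches that default to off keeps the plain `python main.py ...` invocation equivalent to training without any source network, which is the behaviour this sketch assumes.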

Training

  • 2D PreLeKT
    python main.py --model resnet20 --source_arch resnet50 --sourceKind places365 \
      --result /raid/video_data/output/PreLeKT --dataset stl10 \
      --lr 0.1 --wd 5e-4 --epochs 200 \
      --classifier_loss_method ce --auxiliary_loss_method kd \
      --isSource --multi_source --transfer_module
  • 3D PreLeKT
    python main.py --root_path /raid/video_data/ucf101/ --video_path frames \
      --annotation_path ucf101_01.json --result_path /raid/video_data/output/PreLeKT \
      --n_classes 400 --n_finetune_classes 101 \
      --model resnet --model_depth 18 --resnet_shortcut A \
      --batch_size 128 --n_threads 4 \
      --pretrain_path /nvadmin/Pretrained_model/resnet-18-kinetics.pth \
      --ft_begin_index 4 --dataset ucf101 \
      --isSource --transfer_module --multi_source
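
The 2D command above selects `ce` for the classifier loss and `kd` for the auxiliary loss. Below is a minimal pure-Python sketch of what those two terms might compute for a single sample, assuming `kd` refers to the temperature-softened distillation loss of [1] and that the auxiliary head is matched against the source network's outputs. The function names, the temperature `T`, and the weight `alpha` are hypothetical and are not taken from the repository.

```python
import math


def softmax(logits, T=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    m = max(logits)
    exps = [math.exp((z - m) / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Cross-entropy between the softmax of `logits` and a hard label index."""
    return -math.log(softmax(logits)[label])


def kd_loss(student_logits, teacher_logits, T=4.0):
    """KL(teacher || student) on softened distributions, scaled by T**2."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * T * T


def total_loss(target_logits, label, aux_logits, source_logits, alpha=1.0):
    """Assumed objective: CE on the target head plus weighted KD on the auxiliary head."""
    ce = cross_entropy(target_logits, label)
    kd = kd_loss(aux_logits, source_logits)
    return ce + alpha * kd


if __name__ == "__main__":
    # Made-up logits for a 3-class target task and a 2-class source task.
    print(total_loss([2.0, 0.5, -1.0], 0, [1.0, 0.2], [1.5, -0.3]))
```

The KD term vanishes when the auxiliary head reproduces the source network's logits exactly, so in this sketch it acts purely as a penalty for deviating from the source predictions.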

Experiment

Comparison with other knowledge transfer methods.

  • For further analysis of SSKT, we compared its performance with that of typical knowledge transfer methods, namely KD [1] and DML [3].
  • For KD, the training details were set as in [1]; for DML, training was performed as in [3].
  • For 3D-CNN-based action classification [2], results for both training from scratch and fine-tuning are included.

| Tt | Model | KD | DML | SSKT (Ts) |
|---|---|---|---|---|
| CIFAR10 | ResNet20 | 91.75±0.24 | 92.37±0.15 | 92.46±0.15 (P+I) |
| CIFAR10 | ResNet32 | 92.61±0.31 | 93.26±0.21 | 93.38±0.02 (P+I) |
| CIFAR100 | ResNet20 | 68.66±0.24 | 69.48±0.05 | 68.63±0.12 (I) |
| CIFAR100 | ResNet32 | 70.5±0.05 | 71.9±0.03 | 70.94±0.36 (P+I) |
| STL10 | ResNet20 | 77.67±1.41 | 78.23±1.23 | 84.56±0.35 (P+I) |
| STL10 | ResNet32 | 76.07±0.67 | 77.14±1.64 | 83.68±0.28 (I) |
| VOC | ResNet18 | 64.11±0.18 | 39.89±0.07 | 76.42±0.06 (P+I) |
| VOC | ResNet34 | 64.57±0.12 | 39.97±0.16 | 77.02±0.02 (P+I) |
| VOC | ResNet50 | 62.39±0.6 | 39.65±0.03 | 77.1±0.14 (P+I) |
| UCF101 | 3D ResNet18 (scratch) | - | 13.8 | 52.19 (P+I) |
| UCF101 | 3D ResNet18 (fine-tuning) | - | 83.95 | 84.58 (P) |
| HMDB51 | 3D ResNet18 (scratch) | - | 3.01 | 17.91 (P+I) |
| HMDB51 | 3D ResNet18 (fine-tuning) | - | 56.44 | 57.82 (P) |
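
Entries such as 91.75±0.24 read as a mean and a spread over repeated runs. Below is a small sketch of how such an entry could be produced, assuming the spread is the sample standard deviation (the source does not say which estimator or how many runs were used). The `summarize` helper is hypothetical, and the accuracy values are made up purely for illustration.

```python
import statistics


def summarize(accuracies, digits=2):
    """Format run accuracies as 'mean±std' using the sample standard deviation."""
    mean = statistics.mean(accuracies)
    # A single run has no spread to estimate, so report 0 in that case.
    std = statistics.stdev(accuracies) if len(accuracies) > 1 else 0.0
    return f"{mean:.{digits}f}±{std:.{digits}f}"


# Hypothetical accuracies from three runs (not results from the paper).
runs = [90.0, 91.0, 92.0]
print(summarize(runs))  # 91.00±1.00
```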

Performance comparison with MAXL [4], another auxiliary learning-based transfer learning method

  • The learning schedules of MAXL and of our experiment differ in whether the cosine annealing scheduler and focal loss are used.
  • With VGG16, SSKT outperformed MAXL in all settings. With ResNet20, SSKT under our settings also outperformed MAXL.

| Tt | Model | MAXL (ψ[i]) | SSKT (Ts, Loss) | Ts Model |
|---|---|---|---|---|
| CIFAR10 | VGG16 | 93.49±0.05 (5) | 94.1±0.1 (I, F) | VGG16 |
| CIFAR10 | VGG16 | - | 94.22±0.02 (I, CE) | VGG16 |
| CIFAR10 | ResNet20 | 91.56±0.16 (10) | 91.48±0.03 (I, F) | VGG16 |
| CIFAR10 | ResNet20 | - | 92.46±0.15 (P+I, CE) | ResNet50, ResNet50 |
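
The two ingredients named in the comparison above, cosine annealing and focal loss, can be sketched in a few lines. This is a generic illustration of the standard formulas, not the configuration used in either experiment. The function names and the defaults `eta_max`, `eta_min`, and `gamma` are hypothetical choices.

```python
import math


def cosine_annealing_lr(epoch, total_epochs, eta_max=0.1, eta_min=0.0):
    """Learning rate decayed from eta_max to eta_min along a half cosine."""
    cos_term = 1.0 + math.cos(math.pi * epoch / total_epochs)
    return eta_min + 0.5 * (eta_max - eta_min) * cos_term


def focal_loss(prob_true_class, gamma=2.0):
    """Focal loss for one sample given the predicted probability of its true class.

    With gamma = 0 this reduces to the ordinary cross-entropy -log(p).
    """
    p = min(max(prob_true_class, 1e-12), 1.0)  # clamp to avoid log(0)
    return -((1.0 - p) ** gamma) * math.log(p)


if __name__ == "__main__":
    # Learning rate at the start, middle, and end of a 200-epoch schedule.
    for epoch in (0, 100, 200):
        print(epoch, cosine_annealing_lr(epoch, 200))
    # A confident correct prediction is down-weighted relative to cross-entropy.
    print(focal_loss(0.9), -math.log(0.9))
```

The `(1 - p) ** gamma` factor shrinks the loss on samples the model already classifies confidently, so training concentrates on harder samples.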

Citation

If you use SSKT in your research, please consider citing:

@InProceedings{SSKD_2022_WACV,
author = {Seungbum Hong and Jihun Yoon and Min-Kook Choi},
title = {Self-Supervised Knowledge Transfer via Loosely Supervised Auxiliary Tasks},
booktitle = {The IEEE Winter Conference on Applications of Computer Vision (WACV)},
month = {January},
year = {2022}
}

References
