Official Implementation of CVPR 2022 paper: "Mimicking the Oracle: An Initial Phase Decorrelation Approach for Class Incremental Learning"

Overview

(CVPR 2022) Mimicking the Oracle: An Initial Phase Decorrelation Approach for Class Incremental Learning ArXiv

This repo contains Official Implementation of our CVPR 2022 paper: Mimicking the Oracle: An Initial Phase Decorrelation Approach for Class Incremental Learning.

1. Abstract

Class Incremental Learning (CIL) aims at learning a classifier in a phase-by-phase manner, in which only data of a subset of the classes are provided at each phase. Previous works mainly focus on mitigating forgetting in phases after the initial one. However, we find that improving CIL at its initial phase is also a promising direction. Specifically, we experimentally show that directly encouraging CIL Learner at the initial phase to output similar representations as the model jointly trained on all classes can greatly boost the CIL performance. Motivated by this, we study the difference between a na"ively-trained initial-phase model and the oracle model. Specifically, since one major difference between these two models is the number of training classes, we investigate how such difference affects the model representations. We find that, with fewer training classes, the data representations of each class lie in a long and narrow region; with more training classes, the representations of each class scatter more uniformly. Inspired by this observation, we propose Class-wise Decorrelation (CwD) that effectively regularizes representations of each class to scatter more uniformly, thus mimicking the model jointly trained with all classes (i.e., the oracle model). Our CwD is simple to implement and easy to plug into existing methods. Extensive experiments on various benchmark datasets show that CwD consistently and significantly improves the performance of existing state-of-the-art methods by around 1% to 3%.


2. Instructions to Run Our Code

Current codebase only contain experiments on LUCIR with CIFAR100 and ImageNet100. Code reproducing results based on PODNet and AANet are based on their repo and will be coming soon!


CIFAR100 Experiments w/ LUCIR

No need to download the datasets, everything will be dealt with automatically.

For LUCIR baseline, simply first navigate under "src" folder and run:

bash exp_cifar_lucir.sh

For LUCIR + CwD, first navigate under "src" folder and run:

bash exp_cifar_lucir_cwd.sh

ImageNet100 Experiments w/ LUCIR

To run ImageNet100, please follow the following two steps:

Step 1:

download and extract imagenet dataset under "src/data/imagenet" folder.

Then, under "src/data/imagenet", run:

python3 gen_lst.py

This command will generate two list that determine the order of classes for class incremental learning. The class order is shuffled by seed 1993 like most previous works.


Step 2:

For LUCIR baseline, first navigate under "src" folder and run:

bash exp_im100_lucir.sh

For LUCIR+CWD, first navigate under "src" folder and run:

bash exp_im100_lucir_cwd.sh

Some Comments on Running Scripts.

For "SEED" variable in the scripts, it is not the seed that used to shuffle the class order, it is the seed that determines model initialisation/data loader sampling, etc. We vary "SEED" from 0,1,2 and average the Average Incremental Accuracy to obtain results reported in the paper.


3. For customized usage

To use our CwD loss in your own project, simply copy and paste the CwD loss implemented in "src/approach/aux_loss.py" will be fine.


4. Citation

If you find our repo/paper helpful, please consider citing our work :)

@article{shi2021mimicking,
  title={Mimicking the Oracle: An Initial Phase Decorrelation Approach for Class Incremental Learning},
  author={Shi, Yujun and Zhou, Kuangqi and Liang, Jian and Jiang, Zihang and Feng, Jiashi and Torr, Philip and Bai, Song and Tan, Vincent YF},
  journal={arXiv preprint arXiv:2112.04731},
  year={2021}
}

5. Contact

Yujun Shi ([email protected])

6. Acknowledgements

Our code is based on FACIL, one of the most well-written CIL library in my opinion:)

7. Some Additional Remarks

Based on the original implementation of FACIL, I also implemented Distributed Data Parallel to enable multi-GPU training. However, it seems that the performance is not as good as single card training (about 0.5% lower). Therefore, in all experiments, I still use single card training.

Owner
Yujun Shi
PhD @ NUS
Yujun Shi
This repository is an official implementation of the paper MOTR: End-to-End Multiple-Object Tracking with TRansformer.

MOTR: End-to-End Multiple-Object Tracking with TRansformer This repository is an official implementation of the paper MOTR: End-to-End Multiple-Object

348 Jan 07, 2023
Efficient Conformer: Progressive Downsampling and Grouped Attention for Automatic Speech Recognition

Efficient Conformer: Progressive Downsampling and Grouped Attention for Automatic Speech Recognition Official implementation of the Efficient Conforme

Maxime Burchi 145 Dec 30, 2022
Hyperparameter tuning for humans

KerasTuner KerasTuner is an easy-to-use, scalable hyperparameter optimization framework that solves the pain points of hyperparameter search. Easily c

Keras 2.6k Dec 27, 2022
A set of Deep Reinforcement Learning Agents implemented in Tensorflow.

Deep Reinforcement Learning Agents This repository contains a collection of reinforcement learning algorithms written in Tensorflow. The ipython noteb

Arthur Juliani 2.2k Jan 01, 2023
GPU Programming with Julia - course at the Swiss National Supercomputing Centre (CSCS), ETH Zurich

Course Description The programming language Julia is being more and more adopted in High Performance Computing (HPC) due to its unique way to combine

Samuel Omlin 192 Jan 03, 2023
PyTorch implementation of our ICCV2021 paper: StructDepth: Leveraging the structural regularities for self-supervised indoor depth estimation

StructDepth PyTorch implementation of our ICCV2021 paper: StructDepth: Leveraging the structural regularities for self-supervised indoor depth estimat

SJTU-ViSYS 112 Nov 28, 2022
Official PyTorch Implementation of paper EAN: Event Adaptive Network for Efficient Action Recognition

Official PyTorch Implementation of paper EAN: Event Adaptive Network for Efficient Action Recognition

TianYuan 27 Nov 07, 2022
Pytorch Lightning Implementation of SC-Depth Methods.

SC_Depth_pl: This is a pytorch lightning implementation of SC-Depth (V1, V2) for self-supervised learning of monocular depth from video. In the V1 (IJ

JiaWang Bian 216 Dec 30, 2022
Syllabus del curso IIC2115 - Programación como Herramienta para la Ingeniería 2022/I

IIC2115 - Programación como Herramienta para la Ingeniería Videos y tutoriales Tutorial CMD Tutorial Instalación Python y Jupyter Tutorial de git-GitH

21 Nov 09, 2022
More than a hundred strange attractors

dysts Analyze more than a hundred chaotic systems. Basic Usage Import a model and run a simulation with default initial conditions and parameter value

William Gilpin 185 Dec 23, 2022
An LSTM for time-series classification

Update 10-April-2017 And now it works with Python3 and Tensorflow 1.1.0 Update 02-Jan-2017 I updated this repo. Now it works with Tensorflow 0.12. In

Rob Romijnders 391 Dec 27, 2022
TalkingHead-1KH is a talking-head dataset consisting of YouTube videos

TalkingHead-1KH Dataset TalkingHead-1KH is a talking-head dataset consisting of YouTube videos, originally created as a benchmark for face-vid2vid: On

173 Dec 29, 2022
This framework implements the data poisoning method found in the paper Adversarial Examples Make Strong Poisons

Adversarial poison generation and evaluation. This framework implements the data poisoning method found in the paper Adversarial Examples Make Strong

31 Nov 01, 2022
CountDown to New Year and shoot fireworks

CountDown and Shoot Fireworks About App This is an small application make you re

5 Dec 31, 2022
A Structured Self-attentive Sentence Embedding

Structured Self-attentive sentence embeddings Implementation for the paper A Structured Self-Attentive Sentence Embedding, which was published in ICLR

Kaushal Shetty 488 Nov 28, 2022
A PyTorch implementation of "SelfGNN: Self-supervised Graph Neural Networks without explicit negative sampling"

SelfGNN A PyTorch implementation of "SelfGNN: Self-supervised Graph Neural Networks without explicit negative sampling" paper, which will appear in Th

Zekarias Tilahun 24 Jun 21, 2022
MixText: Linguistically-Informed Interpolation of Hidden Space for Semi-Supervised Text Classification

MixText This repo contains codes for the following paper: Jiaao Chen, Zichao Yang, Diyi Yang: MixText: Linguistically-Informed Interpolation of Hidden

GT-SALT 309 Dec 12, 2022
Official codebase for Pretrained Transformers as Universal Computation Engines.

universal-computation Overview Official codebase for Pretrained Transformers as Universal Computation Engines. Contains demo notebook and scripts to r

Kevin Lu 210 Dec 28, 2022
Code for the CVPR 2021 paper: Understanding Failures of Deep Networks via Robust Feature Extraction

Welcome to Barlow Barlow is a tool for identifying the failure modes for a given neural network. To achieve this, Barlow first creates a group of imag

Sahil Singla 33 Dec 05, 2022
[ICLR'19] Trellis Networks for Sequence Modeling

TrellisNet for Sequence Modeling This repository contains the experiments done in paper Trellis Networks for Sequence Modeling by Shaojie Bai, J. Zico

CMU Locus Lab 460 Oct 13, 2022