Implementation of momentum^2 teacher

Last update: Sep 26, 2022

Related tags

Deep Learning momentum2-teacher

Overview

Momentum^2 Teacher: Momentum Teacher with Momentum Statistics for Self-Supervised Learning

Requirements

All experiments were run with Python 3.6, torch==1.5.0, and torchvision==0.6.0.

Usage

Data Preparation

Prepare the ImageNet data in ${root_of_clone}/data/imagenet_train and ${root_of_clone}/data/imagenet_val. Because ImageNet is read from an internal storage platform in our setup, reading from local disk (local mode) has not been tested. You may need to modify momentum_teacher/data/dataset.py to support local mode.
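Before adapting momentum_teacher/data/dataset.py, it can help to confirm that the two data folders exist where the paragraph above expects them. The following is only a sanity-check sketch: check_imagenet_layout is a hypothetical helper, and it assumes (the repository does not state this) that each split folder contains one subdirectory per class.

```python
from pathlib import Path


def check_imagenet_layout(repo_root):
    """Count subdirectories in each expected ImageNet split folder.

    Returns {split_name: number_of_subdirectories}; a missing folder maps to None.
    Assumes one subdirectory per class, which the README does not confirm.
    """
    data_dir = Path(repo_root) / "data"
    report = {}
    for split in ("imagenet_train", "imagenet_val"):
        split_dir = data_dir / split
        if not split_dir.is_dir():
            report[split] = None  # folder not prepared yet
            continue
        report[split] = sum(1 for p in split_dir.iterdir() if p.is_dir())
    return report


if __name__ == "__main__":
    # "." stands for the root of your clone; adjust as needed.
    for split, n_dirs in check_imagenet_layout(".").items():
        status = "missing" if n_dirs is None else f"{n_dirs} class folders"
        print(f"data/{split}: {status}")
```

Run it from the root of your clone; a split reported as missing means the corresponding folder still has to be prepared.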

Training

Before training, make sure the repository root (namely ${root_of_clone}) is on your PYTHONPATH, e.g.

export PYTHONPATH=$PYTHONPATH:${root_of_clone}
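To verify that the export took effect, a quick check from Python may be useful. This sketch assumes the package directory at the clone root is named momentum_teacher, as the script paths in this README suggest; is_importable is a hypothetical helper name.

```python
import importlib.util
import sys


def is_importable(module_name):
    """Return True if a top-level module can be located on the current sys.path."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if is_importable("momentum_teacher"):
        print("momentum_teacher found; PYTHONPATH looks fine")
    else:
        print("momentum_teacher not found; current search path:")
        for entry in sys.path:
            print("  ", entry)
```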

To do unsupervised pre-training of a ResNet-50 model on ImageNet on an 8-GPU machine, run train.py with the following options:

-d specifies the GPU ids used for training, e.g., -d 0-7
-b specifies the batch size, e.g., -b 256
--experiment-name specifies the output folder; the training log and models are dumped to './outputs/${experiment-name}'
-f specifies the description file of your experiment

e.g.,

python3 momentum_teacher/tools/train.py -b 256 -d 0-7 --experiment-name your_exp -f momentum_teacher/exps/arxiv/exp_8_v100/momentum2_teacher_100e_exp.py
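The -d 0-7 notation above denotes GPUs 0 through 7. How train.py actually parses this option is not shown in this README; the sketch below only illustrates the notation, and parse_device_range is a hypothetical helper, not part of the repository.

```python
def parse_device_range(spec):
    """Expand a spec such as "0-7" (or a single id such as "3") into a list of GPU ids."""
    spec = spec.strip()
    if "-" in spec:
        start, end = (int(x) for x in spec.split("-", 1))
        if start > end:
            raise ValueError(f"invalid device range: {spec!r}")
        return list(range(start, end + 1))
    return [int(spec)]


if __name__ == "__main__":
    devices = parse_device_range("0-7")
    print(f"{len(devices)} GPUs: {devices}")
```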

Linear Evaluation:

With a pre-trained model, to train a supervised linear classifier on frozen features/weights on an 8-GPU machine, run eval.py with the following options:

-d specifies the GPU ids used for training, e.g., -d 0-7
-b specifies the batch size, e.g., -b 256
--experiment-name specifies the folder where the pre-trained models were saved

python3 momentum_teacher/tools/eval.py -b 256 --experiment-name your_exp -f momentum_teacher/exps/arxiv/linear_eval_exp_byol.py

Results

Results of Pretraining on a Single Machine

After pre-training on 8 NVIDIA V100 GPUs with a batch size of 1024, the linear-evaluation results are:

pre-train code	pre-train epochs	pre-train time	accuracy	weights
path	100	~1.8 days	70.7	-
path	200	~3.6 days	72.7	-
path	300	~5.5 days	73.8	-

After pre-training on 8 NVIDIA 2080 GPUs with a batch size of 256, the linear-evaluation results are:

pre-train code	pre-train epochs	pre-train time	accuracy	weights
path	100	~2.5 days	70.4	-
path	200	~5 days	72.3	-
path	300	~7.5 days	72.9	-

Results of Pretraining on Multiple Machines

For example, to do unsupervised pre-training with a batch size of 4096 on 32 V100 GPUs, assuming 4 machines with 8 V100 GPUs each, run:

# machine 1:
export MACHINE=0; export MACHINE_TOTAL=4; python3 momentum_teacher/tools/train.py -b 4096 -f xxx
# machine 2:
export MACHINE=1; export MACHINE_TOTAL=4; python3 momentum_teacher/tools/train.py -b 4096 -f xxx
# machine 3:
export MACHINE=2; export MACHINE_TOTAL=4; python3 momentum_teacher/tools/train.py -b 4096 -f xxx
# machine 4:
export MACHINE=3; export MACHINE_TOTAL=4; python3 momentum_teacher/tools/train.py -b 4096 -f xxx
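The per-machine commands above differ only in the MACHINE index, so they can be generated rather than typed by hand. The sketch below simply reproduces the pattern shown; multi_machine_commands is a hypothetical helper, and "xxx" remains the placeholder for your experiment description file.

```python
def multi_machine_commands(machine_total, batch_size, exp_file):
    """Build one launch command per machine, following the pattern shown above."""
    commands = []
    for machine in range(machine_total):
        commands.append(
            f"export MACHINE={machine}; export MACHINE_TOTAL={machine_total}; "
            f"python3 momentum_teacher/tools/train.py -b {batch_size} -f {exp_file}"
        )
    return commands


if __name__ == "__main__":
    # "xxx" is the placeholder used above for the experiment description file.
    for i, cmd in enumerate(multi_machine_commands(4, 4096, "xxx"), start=1):
        print(f"# machine {i}:")
        print(cmd)
```

Each printed command is meant to be run on its own machine; MACHINE is zero-based while the comments count machines from 1, as in the listing above.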

Results of linear evaluation:

pre-train code	pre-train epochs	pre-train time	accuracy	weights
path	100	~11 hours	70.3	-
path	200	~22 hours	72.5	-
path	300	~33 hours	73.7	-

To do unsupervised pre-training with a batch size of 4096 on 128 2080 GPUs, follow the guide above. Results of linear evaluation:

pre-train code	pre-train epochs	pre-train time	accuracy	weights
path	100	~5 hours	69.0	-
path	200	~10 hours	71.5	-
path	300	~15 hours	72.3	-
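The reported times allow a rough estimate of how well training scales across machines. The sketch below compares the 100-epoch V100 runs (8 GPUs versus 32 GPUs) using the approximate figures from the tables; speedup is a hypothetical helper, and because the two runs also use different batch sizes (1024 versus 4096), the result should be read as indicative only.

```python
def speedup(single_hours, multi_hours, gpu_ratio):
    """Return (speedup, scaling efficiency) of a multi-machine run over a single machine."""
    s = single_hours / multi_hours
    return s, s / gpu_ratio


if __name__ == "__main__":
    # 100-epoch figures from the tables: 8 V100s take ~1.8 days, 32 V100s take ~11 hours.
    s, eff = speedup(single_hours=1.8 * 24, multi_hours=11, gpu_ratio=32 / 8)
    print(f"speedup: {s:.2f}x, scaling efficiency: {eff:.0%}")
```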

Disclaimer

This is an implementation of Momentum^2 Teacher. It is worth noting that:

The original implementation is based on our internal platform.
This released version performs slightly better than the results reported in the tech report.

Owner

jemmy li
