Official code: Self-Supervised Learning by Estimating Twin Class Distributions

TWIST: Self-Supervised Learning by Estimating Twin Class Distributions

[Figure: TWIST architecture]

Code and pretrained models for TWIST, described in the following paper:

@article{wang2021self,
  title={Self-Supervised Learning by Estimating Twin Class Distributions},
  author={Wang, Feng and Kong, Tao and Zhang, Rufeng and Liu, Huaping and Li, Hang},
  journal={arXiv preprint arXiv:2110.07402},
  year={2021}
}

TWIST is a self-supervised representation learning method that classifies large-scale unlabeled datasets in an end-to-end way. We employ a siamese network terminated by a softmax operation to produce twin class distributions for two augmented views of an image. Without supervision, we enforce the class distributions of the different augmentations to be consistent. At the same time, we regularize the class distributions to make them sharp and diverse. As a result, TWIST naturally avoids trivial solutions without specific designs such as an asymmetric network, a stop-gradient operation, or a momentum encoder.
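The three ingredients described above (consistency between the twin distributions, sharpness of each distribution, diversity across the batch) can be illustrated with a small pure-Python sketch. It is not the code in train.py: it assumes KL divergence for consistency, per-sample entropy for sharpness, and the entropy of the batch-mean distribution for diversity. The weights `w_sharp` and `w_diverse` are hypothetical names, and their relation to the `--lam1` and `--lam2` flags is not stated here.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(p, eps=1e-12):
    """Shannon entropy (in nats) of one class distribution."""
    return -sum(pi * math.log(pi + eps) for pi in p)


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two class distributions."""
    return sum(pi * (math.log(pi + eps) - math.log(qi + eps)) for pi, qi in zip(p, q))


def one_direction(probs_a, probs_b, w_sharp=1.0, w_diverse=1.0):
    """Consistency + sharpness - diversity, taking view A as the reference."""
    batch = len(probs_a)
    # Consistency: the two views of each sample should agree.
    consistency = sum(kl_divergence(pa, pb) for pa, pb in zip(probs_a, probs_b)) / batch
    # Sharpness: each sample's distribution should have low entropy.
    sharpness = sum(entropy(pa) for pa in probs_a) / batch
    # Diversity: the batch-mean distribution should have high entropy.
    mean_dist = [sum(col) / batch for col in zip(*probs_a)]
    diversity = entropy(mean_dist)
    return consistency + w_sharp * sharpness - w_diverse * diversity


def twist_style_loss(logits_a, logits_b):
    """Symmetrized loss over two batches of logits (one row per sample)."""
    probs_a = [softmax(row) for row in logits_a]
    probs_b = [softmax(row) for row in logits_b]
    return 0.5 * (one_direction(probs_a, probs_b) + one_direction(probs_b, probs_a))


# Two samples, two classes: consistent, sharp, and spread over both classes.
good = twist_style_loss([[20.0, -20.0], [-20.0, 20.0]], [[20.0, -20.0], [-20.0, 20.0]])
# Collapsed solution: every sample is assigned to class 0.
collapsed = twist_style_loss([[20.0, -20.0], [20.0, -20.0]], [[20.0, -20.0], [20.0, -20.0]])
print(good, collapsed)
```

Under these assumptions the collapsed assignment scores worse than the diverse one, because the diversity term rewards using many classes; this is the sense in which such an objective can rule out the trivial solution without extra architectural tricks.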

[Figure: TWIST loss formula]

Models and Results

Main Models for Representation Learning

Model with multi-crop and self-labeling

| arch | params | epochs | linear | download |
| --- | --- | --- | --- | --- |
| ResNet-50 | 24M | 850 | 75.5% | backbone only, full ckpt, args, log, eval logs |
| ResNet-50w2 | 94M | 250 | 77.7% | backbone only, full ckpt, args, log, eval logs |
| DeiT-S | 21M | 300 | 75.6% | backbone only, full ckpt, args, log, eval logs |
| ViT-B | 86M | 300 | 77.3% | backbone only, full ckpt, args, log, eval logs |

Model without multi-crop and self-labeling

| arch | params | epochs | linear | download |
| --- | --- | --- | --- | --- |
| ResNet-50 | 24M | 800 | 72.6% | backbone only, full ckpt, args, log, eval logs |

Model for unsupervised classification

| arch | params | epochs | NMI | AMI | ARI | ACC | download |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ResNet-50 | 24M | 800 | 74.4 | 57.7 | 30.1 | 40.5 | backbone only, full ckpt, args, log |

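For readers unfamiliar with the clustering metrics in this table, here is a minimal pure-Python sketch of normalized mutual information (NMI) between ground-truth labels and predicted cluster ids. It assumes normalization by the arithmetic mean of the two entropies, which is one common convention; the evaluation code behind the reported numbers may use a different one. The table reports values scaled by 100.

```python
import math
from collections import Counter


def label_entropy(labels):
    """Entropy (in nats) of the empirical label distribution."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(labels_true, labels_pred):
    """Mutual information between two labelings of the same samples."""
    n = len(labels_true)
    joint = Counter(zip(labels_true, labels_pred))
    count_true = Counter(labels_true)
    count_pred = Counter(labels_pred)
    mi = 0.0
    for (t, p), c in joint.items():
        mi += (c / n) * math.log(c * n / (count_true[t] * count_pred[p]))
    return mi


def nmi(labels_true, labels_pred):
    """NMI with arithmetic-mean normalization (an assumed convention)."""
    denom = 0.5 * (label_entropy(labels_true) + label_entropy(labels_pred))
    if denom == 0.0:
        # Both labelings put every sample in a single group.
        return 1.0
    return mutual_information(labels_true, labels_pred) / denom


# Cluster ids are arbitrary, so a relabeled perfect clustering still scores 1.
print(nmi([0, 0, 1, 1], [1, 1, 0, 0]))
# A clustering that is independent of the labels scores 0.
print(nmi([0, 0, 1, 1], [0, 1, 0, 1]))
```
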
Top-3 predictions for unsupervised classification

[Figure: top-3 predictions]

Semi-Supervised Results

| arch | 1% labels | 10% labels | 100% labels |
| --- | --- | --- | --- |
| ResNet-50 | 61.5% | 71.7% | 78.4% |
| ResNet-50w2 | 67.2% | 75.3% | 80.3% |

Detection Results

| Task | AP all | AP 50 | AP 75 |
| --- | --- | --- | --- |
| VOC07+12 detection | 58.1 | 84.2 | 65.4 |
| COCO detection | 41.9 | 62.6 | 45.7 |
| COCO instance segmentation | 37.9 | 59.7 | 40.6 |

Single-node Training

ResNet-50 (requires 8 GPUs, Top-1 Linear 72.6%)

python3 -m torch.distributed.launch --nproc_per_node=8 --use_env train.py \
  --data-path ${DATAPATH} \
  --output_dir ${OUTPUT} \
  --aug barlow \
  --batch-size 256 \
  --dim 32768 \
  --epochs 800 

Multi-node Training

ResNet-50 (requires 16 GPUs split across 2 nodes for multi-crop training, Top-1 Linear 75.5%)

python3 -m torch.distributed.launch --nproc_per_node=8 --use_env \
  --nnodes=${WORKER_NUM} \
  --node_rank=${MACHINE_ID} \
  --master_addr=${HOST} \
  --master_port=${PORT} train.py \
  --data-path ${DATAPATH} \
  --output_dir ${OUTPUT}
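The multi-node commands in this section share the same launcher prefix, and the same command is run on every node with only `MACHINE_ID` changing. As a convenience, the sketch below assembles that command in Python from the same variable names used here. The helper `build_launch_command` and all values in `EXAMPLE_ENV` are hypothetical placeholders, not part of this repository.

```python
import shlex


def build_launch_command(env, extra_args=()):
    """Build the argv list for one node from WORKER_NUM, MACHINE_ID, HOST, PORT."""
    return [
        "python3", "-m", "torch.distributed.launch",
        "--nproc_per_node=8", "--use_env",
        f"--nnodes={env['WORKER_NUM']}",
        f"--node_rank={env['MACHINE_ID']}",
        f"--master_addr={env['HOST']}",
        f"--master_port={env['PORT']}",
        "train.py",
        "--data-path", env["DATAPATH"],
        "--output_dir", env["OUTPUT"],
        *extra_args,
    ]


# Placeholder values for illustration only.
EXAMPLE_ENV = {
    "WORKER_NUM": "2",
    "MACHINE_ID": "0",
    "HOST": "10.0.0.1",
    "PORT": "29500",
    "DATAPATH": "/data/imagenet",
    "OUTPUT": "/tmp/twist_out",
}

for rank in range(int(EXAMPLE_ENV["WORKER_NUM"])):
    node_env = dict(EXAMPLE_ENV, MACHINE_ID=str(rank))
    # shlex.join quotes arguments so the line can be pasted into a shell.
    print(shlex.join(build_launch_command(node_env)))
```

Model-specific flags such as `--backbone` or `--epochs` can be passed through `extra_args`.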

ResNet-50w2 (requires 32 GPUs split across 4 nodes for multi-crop training, Top-1 Linear 77.7%)

python3 -m torch.distributed.launch --nproc_per_node=8 --use_env \
  --nnodes=${WORKER_NUM} \
  --node_rank=${MACHINE_ID} \
  --master_addr=${HOST} \
  --master_port=${PORT} train.py \
  --data-path ${DATAPATH} \
  --output_dir ${OUTPUT} \
  --backbone 'resnet50w2' \
  --batch-size 60 \
  --bunch-size 240 \
  --epochs 250 \
  --mme_epochs 200 

DeiT-S (requires 16 GPUs split across 2 nodes for multi-crop training, Top-1 Linear 75.6%)

python3 -m torch.distributed.launch --nproc_per_node=8 --use_env \
  --nnodes=${WORKER_NUM} \
  --node_rank=${MACHINE_ID} \
  --master_addr=${HOST} \
  --master_port=${PORT} train.py \
  --data-path ${DATAPATH} \
  --output_dir ${OUTPUT} \
  --backbone 'vit_s' \
  --batch-size 128 \
  --bunch-size 256 \
  --clip_norm 3.0 \
  --epochs 300 \
  --mme_epochs 300 \
  --lam1 -0.6 \
  --lam2 1.0 \
  --local_crops_number 6 \
  --lr 0.0005 \
  --momentum_start 0.996 \
  --momentum_end 1.0 \
  --optim admw \
  --use_momentum_encoder 1 \
  --weight_decay 0.06 \
  --weight_decay_end 0.06 

ViT-B (requires 32 GPUs split across 4 nodes for multi-crop training, Top-1 Linear 77.3%)

python3 -m torch.distributed.launch --nproc_per_node=8 --use_env \
  --nnodes=${WORKER_NUM} \
  --node_rank=${MACHINE_ID} \
  --master_addr=${HOST} \
  --master_port=${PORT} train.py \
  --data-path ${DATAPATH} \
  --output_dir ${OUTPUT} \
  --backbone 'vit_b' \
  --batch-size 64 \
  --bunch-size 256 \
  --clip_norm 3.0 \
  --epochs 300 \
  --mme_epochs 300 \
  --lam1 -0.6 \
  --lam2 1.0 \
  --local_crops_number 6 \
  --lr 0.00075 \
  --momentum_start 0.996 \
  --momentum_end 1.0 \
  --optim admw \
  --use_momentum_encoder 1 \
  --weight_decay 0.06 \
  --weight_decay_end 0.06 
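The DeiT-S and ViT-B commands set `--use_momentum_encoder 1` together with `--momentum_start 0.996` and `--momentum_end 1.0`. How the momentum moves between those two values is not stated here; the sketch below assumes a cosine ramp and a standard exponential-moving-average update, which is a common choice for momentum encoders. The function names are hypothetical and the parameters are plain Python lists rather than tensors.

```python
import math


def momentum_at(step, total_steps, start=0.996, end=1.0):
    """Cosine ramp from `start` at step 0 to `end` at the last step (assumed schedule)."""
    progress = step / max(1, total_steps - 1)
    return end - (end - start) * 0.5 * (1.0 + math.cos(math.pi * progress))


def ema_update(target_params, online_params, m):
    """Move each target parameter toward the online one: t <- m * t + (1 - m) * o."""
    return [m * t + (1.0 - m) * o for t, o in zip(target_params, online_params)]


total = 1000
target = [0.0, 0.0]
online = [1.0, -1.0]
for step in range(total):
    target = ema_update(target, online, momentum_at(step, total))

print(momentum_at(0, total), momentum_at(total - 1, total))
print(target)
```

With a momentum that ends at 1.0, the target network changes more and more slowly and is effectively frozen by the end of training.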

Linear Classification

For ResNet-50

python3 evaluate.py \
  ${DATAPATH} \
  ${OUTPUT}/checkpoint.pth \
  --weight-decay 0 \
  --checkpoint-dir ${OUTPUT}/linear_multihead/ \
  --batch-size 1024 \
  --val_epoch 1 \
  --lr-classifier 0.2

For DeiT-S

python3 -m torch.distributed.launch --nproc_per_node=8 evaluate_vitlinear.py \
  --arch vit_s \
  --pretrained_weights ${OUTPUT}/checkpoint.pth \
  --lr 0.02 \
  --data_path ${DATAPATH} \
  --output_dir ${OUTPUT}

For ViT-B

python3 -m torch.distributed.launch --nproc_per_node=8 evaluate_vitlinear.py \
  --arch vit_b \
  --pretrained_weights ${OUTPUT}/checkpoint.pth \
  --lr 0.0015 \
  --data_path ${DATAPATH} \
  --output_dir ${OUTPUT}

Semi-supervised Learning

Commands for training semi-supervised classification

1% labels (61.5%)

python3 evaluate.py ${DATAPATH} ${MODELPATH} \
  --weights finetune \
  --lr-backbone 0.04 \
  --lr-classifier 0.2 \
  --train-percent 1 \
  --weight-decay 0 \
  --epochs 20 \
  --backbone 'resnet50'

10% labels (71.7%)

python3 evaluate.py ${DATAPATH} ${MODELPATH} \
  --weights finetune \
  --lr-backbone 0.02 \
  --lr-classifier 0.2 \
  --train-percent 10 \
  --weight-decay 0 \
  --epochs 20 \
  --backbone 'resnet50'

100% labels (78.4%)

python3 evaluate.py ${DATAPATH} ${MODELPATH} \
  --weights finetune \
  --lr-backbone 0.01 \
  --lr-classifier 0.2 \
  --train-percent 100 \
  --weight-decay 0 \
  --epochs 30 \
  --backbone 'resnet50'

Detection

Instructions

  1. Install detectron2.

  2. Convert the pre-trained model to detectron2's format:

    python3 detection/convert-pretrain-to-detectron2.py ${MODELPATH} ${OUTPUTPKLPATH}
    
  3. Put the dataset under the "detection/datasets" directory, following the directory structure required by detectron2.

  4. Training: VOC

    cd detection/
    python3 train_net.py \
      --config-file voc_fpn_1fc/pascal_voc_R_50_FPN_24k_infomin.yaml \
      --num-gpus 8 \
      MODEL.WEIGHTS ../${OUTPUTPKLPATH}
    

    COCO

    python3 train_net.py \
      --config-file infomin_configs/R_50_FPN_1x_infomin.yaml \
      --num-gpus 8 \
      MODEL.WEIGHTS ../${OUTPUTPKLPATH}
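
The conversion in step 2 is not described in detail here. As a rough illustration, converting a torchvision-style ResNet-50 checkpoint for detectron2 usually comes down to renaming parameter keys. The sketch below follows the renaming convention used by publicly available conversion scripts of this kind; whether detection/convert-pretrain-to-detectron2.py does exactly this is an assumption, and the `prefix` argument and the example prefix are hypothetical.

```python
def to_detectron2_key(key, prefix=""):
    """Rename one torchvision-style ResNet key to detectron2's naming (assumed mapping)."""
    if prefix and key.startswith(prefix):
        key = key[len(prefix):]
    # Layers before the first residual stage belong to the stem.
    if "layer" not in key:
        key = "stem." + key
    # torchvision layer1..layer4 correspond to res2..res5.
    for t in (1, 2, 3, 4):
        key = key.replace(f"layer{t}", f"res{t + 1}")
    # Batch norm layers are stored as the norm of the preceding conv.
    for t in (1, 2, 3):
        key = key.replace(f"bn{t}", f"conv{t}.norm")
    # The downsample branch is called the shortcut.
    key = key.replace("downsample.0", "shortcut")
    key = key.replace("downsample.1", "shortcut.norm")
    return key


examples = [
    "conv1.weight",
    "bn1.weight",
    "layer1.0.bn1.weight",
    "layer2.0.downsample.0.weight",
    "layer2.0.downsample.1.bias",
]
for name in examples:
    # "module.backbone." is a made-up prefix used only for this demonstration.
    print(name, "->", to_detectron2_key("module.backbone." + name, prefix="module.backbone."))
```

A full conversion would apply this renaming to every backbone tensor in the checkpoint and save the result in the pickle format that detectron2 loads through MODEL.WEIGHTS.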
    
Owner: Bytedance Inc.