Self-labelling via simultaneous clustering and representation learning. (ICLR 2020)

Overview

Self-labelling via simultaneous clustering and representation learning

🆗 🆗 🎉 NEW models (20th August 2020): Added standard SeLa pretrained torchvision ResNet models to make loading much easier + added baselines using better MoCov2 augmentation (~69% LP performance) + added evaluation with K=1000 for ImageNet "unuspervised clustering"

🆕 🎉 updated code: 23rd April 2020: bug fixes + CIFAR code + evaluation for resnet & alexnet.

Checkout our blogpost for a quick non-technical overview and an interactive visualization of our clusters.

Self-Label

This code is the official implementation of the ICLR 2020 paper Self-labelling via simultaneous clustering and representation learning.

Abstract

Combining clustering and representation learning is one of the most promising approaches for unsupervised learning of deep neural networks. However, doing so naively leads to ill posed learning problems with degenerate solutions. In this paper, we propose a novel and principled learning formulation that addresses these issues. The method is obtained by maximizing the information between labels and input data indices. We show that this criterion extends standard crossentropy minimization to an optimal transport problem, which we solve efficiently for millions of input images and thousands of labels using a fast variant of the Sinkhorn-Knopp algorithm. The resulting method is able to self-label visual data so as to train highly competitive image representations without manual labels. Our method achieves state of the art representation learning performance for AlexNet and ResNet-50 on SVHN, CIFAR-10, CIFAR-100 and ImageNet.

Results at a glance

NMI(%) aNMI(%) ARI(%) LP Acc (%)
AlexNet 1k 50.5 12.2 2.7 42.1
AlexNet 10k 66.4 4.7 4.7 43.8
R50 10x3k 54.2 34.4 7.2 61.5

With better augmentations (all single crop)

Label-Acc NMI(%) aNMI(%) ARI(%) LP Acc (%) model_weights
Aug++ R18 1k (new) 26.9 62.7 36.4 12.5 53.3 here
Aug++ R50 1k (new) 30.5 65.7 42.0 16.2 63.5 here
Aug++ R50 10x3k (new) 38.1 75.7 52.8 27.6 68.8 here
(MoCo-v2 + k-means**, K=3k) 71.4 39.6 15.8 71.1
  • "Aug++" refers to the better augmentations used in SimCLR, taken from the MoCo-v2 repo, but I still only trained for 280 epochs, with three lr-drops as in CMC.
  • There are still further improvements to be made with a MLP or training 800 epochs (I train 280), as done in SimCLR, MoCov2 and SwAV.
  • **MoCo-v2 uses 800 epochs, MLP and cos-lr-schedule. On MoCo-v2 I run k-means (K=3000) on the avg-pooled features (after the MLP-head it's pretty much the same performance) to obtain NMI, aNMI and ARI numbers.
  • Models above use standard torchvision ResNet backbones so loading is now super easy:
import torch, torchvision
model = torchvision.models.resnet50(pretrained=False, num_classes=3000)
ckpt = torch.load('resnet50-10x3k_pp.pth')
model.load_state_dict(ckpt['state_dict'])
pseudolabels = ckpt['L']
  • note on improvement potential: by just using "aug+": I get LP-accuracy of 67.2% after 200 epochs. MoCo-v2 with "aug+" only has 63.4% after 200 epochs.

Clusters that were discovered by our method

Sorted

Imagenet validation images with clusters sorted by imagenet purity

Random

Imagenet validation images with random clusters

The edge-colors encode the true imagenet classes (which are not used for training). You can view all clusters here.

Requirements

  • Python >3.6
  • PyTorch > 1.0
  • CUDA
  • Numpy, SciPy
  • also, see requirements.txt
  • (optional:) TensorboardX

Running our code

Run the self-supervised training of an AlexNet with the command

$./scripts/alexnet.sh

or train a ResNet-50 with

$./scripts/resnet.sh

Note: you need to specify your dataset directory (it expects a format just like ImageNet with "train" and "val" folders). You also need to give the code enough GPUs to allow for storage of activations on the GPU. Otherwise you need to use the CPU variant which is significantly slower.

Full documentation of the unsupervised training code main.py:

usage: main.py [-h] [--epochs EPOCHS] [--batch-size BATCH_SIZE] [--lr LR]
               [--lrdrop LRDROP] [--wd WD] [--dtype {f64,f32}] [--nopts NOPTS]
               [--augs AUGS] [--paugs PAUGS] [--lamb LAMB] [--cpu]
               [--arch ARCH] [--archspec {big,small}] [--ncl NCL] [--hc HC]
               [--device DEVICE] [--modeldevice MODELDEVICE] [--exp EXP]
               [--workers WORKERS] [--imagenet-path IMAGENET_PATH]
               [--comment COMMENT] [--log-intv LOG_INTV] [--log-iter LOG_ITER]

PyTorch Implementation of Self-Label

optional arguments:
  -h, --help            show this help message and exit
  --epochs EPOCHS       number of epochs
  --batch-size BATCH_SIZE
                        batch size (default: 256)
  --lr LR               initial learning rate (default: 0.05)
  --lrdrop LRDROP       multiply LR by 0.1 every (default: 150 epochs)
  --wd WD               weight decay pow (default: (-5)
  --dtype {f64,f32}     SK-algo dtype (default: f64)
  --nopts NOPTS         number of pseudo-opts (default: 100)
  --augs AUGS           augmentation level (default: 3)
  --paugs PAUGS         for pseudoopt: augmentation level (default: 3)
  --lamb LAMB           for pseudoopt: lambda (default:25)
  --cpu                 use CPU variant (slow) (default: off)
  --arch ARCH           alexnet or resnet (default: alexnet)
  --archspec {big,small}
                        alexnet variant (default:big)
  --ncl NCL             number of clusters per head (default: 3000)
  --hc HC               number of heads (default: 1)
  --device DEVICE       GPU devices to use for storage and model
  --modeldevice MODELDEVICE
                        GPU numbers on which the CNN runs
  --exp EXP             path to experiment directory
  --workers WORKERS     number workers (default: 6)
  --imagenet-path IMAGENET_PATH
                        path to folder that contains `train` and `val`
  --comment COMMENT     name for tensorboardX
  --log-intv LOG_INTV   save stuff every x epochs (default: 1)
  --log-iter LOG_ITER   log every x-th batch (default: 200)

Evaluation

Linear Evaluation

We provide the linear evaluation methods in this repo. Simply download the models via . ./scripts/download_models.sh and then either run scripts/eval-alexnet.sh or scripts/eval-resnet.sh.

Pascal VOC

We follow the standard evaluation protocols for self-supervised visual representation learning.

Our extracted pseudolabels

As we show in the paper, the pseudolabels we generate from our training can be used to quickly train a neural network with regular cross-entropy. Moreover they seem to correctly group together similar images. Hence we provide the labels for everyone to use.

AlexNet

You can download the pseudolabels from our best (raw) AlexNet model with 10x3000 clusters here.

ResNet

You can download the pseudolabels from our best ResNet model with 10x3000 clusters here.

Trained models

You can also download our trained models by running

$./scripts/download_models.sh

Use them like this:

import torch
import models
d = torch.load('self-label_models/resnet-10x3k.pth')
m = models.resnet(num_classes = [3000]*10)
m.load_state_dict(d)

d = torch.load('self-label_models/alexnet-10x3k-wRot.pth')
m = models.alexnet(num_classes = [3000]*10)
m.load_state_dict(d)

Reference

If you use this code etc., please cite the following paper:

Yuki M. Asano, Christian Rupprecht and Andrea Vedaldi. "Self-labelling via simultaneous clustering and representation learning." Proc. ICLR (2020)

@inproceedings{asano2020self,
  title={Self-labelling via simultaneous clustering and representation learning},
  author={Asano, Yuki M. and Rupprecht, Christian and Vedaldi, Andrea},
  booktitle={International Conference on Learning Representations (ICLR)},
  year={2020},
}
Owner
Yuki M. Asano
I'm an Computer Vision researcher at the University of Amsterdam. Did my PhD at the Visual Geometry Group in Oxford.
Yuki M. Asano
Official Repo for ICCV2021 Paper: Learning to Regress Bodies from Images using Differentiable Semantic Rendering

[ICCV2021] Learning to Regress Bodies from Images using Differentiable Semantic Rendering Getting Started DSR has been implemented and tested on Ubunt

Sai Kumar Dwivedi 83 Nov 27, 2022
Generate images from texts. In Russian. In PaddlePaddle

ruDALL-E PaddlePaddle ruDALL-E in PaddlePaddle. Install: pip install rudalle_paddle==0.0.1rc1 Run with free v100 on AI Studio. Original Pytorch versi

AgentMaker 20 Oct 18, 2022
A 2D Visual Localization Framework based on Essential Matrices [ICRA2020]

A 2D Visual Localization Framework based on Essential Matrices This repository provides implementation of our paper accepted at ICRA: To Learn or Not

Qunjie Zhou 27 Nov 07, 2022
This is a five-step framework for the development of intrusion detection systems (IDS) using machine learning (ML) considering model realization, and performance evaluation.

AB-TRAP: building invisibility shields to protect network devices The AB-TRAP framework is applicable to the development of Network Intrusion Detectio

Lab-C2DC - Laboratory of Command and Control and Cyber-security 17 Jan 04, 2023
Automatically download the cwru data set, and then divide it into training data set and test data set

Automatically download the cwru data set, and then divide it into training data set and test data set.自动下载cwru数据集,然后分训练数据集和测试数据集

6 Jun 27, 2022
Code artifacts for the submission "Mind the Gap! A Study on the Transferability of Virtual vs Physical-world Testing of Autonomous Driving Systems"

Code Artifacts Code artifacts for the submission "Mind the Gap! A Study on the Transferability of Virtual vs Physical-world Testing of Autonomous Driv

Andrea Stocco 2 Aug 24, 2022
Reimplementation of NeurIPS'19: "Meta-Weight-Net: Learning an Explicit Mapping For Sample Weighting" by Shu et al.

[Re] Meta-Weight-Net: Learning an Explicit Mapping For Sample Weighting Reimplementation of NeurIPS'19: "Meta-Weight-Net: Learning an Explicit Mapping

Robert Cedergren 1 Mar 13, 2020
Chess reinforcement learning by AlphaGo Zero methods.

About Chess reinforcement learning by AlphaGo Zero methods. This project is based on these main resources: DeepMind's Oct 19th publication: Mastering

Samuel 2k Dec 29, 2022
Scripts for training an AI to play the endless runner Subway Surfers using a supervised machine learning approach by imitation and a convolutional neural network (CNN) for image classification

About subwAI subwAI - a project for training an AI to play the endless runner Subway Surfers using a supervised machine learning approach by imitation

82 Jan 01, 2023
Bib-parser - Convenient script to parse .bib files with the ACM Digital Library like metadata

Bib Parser Convenient script to parse .bib files with the ACM Digital Library li

Mehtab Iqbal (Shahan) 1 Jan 26, 2022
PyTorch implementation of the implicit Q-learning algorithm (IQL)

Implicit-Q-Learning (IQL) PyTorch implementation of the implicit Q-learning algorithm IQL (Paper) Currently only implemented for online learning. Offl

Sebastian Dittert 27 Dec 30, 2022
Acoustic mosquito detection code with Bayesian Neural Networks

HumBugDB Acoustic mosquito detection with Bayesian Neural Networks. Extract audio or features from our large-scale dataset on Zenodo. This repository

31 Nov 28, 2022
Spherical Confidence Learning for Face Recognition, accepted to CVPR2021.

Sphere Confidence Face (SCF) This repository contains the PyTorch implementation of Sphere Confidence Face (SCF) proposed in the CVPR2021 paper: Shen

Maths 70 Dec 09, 2022
(ICCV 2021) ProHMR - Probabilistic Modeling for Human Mesh Recovery

ProHMR - Probabilistic Modeling for Human Mesh Recovery Code repository for the paper: Probabilistic Modeling for Human Mesh Recovery Nikos Kolotouros

Nikos Kolotouros 209 Dec 13, 2022
Lyapunov-guided Deep Reinforcement Learning for Stable Online Computation Offloading in Mobile-Edge Computing Networks

PyTorch code to reproduce LyDROO algorithm [1], which is an online computation offloading algorithm to maximize the network data processing capability subject to the long-term data queue stability an

Liang HUANG 87 Dec 28, 2022
Bayesian-Torch is a library of neural network layers and utilities extending the core of PyTorch to enable the user to perform stochastic variational inference in Bayesian deep neural networks

Bayesian-Torch is a library of neural network layers and utilities extending the core of PyTorch to enable the user to perform stochastic variational inference in Bayesian deep neural networks. Bayes

Intel Labs 210 Jan 04, 2023
Implementation of NÜWA, state of the art attention network for text to video synthesis, in Pytorch

NÜWA - Pytorch (wip) Implementation of NÜWA, state of the art attention network for text to video synthesis, in Pytorch. This repository will be popul

Phil Wang 463 Dec 28, 2022
EFENet: Reference-based Video Super-Resolution with Enhanced Flow Estimation

EFENet EFENet: Reference-based Video Super-Resolution with Enhanced Flow Estimation Code is a bit messy now. I woud clean up soon. For training the EF

Yaping Zhao 19 Nov 05, 2022
Read and write layered TIFF ImageSourceData and ImageResources tags

Read and write layered TIFF ImageSourceData and ImageResources tags Psdtags is a Python library to read and write the Adobe Photoshop(r) specific Imag

Christoph Gohlke 4 Feb 05, 2022
Rename Images with Auto Generated Neural Image Captions

Recaption Images with Generated Neural Image Caption Example Usage: Commandline: Recaption all images from folder /home/feng/Downloads/images to folde

feng wang 3 May 01, 2022