CLOOB training (JAX) and inference (JAX and PyTorch)

Last update: Nov 27, 2022

cloob-training

Pretrained models

This repo currently provides two pretrained CLOOB models: ViT-B/16 checkpoints trained on LAION 400M for 16 and 32 epochs respectively.

Zero-shot ImageNet validation set accuracy (using OpenCLIP's code):

Model name	Top 1	Top 5
cloob_laion_400m_vit_b_16_16_epochs	0.61238	0.8492
cloob_laion_400m_vit_b_16_32_epochs	0.62816	0.85964
OpenAI CLIP ViT-B/32	0.6327	0.88772
OpenAI CLIP ViT-B/16	0.68132	0.91768
OpenAI CLIP ViT-L/14	0.75388	0.9454
OpenAI CLIP ViT-L/14 @ 336 px	0.76564	0.9515
OpenAI CLIP RN50	0.59806	0.86498
OpenAI CLIP RN101	0.62296	0.88106
OpenAI CLIP RN50x4	0.66268	0.9046
OpenAI CLIP RN50x16	0.70754	0.92822
OpenAI CLIP RN50x64	0.74134	0.94146
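For reference, top-1 and top-5 accuracy are the fraction of validation images whose true class is among the 1 or 5 highest-scoring classes. The following is a minimal pure-Python sketch of that metric, not OpenCLIP's evaluation code; the function name `topk_accuracy` and the toy scores are illustrative assumptions.

```python
def topk_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label is among the k highest scores.

    scores: list of per-sample score lists (one score per class).
    labels: list of true class indices, same length as scores.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not scores:
        raise ValueError("need at least one sample")
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores in this row.
        topk = sorted(range(len(row)), key=row.__getitem__, reverse=True)[:k]
        hits += label in topk
    return hits / len(labels)


# Toy data (hypothetical): 4 samples, 3 classes.
scores = [
    [0.1, 0.7, 0.2],  # true class 1: top-1 hit
    [0.5, 0.3, 0.2],  # true class 1: top-1 miss, top-2 hit
    [0.6, 0.3, 0.1],  # true class 2: miss for k <= 2
    [0.2, 0.2, 0.6],  # true class 2: top-1 hit
]
labels = [1, 1, 2, 2]
print(topk_accuracy(scores, labels, k=1))  # 0.5
print(topk_accuracy(scores, labels, k=2))  # 0.75
```

In a zero-shot setting, each row of `scores` would hold the similarities between one image embedding and the text embeddings of the class prompts.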

PyTorch

from cloob_training import model_pt, pretrained

pretrained.list_configs()

returns:

['cloob_laion_400m_vit_b_16_16_epochs', 'cloob_laion_400m_vit_b_16_32_epochs']

To load a pretrained model:

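# Look up the config for one of the names returned by list_configs()
config = pretrained.get_config('cloob_laion_400m_vit_b_16_16_epochs')
model = model_pt.get_pt_model(config)
# Download the checkpoint and load its parameters into the PyTorch model
checkpoint = pretrained.download_checkpoint(config)
model.load_state_dict(model_pt.get_pt_params(config, checkpoint))
# Switch to eval mode, freeze the parameters, and move the model to the GPU
model.eval().requires_grad_(False).to('cuda')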

Model class attributes:

model.config: the model config dict.

model.image_encoder: the image encoder, which expects NCHW batches of normalized images (preprocessed by model.normalize), where C = model.config['image_encoder']['input_channels'] and H, W = model.config['image_encoder']['image_size'].

model.text_encoder: the text encoder, which expects text tokenized by model.tokenize.

model.normalize: the preprocessor for image tensors.

model.tokenize: the preprocessor for text.
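The attributes above can be combined to score images against text. The sketch below is one possible way to do this, under several assumptions that the README does not confirm: that `model.tokenize` accepts a list of strings and returns a tensor (as CLIP's tokenizer does), that both encoders return one embedding per input, and that cosine similarity is the intended comparison. The helper name `image_text_similarity` is hypothetical.

```python
import torch
import torch.nn.functional as F


@torch.no_grad()
def image_text_similarity(model, images, texts):
    """Cosine similarity between each image and each text.

    images: float tensor of shape (N, C, H, W), not yet normalized.
    texts: list of strings (assumes model.tokenize accepts a list).
    Returns a tensor of shape (N, len(texts)).
    """
    # Assumption: model.tokenize returns a tensor that can be moved to a device.
    tokens = model.tokenize(texts).to(images.device)
    image_embeds = model.image_encoder(model.normalize(images))
    text_embeds = model.text_encoder(tokens)
    # L2-normalize so that the dot product is a cosine similarity.
    image_embeds = F.normalize(image_embeds, dim=-1)
    text_embeds = F.normalize(text_embeds, dim=-1)
    return image_embeds @ text_embeds.T
```

With a loaded model, a call might look like `image_text_similarity(model, images.to('cuda'), ['a photo of a cat', 'a photo of a dog'])`, where `images` is a batch matching the channel count and image size in `model.config['image_encoder']`.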

JAX

Coming soon...

Training (JAX only)

Coming soon...

Owner

Katherine Crowson
