LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference

Last update: Jan 02, 2023

This repository contains PyTorch evaluation code, training code and pretrained models for LeViT.

LeViT models obtain competitive trade-offs between speed and precision.

For details see LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference by Benjamin Graham, Alaaeldin El-Nouby, Hugo Touvron, Pierre Stock, Armand Joulin, Hervé Jégou and Matthijs Douze.

If you use this code for a paper please cite:

@article{graham2021levit,
  title={LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference},
  author={Benjamin Graham and Alaaeldin El-Nouby and Hugo Touvron and Pierre Stock and Armand Joulin and Herv\'e J\'egou and Matthijs Douze},
  journal={arXiv preprint arXiv:2104.01136},
  year={2021}
}

Model Zoo

We provide baseline LeViT models trained with distillation on ImageNet 2012.

name	acc@1	acc@5	#FLOPs	#params	url
LeViT-128S	76.6	92.9	305M	7.8M	model
LeViT-128	78.6	94.0	406M	9.2M	model
LeViT-192	80.0	94.7	658M	11M	model
LeViT-256	81.6	95.4	1120M	19M	model
LeViT-384	82.6	96.0	2353M	39M	model
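
The table can be read as an accuracy/compute menu. As a minimal sketch (the helper name `best_under_budget` and the budget values are illustrative and not part of the repository), the following picks the most accurate model whose FLOP count fits a given budget, using the figures from the table above. FLOPs are only a rough proxy for measured inference speed.

```python
# Figures copied from the Model Zoo table:
# (name, acc@1, acc@5, FLOPs in millions, parameters in millions).
MODEL_ZOO = [
    ("LeViT-128S", 76.6, 92.9, 305, 7.8),
    ("LeViT-128", 78.6, 94.0, 406, 9.2),
    ("LeViT-192", 80.0, 94.7, 658, 11.0),
    ("LeViT-256", 81.6, 95.4, 1120, 19.0),
    ("LeViT-384", 82.6, 96.0, 2353, 39.0),
]


def best_under_budget(max_mflops):
    """Return the entry with the highest acc@1 whose FLOPs fit the budget, or None."""
    candidates = [m for m in MODEL_ZOO if m[3] <= max_mflops]
    if not candidates:
        return None
    return max(candidates, key=lambda m: m[1])


if __name__ == "__main__":
    # Hypothetical budgets, in millions of FLOPs.
    for budget in (300, 700, 2500):
        print(budget, best_under_budget(budget))
```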

Usage

First, clone the repository locally:

git clone https://github.com/facebookresearch/levit.git

Then install PyTorch 1.7.0+, torchvision 0.8.1+ and pytorch-image-models (timm):

conda install -c pytorch pytorch torchvision
pip install timm

Data preparation

Download and extract the ImageNet train and val images from http://image-net.org/. The directory structure is the standard layout for torchvision's datasets.ImageFolder, with the training and validation data expected in the train/ and val/ folders respectively:

/path/to/imagenet/
  train/
    class1/
      img1.jpeg
    class2/
      img2.jpeg
  val/
    class1/
      img3.jpeg
    class2/
      img4.jpeg
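
Before launching a long run, it can help to sanity-check that the dataset directory matches the layout above. The following is a minimal standard-library sketch; the function name `check_imagefolder_layout` and the specific checks are illustrative assumptions, not something the repository provides.

```python
from pathlib import Path


def check_imagefolder_layout(root):
    """Return a list of problems found under root; an empty list means the layout looks fine."""
    root = Path(root)
    problems = []
    classes = {}
    for split in ("train", "val"):
        split_dir = root / split
        if not split_dir.is_dir():
            problems.append(f"missing directory: {split}/")
            continue
        # Each class is a subdirectory that should hold at least one file.
        names = set()
        for class_dir in sorted(p for p in split_dir.iterdir() if p.is_dir()):
            names.add(class_dir.name)
            if not any(f.is_file() for f in class_dir.iterdir()):
                problems.append(f"empty class directory: {split}/{class_dir.name}/")
        if not names:
            problems.append(f"no class directories in {split}/")
        classes[split] = names
    # Both splits should expose the same class names.
    if len(classes) == 2 and classes["train"] != classes["val"]:
        diff = sorted(classes["train"] ^ classes["val"])
        problems.append(f"class mismatch between train/ and val/: {diff}")
    return problems


if __name__ == "__main__":
    # Placeholder path, as in the commands in this README.
    for problem in check_imagefolder_layout("/path/to/imagenet"):
        print(problem)
```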

Evaluation

To evaluate a pre-trained LeViT-256 model on the ImageNet validation set with a single GPU, run:

python main.py --eval --model LeViT_256 --data-path /path/to/imagenet

This should give

* Acc@1 81.636 Acc@5 95.424 loss 0.750
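
Acc@1 and Acc@5 are top-1 and top-5 accuracy: the percentage of validation images whose true label is the highest-scoring class, or among the five highest-scoring classes. A minimal pure-Python sketch of that computation follows; the function name and the toy scores are made up for illustration, and this is not the repository's evaluation code.

```python
def topk_accuracy(scores, labels, k=1):
    """Percentage of samples whose true label is among the k highest-scoring classes."""
    if not labels or len(scores) != len(labels):
        raise ValueError("scores and labels must be non-empty and of equal length")
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores for this sample.
        topk = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in topk
    return 100.0 * hits / len(labels)


# Toy example: 4 samples, 3 classes (made-up scores).
scores = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
    [0.6, 0.3, 0.1],
]
labels = [1, 1, 2, 2]

if __name__ == "__main__":
    print(topk_accuracy(scores, labels, k=1))
    print(topk_accuracy(scores, labels, k=2))
```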

Training

To train LeViT-256 on ImageNet with hard distillation on a single node with 8 GPUs, run:

python -m torch.distributed.launch --nproc_per_node=8 --use_env main.py --model LeViT_256 --data-path /path/to/imagenet --output_dir /path/to/save
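
For intuition about "hard distillation": in the DeiT formulation, which this sketch assumes LeViT follows, the student has a classification head and a distillation head. The first is trained against the ground-truth label, the second against the teacher's predicted class (its arg-max), and the two cross-entropy terms are mixed. The pure-Python sketch below illustrates that loss for a single sample. The function names and the default weight of 0.5 are assumptions for illustration, not code or settings taken from this repository.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy of one sample: -log softmax(logits)[target], computed stably."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target]


def hard_distillation_loss(cls_logits, dist_logits, teacher_logits, label, alpha=0.5):
    """Mix the label loss on the class head with the teacher-label loss on the distillation head."""
    # "Hard" distillation: the teacher contributes only its predicted class,
    # not its full output distribution.
    teacher_label = max(range(len(teacher_logits)), key=lambda i: teacher_logits[i])
    base = cross_entropy(cls_logits, label)
    distill = cross_entropy(dist_logits, teacher_label)
    return (1.0 - alpha) * base + alpha * distill
```

With `alpha=0.0` the teacher is ignored and the loss reduces to plain cross-entropy on the classification head.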

Multinode training

Distributed training is available via Slurm and submitit:

pip install submitit

To train a LeViT-256 model on ImageNet on one node with 8 GPUs:

python run_with_submitit.py --model LeViT_256 --data-path /path/to/imagenet

License

This repository is released under the Apache 2.0 license as found in the LICENSE file.

Contributing

We actively welcome your pull requests! Please see CONTRIBUTING.md and CODE_OF_CONDUCT.md for more info.

Owner

Facebook Research
