Main Results on ImageNet with Pretrained Models

Last update: Dec 14, 2022

Overview

This repository contains PyTorch evaluation code, training code, and pretrained models for the following projects:

SPACH (A Battle of Network Structures: An Empirical Study of CNN, Transformer, and MLP)
sMLP (Sparse MLP for Image Recognition: Is Self-Attention Really Necessary?)
ShiftViT (When Shift Operation Meets Vision Transformer: An Extremely Simple Alternative to Attention Mechanism)

Main Results on ImageNet with Pretrained Models

name	acc@1	#params	FLOPs	url
SPACH-Conv-MS-S	81.6	44M	7.2G	github
SPACH-Trans-MS-S	82.9	40M	7.6G	github
SPACH-MLP-MS-S	82.1	46M	8.2G	github
SPACH-Hybrid-MS-S	83.7	63M	11.2G	github
SPACH-Hybrid-MS-S+	83.9	63M	12.3G	github
sMLPNet-T	81.9	24M	5.0G
sMLPNet-S	83.1	49M	10.3G	github
sMLPNet-B	83.4	66M	14.0G	github
Shift-T / light	79.4	20M	3.0G	github
Shift-T	81.7	29M	4.5G	github
Shift-S / light	81.6	34M	5.7G	github
Shift-S	82.8	50M	8.8G	github

Usage

Install

First, clone the repo and install requirements:

git clone https://github.com/microsoft/Spach
cd Spach
pip install -r requirements.txt

Data preparation

Download and extract the ImageNet train and val images from http://image-net.org/. The directory structure follows the standard layout for torchvision's datasets.ImageFolder: training data is expected in the train/ folder and validation data in the val/ folder:

/path/to/imagenet/
  train/
    class1/
      img1.jpeg
    class2/
      img2.jpeg
  val/
    class1/
      img3.jpeg
    class2/
      img4.jpeg
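
Before launching a long run, it can help to confirm that the layout above is in place. The following is a minimal sketch using only the Python standard library; the function name check_imagefolder_layout is hypothetical and is not part of this repository. It only checks that train/ and val/ exist and contain the same class folders, and does not inspect the images themselves.

```python
from pathlib import Path


def check_imagefolder_layout(root):
    """Return a list of problems found under root; an empty list means the layout looks right."""
    root = Path(root)
    problems = []
    classes = {}
    for split in ("train", "val"):
        split_dir = root / split
        if not split_dir.is_dir():
            problems.append(f"missing directory: {split}/")
            continue
        # Each class is a sub-directory holding that class's images.
        classes[split] = {p.name for p in split_dir.iterdir() if p.is_dir()}
        if not classes[split]:
            problems.append(f"no class folders in {split}/")
    # Only compare class names when both splits were found.
    if len(classes) == 2 and classes["train"] != classes["val"]:
        diff = sorted(classes["train"] ^ classes["val"])
        problems.append(f"class folders differ between train/ and val/: {diff}")
    return problems
```

Calling check_imagefolder_layout("/path/to/imagenet") returns an empty list when both splits are present and share the same class folders.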

Evaluation

To evaluate a pre-trained model on ImageNet val with a single GPU, run:

python main.py --eval --resume <checkpoint> --model <model-name> --data-path <imagenet-path>

For example, to evaluate the SPACH-Hybrid-MS-S model, run

python main.py --eval --resume spach_ms_hybrid_s.pth --model spach_ms_s_patch4_224_hybrid --data-path <imagenet-path>

giving

* Acc@1 83.658 Acc@5 96.762 loss 0.688

You can find all supported models in models/registry.py.
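
The two accuracy figures in the example output are read here as top-1 and top-5 accuracy, which is an assumption about the evaluation script rather than something this README states. Under that assumption, the metric can be sketched in plain Python as follows; topk_accuracy and the toy scores are hypothetical and are not taken from this repository.

```python
def topk_accuracy(scores, targets, ks=(1, 5)):
    """Percentage of samples whose true label is among the k highest-scoring classes."""
    hits = {k: 0 for k in ks}
    for row, target in zip(scores, targets):
        # Class indices sorted from highest score to lowest.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        for k in ks:
            if target in ranked[:k]:
                hits[k] += 1
    return {k: 100.0 * hits[k] / len(targets) for k in ks}


# Toy example: three samples, three classes (made-up scores).
scores = [
    [0.1, 0.7, 0.2],  # highest score: class 1
    [0.5, 0.3, 0.2],  # highest score: class 0
    [0.2, 0.5, 0.3],  # highest score: class 1
]
targets = [1, 2, 2]
result = topk_accuracy(scores, targets, ks=(1, 2))
```

In the toy data only the first sample is correct at k=1, while the first and third are correct at k=2, so top-k accuracy can only stay the same or grow as k increases.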

Training

Run the following script to start training. Distributed training is recommended even on a single-GPU node.

python -m torch.distributed.launch --nproc_per_node <num-of-gpus-to-use> --use_env main.py \
--model <model-name> \
--data-path <imagenet-path> \
--output_dir <output-path> \
--dist-eval
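
With --use_env, torch.distributed.launch passes each process its local rank through the LOCAL_RANK environment variable instead of a --local_rank argument, alongside RANK and WORLD_SIZE. The sketch below shows how a training script might read these values. It assumes main.py follows this convention (as DeiT-style scripts do); the helper name read_dist_env is hypothetical and is not part of this repository.

```python
import os


def read_dist_env(env=None):
    """Read the per-process settings that the launcher exports as environment variables."""
    env = os.environ if env is None else env
    if "RANK" not in env or "WORLD_SIZE" not in env:
        # Not started by a distributed launcher: fall back to one process.
        return {"distributed": False, "rank": 0, "world_size": 1, "local_rank": 0}
    return {
        "distributed": True,
        "rank": int(env["RANK"]),              # global index of this process
        "world_size": int(env["WORLD_SIZE"]),  # total number of processes
        "local_rank": int(env.get("LOCAL_RANK", 0)),  # index on this node
    }
```

Passing a plain dict instead of os.environ makes the helper easy to try without launching any processes.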

Citation

@article{zhao2021battle,
  title={A Battle of Network Structures: An Empirical Study of CNN, Transformer, and MLP},
  author={Zhao, Yucheng and Wang, Guangting and Tang, Chuanxin and Luo, Chong and Zeng, Wenjun and Zha, Zheng-Jun},
  journal={arXiv preprint arXiv:2108.13002},
  year={2021}
}

@article{tang2021sparse,
  title={Sparse MLP for Image Recognition: Is Self-Attention Really Necessary?},
  author={Tang, Chuanxin and Zhao, Yucheng and Wang, Guangting and Luo, Chong and Xie, Wenxuan and Zeng, Wenjun},
  journal={arXiv preprint arXiv:2109.05422},
  year={2021}
}

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Acknowledgement

Our code is built on top of DeiT. We test throughput following Swin Transformer.

Comments

Shift features implementation

Hi, very interesting research. I wonder why you implemented shift_feature as a memory copy https://github.com/microsoft/SPACH/blob/497c1d86fffd9d48e26c0484fb845ff04c328cca/models/shiftvit.py#L107 instead of using the Tensor.roll operation? It would make your block much faster. Another benefit would be that pixels from one side would leak to the other, giving the network a way to pass information from one boundary to another, which seems a better option than duplicating the last row during each shift.
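
The two behaviours contrasted in this comment can be illustrated on a single row of values. This is a one-dimensional toy sketch in plain Python, not the code in models/shiftvit.py; copy_shift and roll_shift are hypothetical names, and roll_shift mimics what Tensor.roll does along one dimension.

```python
def copy_shift(row, step=1):
    """Shift right by `step` via slice copy; the leading entries keep their old values.

    Assumes 0 < step < len(row).
    """
    out = list(row)
    out[step:] = row[:-step]
    return out


def roll_shift(row, step=1):
    """Shift right by `step` with wrap-around, like Tensor.roll along one dimension.

    Assumes 0 < step < len(row).
    """
    return row[-step:] + row[:-step]


row = [1, 2, 3, 4]
copied = copy_shift(row)  # boundary value duplicated: [1, 1, 2, 3]
rolled = roll_shift(row)  # last value wraps to the front: [4, 1, 2, 3]
```

With the copy, the boundary entry appears twice and the value pushed off the far edge is lost; with the roll, that value reappears at the opposite boundary.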

opened by bonlime 3
Add: unofficial implementation

Hey folks,

It would be great if this repository could also hold links to unofficial implementations. I am proposing a Keras tutorial on ShiftViT.

opened by ariG23498 0
The configuration of the architecture variants is inconsistent with the papers and weights files.

@tangchuanxin

https://github.com/microsoft/SPACH/blob/497c1d86fffd9d48e26c0484fb845ff04c328cca/models/registry.py#L224

The code is inconsistent with the content of the paper [figure not preserved] and with the weight file. The content of this pth file is the same as the architecture variant -S in that figure, i.e., depths=(6, 8, 18, 6).

https://github.com/microsoft/SPACH/releases/download/v1.0/shiftvit_tiny_r2.pth

opened by lartpang 1

Releases(v1.0)

v1.0(Nov 19, 2021)

Owner

Microsoft
