MLP-Mixer-Pytorch

PyTorch implementation of MLP-Mixer: An all-MLP Architecture for Vision with the function of loading official ImageNet pre-trained parameters.

Usage

import torch
import numpy as np
from mlp_mixer import MlpMixer

pretrain_model='./pretrain_models/imagenet21k_Mixer-B_16.npz'

model = MlpMixer(num_classes=10, 
                 num_blocks=12, 
                 patch_size=16, 
                 hidden_dim=768, 
                 tokens_mlp_dim=384, 
                 channels_mlp_dim=3072, 
                 image_size=224
                 )

# load official ImageNet pre-trained model:
model.load_from(np.load(pretrain_model))
print ('Finish loading the pre-trained model!')

num_param = sum(p.numel() for p in model.parameters()) / 1e6
print ('Total params.: %f M'%num_param)

pred = model(img)

Fine-tuning

Download the official pre-trained models at https://console.cloud.google.com/storage/mixer_models/.

Hypyer-parameters setting for better fine-tuning:

optim = torch.optim.SGD(param_list, 
                        lr=5e-4, 
                        weight_decay=1e-7,
                        momentum=0.9, 
                        nesterov=True
                        )
lr_schdlr = WarmupCosineLrScheduler(optim, 
                                    n_iters_all, 
                                    warmup_iter=0
                                    )

Using the pre-trained model to fine-tune MLP-Mixer can obtain remarkable improvements (e.g., +10% accuracy on a small dataset).

Note that we can also change the patch_size (e.g., patch_size=8) for inputs with different resolutions, but smaller patch_size may not always bring performance improvements.

Citation

@misc{tolstikhin2021mlpmixer,
      title={MLP-Mixer: An all-MLP Architecture for Vision}, 
      author={Ilya Tolstikhin and Neil Houlsby and Alexander Kolesnikov and Lucas Beyer and Xiaohua Zhai and Thomas Unterthiner and Jessica Yung and Daniel Keysers and Jakob Uszkoreit and Mario Lucic and Alexey Dosovitskiy},
      year={2021},
      eprint={2105.01601},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Acknowledgement

The implementation is based on the original paper and the official Tensorflow repo: https://github.com/google-research/vision_transformer.
It also refers to the re-implementation repo: https://github.com/d-li14/mlp-mixer.pytorch.

Pytorch implementation of MLP-Mixer with loading pre-trained models.

Related tags

Overview

MLP-Mixer-Pytorch

Usage

Fine-tuning

Citation

Acknowledgement

Owner

Qiushi Yang

Mixed Neural Likelihood Estimation for models of decision-making

This is the PyTorch implementation of GANs N’ Roses: Stable, Controllable, Diverse Image to Image Translation

Official repository of OFA. Paper: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

paper: Hyperspectral Remote Sensing Image Classification Using Deep Convolutional Capsule Network

Python scripts form performing stereo depth estimation using the HITNET model in ONNX.

Multiple style transfer via variational autoencoder

A custom DeepStack model for detecting 16 human actions.

Kindle is an easy model build package for PyTorch.

PyTorch code for our paper "Image Super-Resolution with Non-Local Sparse Attention" (CVPR2021).

Source code for Acorn, the precision farming rover by Twisted Fields

Underwater image enhancement

Mesh TensorFlow: Model Parallelism Made Easier

HPRNet: Hierarchical Point Regression for Whole-Body Human Pose Estimation

Deep metric learning methods implemented in Chainer

Robust and Accurate Object Detection via Self-Knowledge Distillation

Unofficial Implement PU-Transformer

Codes and models for the paper "Learning Unknown from Correlations: Graph Neural Network for Inter-novel-protein Interaction Prediction".

Video Swin Transformer - PyTorch

[CVPR 2016] Unsupervised Feature Learning by Image Inpainting using GANs

This PyTorch package implements MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation (NAACL 2022).