Pytorch implementation of Masked Auto-Encoder

Related tags

Deep LearningMAE-code
Overview

Masked Auto-Encoder (MAE)

Pytorch implementation of Masked Auto-Encoder:

Usage

  1. Clone to the local.
> git clone https://github.com/liujiyuan13/MAE-code.git MAE-code
  1. Install required packages.
> cd MAE-code
> pip install requirements.txt
  1. Prepare datasets.
  • For Cifar10, Cifar100 and STL, skip this step for it will be done automatically;
  • For ImageNet1K, download and unzip the train(val) set into ./data/ImageNet1K/train(val).
  1. Set parameters.
  • All parameters are kept in default_args() function of main_mae(eval).py file.
  1. Run the code.
> python main_mae.py	# train MAE encoder
> python main_eval.py	# evaluate MAE encoder
  1. Visualize the ouput.
> tensorboard --logdir=./log --port 8888

Detail

Project structure

...
+ ckpt				# checkpoint
+ data 				# data folder
+ img 				# store images for README.md
+ log 				# log files
.gitignore 			
lars.py 			# LARS optimizer
main_eval.py 			# main file for evaluation
main_mae.py  			# main file for MAE training
model.py 			# model definitions of MAE and EvalNet
README.md 
util.py 			# helper functions
vit.py 				# definition of vision transformer

Encoder setting

In the paper, ViT-Base, ViT-Large and ViT-Huge are used. You can switch between them by simply changing the parameters in default_args(). Details can be found here and are listed in following table.

Name Layer Num. Hidden Size MLP Size Head Num.
Arg vit_depth vit_dim vit_mlp_dim vit_heads
ViT-B 12 768 3072 12
ViT-L 24 1024 4096 16
ViT-H 32 1280 5120 16

Evaluation setting

I implement four network training strategies concerned in the paper, including

  • pre-training is used to train MAE encoder and done in main_mae.py.
  • linear probing is used to evaluate MAE encoder. During training, MAE encoder is fixed.
    • args.n_partial = 0
  • partial fine-tuning is used to evaluate MAE encoder. During training, MAE encoder is partially fixed.
    • args.n_partial = 0.5 --> fine-tuning MLP sub-block with the transformer fixed
    • 1<=args.n_partial<=args.vit_depth-1 --> fine-tuning MLP sub-block and last layers of transformer
  • end-to-end fine-tuning is used to evaluate MAE encoder. During training, MAE encoder is fully trainable.
    • args.n_partial = args.vit_depth

Note that the last three strategies are done in main_eval.py where parameter args.n_partial is located.

At the same time, I follow the parameter settings in the paper appendix. Note that partial fine-tuning and end-to-end fine-tuning use the same setting. Nevertheless, I replace RandAug(9, 0.5) with RandomResizedCrop and leave mixup, cutmix and drop path techniques in further implementation.

Result

The experiment reproduce will takes a long time and I am unfortunately busy these days. If you get some results and are willing to contribute, please reach me via email. Thanks!

By the way, I have run the code from start to end. It works! So don't worry about the implementation errors. If you find any, please raise issues or email me.

Licence

This repository is under GPL V3.

About

Thanks project vit-pytorch, pytorch-lars and DeepLearningExamples for their codes contribute to this repository a lot!

Homepage: https://liujiyuan13.github.io

Email: [email protected]

Owner
Jiyuan
Jiyuan
Code for Fully Context-Aware Image Inpainting with a Learned Semantic Pyramid

SPN: Fully Context-Aware Image Inpainting with a Learned Semantic Pyramid Code for Fully Context-Aware Image Inpainting with a Learned Semantic Pyrami

12 Jun 27, 2022
Forest R-CNN: Large-Vocabulary Long-Tailed Object Detection and Instance Segmentation (ACM MM 2020)

Forest R-CNN: Large-Vocabulary Long-Tailed Object Detection and Instance Segmentation (ACM MM 2020) Official implementation of: Forest R-CNN: Large-Vo

Jialian Wu 54 Jan 06, 2023
Intel® Neural Compressor is an open-source Python library running on Intel CPUs and GPUs

Intel® Neural Compressor targeting to provide unified APIs for network compression technologies, such as low precision quantization, sparsity, pruning, knowledge distillation, across different deep l

Intel Corporation 846 Jan 04, 2023
Code accompanying paper: Meta-Learning to Improve Pre-Training

Meta-Learning to Improve Pre-Training This folder contains code to run experiments in the paper Meta-Learning to Improve Pre-Training, NeurIPS 2021. P

28 Dec 31, 2022
This repository accompanies our paper “Do Prompt-Based Models Really Understand the Meaning of Their Prompts?”

This repository accompanies our paper “Do Prompt-Based Models Really Understand the Meaning of Their Prompts?” Usage To replicate our results in Secti

Albert Webson 64 Dec 11, 2022
PyTorch implementation for our paper "Deep Facial Synthesis: A New Challenge"

FSGAN Here is the official PyTorch implementation for our paper "Deep Facial Synthesis: A New Challenge". This project achieve the translation between

Deng-Ping Fan 32 Oct 10, 2022
Self-Supervised Pillar Motion Learning for Autonomous Driving (CVPR 2021)

Self-Supervised Pillar Motion Learning for Autonomous Driving Chenxu Luo, Xiaodong Yang, Alan Yuille Self-Supervised Pillar Motion Learning for Autono

QCraft 101 Dec 05, 2022
Single Image Random Dot Stereogram for Tensorflow

TensorFlow-SIRDS Single Image Random Dot Stereogram for Tensorflow SIRDS is a means to present 3D data in a 2D image. It allows for scientific data di

Greg Peatfield 5 Aug 10, 2022
3DIAS: 3D Shape Reconstruction with Implicit Algebraic Surfaces (ICCV 2021)

3DIAS_Pytorch This repository contains the official code to reproduce the results from the paper: 3DIAS: 3D Shape Reconstruction with Implicit Algebra

Mohsen Yavartanoo 21 Dec 12, 2022
Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data

Real-ESRGAN Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data Ported from https://github.com/xinntao/Real-ESRGAN Depend

Holy Wu 44 Dec 27, 2022
Generate fine-tuning samples & Fine-tuning the model & Generate samples by transferring Note On

UPMT Generate fine-tuning samples & Fine-tuning the model & Generate samples by transferring Note On See main.py as an example: from model import PopM

7 Sep 01, 2022
SMCA replication There are no extra compiled components in SMCA DETR and package dependencies are minimal

Usage There are no extra compiled components in SMCA DETR and package dependencies are minimal, so the code is very simple to use. We provide instruct

22 May 06, 2022
Repositorio oficial del curso IIC2233 Programación Avanzada 🚀✨

IIC2233 - Programación Avanzada Evaluación Las evaluaciones serán efectuadas por medio de actividades prácticas en clases y tareas. Se calculará la no

IIC2233 @ UC 47 Sep 06, 2022
Complex Answer Generation For Conversational Search Systems.

Complex Answer Generation For Conversational Search Systems. Code for Does Structure Matter? Leveraging Data-to-Text Generation for Answering Complex

Hanane Djeddal 0 Dec 06, 2021
Code for the AAAI 2022 paper "Zero-Shot Cross-Lingual Machine Reading Comprehension via Inter-Sentence Dependency Graph".

multilingual-mrc-isdg Code for the AAAI 2022 paper "Zero-Shot Cross-Lingual Machine Reading Comprehension via Inter-Sentence Dependency Graph". This r

Liyan 5 Dec 07, 2022
Autonomous racing with the Anki Overdrive

Anki Autonomous Racing Autonomous racing with the Anki Overdrive. Using the Overdrive-Python API (https://github.com/xerodotc/overdrive-python) develo

3 Dec 11, 2022
[NeurIPS'21] "AugMax: Adversarial Composition of Random Augmentations for Robust Training" by Haotao Wang, Chaowei Xiao, Jean Kossaifi, Zhiding Yu, Animashree Anandkumar, and Zhangyang Wang.

[NeurIPS'21] "AugMax: Adversarial Composition of Random Augmentations for Robust Training" by Haotao Wang, Chaowei Xiao, Jean Kossaifi, Zhiding Yu, Animashree Anandkumar, and Zhangyang Wang.

VITA 112 Nov 07, 2022
FPGA: Fast Patch-Free Global Learning Framework for Fully End-to-End Hyperspectral Image Classification

FPGA & FreeNet Fast Patch-Free Global Learning Framework for Fully End-to-End Hyperspectral Image Classification by Zhuo Zheng, Yanfei Zhong, Ailong M

Zhuo Zheng 92 Jan 03, 2023
PyTorch inference for "Progressive Growing of GANs" with CelebA snapshot

Progressive Growing of GANs inference in PyTorch with CelebA training snapshot Description This is an inference sample written in PyTorch of the origi

320 Nov 21, 2022
Transfer Learning library for Deep Neural Networks.

Transfer and meta-learning in Python Each folder in this repository corresponds to a method or tool for transfer/meta-learning. xfer-ml is a standalon

Amazon 245 Dec 08, 2022