Pytorch implementation of Masked Auto-Encoder

Related tags

Deep LearningMAE-code
Overview

Masked Auto-Encoder (MAE)

Pytorch implementation of Masked Auto-Encoder:

Usage

  1. Clone to the local.
> git clone https://github.com/liujiyuan13/MAE-code.git MAE-code
  1. Install required packages.
> cd MAE-code
> pip install requirements.txt
  1. Prepare datasets.
  • For Cifar10, Cifar100 and STL, skip this step for it will be done automatically;
  • For ImageNet1K, download and unzip the train(val) set into ./data/ImageNet1K/train(val).
  1. Set parameters.
  • All parameters are kept in default_args() function of main_mae(eval).py file.
  1. Run the code.
> python main_mae.py	# train MAE encoder
> python main_eval.py	# evaluate MAE encoder
  1. Visualize the ouput.
> tensorboard --logdir=./log --port 8888

Detail

Project structure

...
+ ckpt				# checkpoint
+ data 				# data folder
+ img 				# store images for README.md
+ log 				# log files
.gitignore 			
lars.py 			# LARS optimizer
main_eval.py 			# main file for evaluation
main_mae.py  			# main file for MAE training
model.py 			# model definitions of MAE and EvalNet
README.md 
util.py 			# helper functions
vit.py 				# definition of vision transformer

Encoder setting

In the paper, ViT-Base, ViT-Large and ViT-Huge are used. You can switch between them by simply changing the parameters in default_args(). Details can be found here and are listed in following table.

Name Layer Num. Hidden Size MLP Size Head Num.
Arg vit_depth vit_dim vit_mlp_dim vit_heads
ViT-B 12 768 3072 12
ViT-L 24 1024 4096 16
ViT-H 32 1280 5120 16

Evaluation setting

I implement four network training strategies concerned in the paper, including

  • pre-training is used to train MAE encoder and done in main_mae.py.
  • linear probing is used to evaluate MAE encoder. During training, MAE encoder is fixed.
    • args.n_partial = 0
  • partial fine-tuning is used to evaluate MAE encoder. During training, MAE encoder is partially fixed.
    • args.n_partial = 0.5 --> fine-tuning MLP sub-block with the transformer fixed
    • 1<=args.n_partial<=args.vit_depth-1 --> fine-tuning MLP sub-block and last layers of transformer
  • end-to-end fine-tuning is used to evaluate MAE encoder. During training, MAE encoder is fully trainable.
    • args.n_partial = args.vit_depth

Note that the last three strategies are done in main_eval.py where parameter args.n_partial is located.

At the same time, I follow the parameter settings in the paper appendix. Note that partial fine-tuning and end-to-end fine-tuning use the same setting. Nevertheless, I replace RandAug(9, 0.5) with RandomResizedCrop and leave mixup, cutmix and drop path techniques in further implementation.

Result

The experiment reproduce will takes a long time and I am unfortunately busy these days. If you get some results and are willing to contribute, please reach me via email. Thanks!

By the way, I have run the code from start to end. It works! So don't worry about the implementation errors. If you find any, please raise issues or email me.

Licence

This repository is under GPL V3.

About

Thanks project vit-pytorch, pytorch-lars and DeepLearningExamples for their codes contribute to this repository a lot!

Homepage: https://liujiyuan13.github.io

Email: [email protected]

Owner
Jiyuan
Jiyuan
CCPD: a diverse and well-annotated dataset for license plate detection and recognition

CCPD (Chinese City Parking Dataset, ECCV) UPdate on 10/03/2019. CCPD Dataset is now updated. We are confident that images in subsets of CCPD is much m

detectRecog 1.8k Dec 30, 2022
A curated list of awesome projects and resources related fastai

A curated list of awesome projects and resources related fastai

Tanishq Abraham 138 Dec 22, 2022
OpenFace – a state-of-the art tool intended for facial landmark detection, head pose estimation, facial action unit recognition, and eye-gaze estimation.

OpenFace 2.2.0: a facial behavior analysis toolkit Over the past few years, there has been an increased interest in automatic facial behavior analysis

Tadas Baltrusaitis 5.8k Dec 31, 2022
High-Resolution Image Synthesis with Latent Diffusion Models

Latent Diffusion Models Requirements A suitable conda environment named ldm can be created and activated with: conda env create -f environment.yaml co

CompVis Heidelberg 5.6k Jan 04, 2023
fklearn: Functional Machine Learning

fklearn: Functional Machine Learning fklearn uses functional programming principles to make it easier to solve real problems with Machine Learning. Th

nubank 1.4k Dec 07, 2022
'Solving the sampling problem of the Sycamore quantum supremacy circuits

solve_sycamore This repo contains data, contraction code, and contraction order for the paper ''Solving the sampling problem of the Sycamore quantum s

Feng Pan 29 Nov 28, 2022
Object Detection using YOLO from PyImageSearch

Object Detection using YOLO from PyImageSearch By applying object detection, you’ll not only be able to determine what is in an image, but also where

Mohamed NIANG 1 Feb 09, 2022
When Does Pretraining Help? Assessing Self-Supervised Learning for Law and the CaseHOLD Dataset of 53,000+ Legal Holdings

When Does Pretraining Help? Assessing Self-Supervised Learning for Law and the CaseHOLD Dataset of 53,000+ Legal Holdings This is the repository for t

RegLab 39 Jan 07, 2023
PyTorch 1.5 implementation for paper DECOR-GAN: 3D Shape Detailization by Conditional Refinement.

DECOR-GAN PyTorch 1.5 implementation for paper DECOR-GAN: 3D Shape Detailization by Conditional Refinement, Zhiqin Chen, Vladimir G. Kim, Matthew Fish

Zhiqin Chen 72 Dec 31, 2022
Implementation for HFGI: High-Fidelity GAN Inversion for Image Attribute Editing

HFGI: High-Fidelity GAN Inversion for Image Attribute Editing High-Fidelity GAN Inversion for Image Attribute Editing Update: We released the inferenc

Tengfei Wang 371 Dec 30, 2022
Deep-Learning-Book-Chapter-Summaries - Attempting to make the Deep Learning Book easier to understand.

Deep-Learning-Book-Chapter-Summaries This repository provides a summary for each chapter of the Deep Learning book by Ian Goodfellow, Yoshua Bengio an

Aman Dalmia 1k Dec 27, 2022
ML models implementation practice

Let's implement various ML algorithms with numpy/tf Vanilla Neural Network https://towardsdatascience.com/lets-code-a-neural-network-in-plain-numpy-ae

Jinsoo Heo 4 Jul 04, 2021
Awesome Long-Tailed Learning

Awesome Long-Tailed Learning This repo pays specially attention to the long-tailed distribution, where labels follow a long-tailed or power-law distri

Stomach_ache 284 Jan 06, 2023
Software & Hardware to do multi color printing with Sharpies

3D Print Colorizer is a combination of 3D printed parts and a Cura plugin which allows anyone with an Ender 3 like 3D printer to produce multi colored

343 Jan 06, 2023
CoReD: Generalizing Fake Media Detection with Continual Representation using Distillation (ACMMM'21 Oral Paper)

CoReD: Generalizing Fake Media Detection with Continual Representation using Distillation (ACMMM'21 Oral Paper) (Accepted for oral presentation at ACM

Minha Kim 1 Nov 12, 2021
[BMVC 2021] Official PyTorch Implementation of Self-supervised learning of Image Scale and Orientation Estimation

Self-Supervised Learning of Image Scale and Orientation Estimation (BMVC 2021) This is the official implementation of the paper "Self-Supervised Learn

Jongmin Lee 17 Nov 10, 2022
Pytorch implementation of Hinton's Dynamic Routing Between Capsules

pytorch-capsule A Pytorch implementation of Hinton's "Dynamic Routing Between Capsules". https://arxiv.org/pdf/1710.09829.pdf Thanks to @naturomics fo

Tim Omernick 625 Oct 27, 2022
Official PyTorch implementation of "Improving Face Recognition with Large AgeGaps by Learning to Distinguish Children" (BMVC 2021)

Inter-Prototype (BMVC 2021): Official Project Webpage This repository provides the official PyTorch implementation of the following paper: Improving F

Jungsoo Lee 16 Jun 30, 2022
Image processing in Python

scikit-image: Image processing in Python Website (including documentation): https://scikit-image.org/ Mailing list: https://mail.python.org/mailman3/l

Image Processing Toolbox for SciPy 5.2k Dec 31, 2022
Using pytorch to implement unet network for liver image segmentation.

Using pytorch to implement unet network for liver image segmentation.

zxq 1 Dec 17, 2021