A simple, unofficial implementation of MAE using pytorch-lightning

Last update: Dec 03, 2022

Overview

Masked Autoencoders in PyTorch

A simple, unofficial implementation of MAE (Masked Autoencoders are Scalable Vision Learners) using pytorch-lightning.

Currently implements training on CUB and StanfordCars, but is easily extensible to any other image dataset.

Setup

# Clone the repository
git clone https://github.com/catalys1/mae-pytorch.git
cd mae-pytorch

# Install required libraries (inside a virtual environment preferably)
pip install -r requirements.txt

# Set up .env for path to data
echo "DATADIR=/path/to/data" > .env
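
To illustrate what the `.env` entry provides, here is a minimal sketch that parses `KEY=VALUE` lines using only the standard library. The project itself lists python-dotenv for this job; `read_env_file` is a hypothetical helper, not part of the repository, and it ignores the quoting and `export` syntax that python-dotenv supports.

```python
import tempfile
from pathlib import Path


def read_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in Path(path).read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


# Demonstrate on a throwaway file rather than the real .env.
with tempfile.TemporaryDirectory() as tmp:
    env_path = Path(tmp) / ".env"
    env_path.write_text("# path to datasets\nDATADIR=/path/to/data\n")
    env = read_env_file(env_path)

data_dir = env["DATADIR"]  # root directory the datasets would be read from
print(data_dir)
```

In the actual project, python-dotenv's `load_dotenv()` would presumably place `DATADIR` into `os.environ` instead of returning a dictionary.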

Usage

MAE training

Training options are provided through configuration files, handled by LightningCLI. See configs/ for examples.

Train an MAE model on the CUB dataset:

python train.py fit --config=configs/mae.yaml --config=configs/data/cub_mae.yaml

Using multiple GPUs:

python train.py fit --config=configs/mae.yaml --config=configs/data/cub_mae.yaml --config=configs/multigpu.yaml
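
When several `--config` files are given, LightningCLI applies them in order, so values in later files override those in earlier ones. The sketch below mimics that behaviour with a recursive dictionary merge. It is a simplification: the real parsing is done by jsonargparse, which also handles class paths and type checking, and the dictionary contents shown are hypothetical stand-ins for the files under configs/.

```python
import copy


def merge_configs(base, override):
    """Recursively merge `override` into a copy of `base`."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = merge_configs(merged[key], value)
        else:
            # Otherwise the later value wins.
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical contents, for illustration only.
mae = {"trainer": {"max_epochs": 100, "devices": 1}}
data = {"data": {"name": "cub"}}
multigpu = {"trainer": {"devices": 4, "strategy": "ddp"}}

config = {}
for part in (mae, data, multigpu):  # same order as on the command line
    config = merge_configs(config, part)

print(config)
```

Under this model, the multi-GPU file only needs to list the trainer settings it changes; everything else is inherited from the earlier files.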

Fine-tuning

Not yet implemented.

Implementation

The default model uses ViT-Base for the encoder, and a small ViT (depth=4, width=192) for the decoder. This configuration is smaller than the default model in the paper, which uses a ViT-Large encoder and a decoder with depth=8 and width=512.
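
The core idea of MAE is that the encoder only sees a random subset of image patches, while the decoder reconstructs the rest. The sketch below shows per-image random masking in plain Python. It is an illustration, not the repository's code: `random_masking` is a hypothetical name, and the 224x224 input size, 16x16 patch size, and 75% mask ratio are assumptions taken from the paper's defaults rather than from this project's configs.

```python
import random


def random_masking(num_patches, mask_ratio, rng):
    """Return (keep_ids, mask) for one image.

    keep_ids: indices of the visible patches passed to the encoder.
    mask: one 0/1 flag per patch, where 1 marks a masked patch.
    """
    num_keep = int(num_patches * (1 - mask_ratio))
    noise = [rng.random() for _ in range(num_patches)]
    # Sorting patch indices by random noise yields a random permutation.
    shuffled = sorted(range(num_patches), key=noise.__getitem__)
    keep_ids = sorted(shuffled[:num_keep])
    visible = set(keep_ids)
    mask = [0 if i in visible else 1 for i in range(num_patches)]
    return keep_ids, mask


# Assumed values: 224x224 input, 16x16 patches, 75% of patches masked.
num_patches = (224 // 16) ** 2  # 14 * 14 = 196
keep_ids, mask = random_masking(num_patches, 0.75, random.Random(0))
print(len(keep_ids), sum(mask))  # 49 visible patches, 147 masked
```

A tensor implementation would typically do the same thing for a whole batch at once by sorting a noise tensor along the patch dimension.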

Dependencies

Configuration and training are handled entirely by pytorch-lightning.
The MAE model uses the VisionTransformer from timm.
FGVC datasets are accessed through fgvcdata.
Environment variables are configured through python-dotenv.

Results

Image reconstructions of CUB validation set images after training with the following command:

python train.py fit --config=configs/mae.yaml --config=configs/data/cub_mae.yaml --config=configs/multigpu.yaml

Owner

Connor Anderson
