MaskGIT-pytorch

PyTorch implementation of MaskGIT: Masked Generative Image Transformer (https://arxiv.org/pdf/2202.04200.pdf)

Note: this is a work in progress.

MaskGIT extends VQGAN by improving the second-stage transformer while leaving the first stage untouched. It replaces the unidirectional transformer with a bidirectional one. Second-stage training is similar to BERT: tokens are randomly masked out and the bidirectional transformer is trained to predict them (the original VQGAN work used a GPT architecture and randomly replaced tokens with other tokens). Unlike BERT, the masking percentage is not fixed; it is drawn uniformly between 0 and 1 for each batch. Furthermore, the paper proposes a new inference algorithm that starts from a completely masked-out image and then iteratively samples tokens (codebook vectors) at the positions where the model has high confidence.
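The two ideas in this paragraph can be illustrated with a small framework-free sketch. It is not the code in transformer.py: `MASK_ID`, `mask_tokens`, `iterative_decode`, and `toy_predict` are hypothetical names, the model is replaced by a deterministic stand-in, and the number of tokens revealed per step follows a simple linear schedule chosen only for illustration.

```python
import math
import random

MASK_ID = -1  # hypothetical id reserved for the mask token (assumption)


def mask_tokens(tokens, rng):
    """Replace a random subset of tokens with MASK_ID.

    The masking ratio is drawn uniformly from [0, 1); during training it
    would be drawn anew for each batch.
    """
    ratio = rng.random()
    n_mask = math.ceil(ratio * len(tokens))
    positions = set(rng.sample(range(len(tokens)), n_mask))
    masked = [MASK_ID if i in positions else t for i, t in enumerate(tokens)]
    return masked, positions


def iterative_decode(predict, length, steps):
    """Start fully masked and fill in the most confident positions each step.

    `predict` stands in for the bidirectional transformer: it takes the
    current token list and returns one (token, confidence) pair per position.
    """
    tokens = [MASK_ID] * length
    for step in range(1, steps + 1):
        masked = [i for i, t in enumerate(tokens) if t == MASK_ID]
        if not masked:
            break
        # Linear schedule (assumption): after step k, ceil(length * k / steps)
        # tokens should be known.
        target_known = math.ceil(length * step / steps)
        n_fill = target_known - (length - len(masked))
        preds = predict(tokens)
        # Rank the still-masked positions by confidence, highest first.
        ranked = sorted(masked, key=lambda i: preds[i][1], reverse=True)
        for i in ranked[:n_fill]:
            tokens[i] = preds[i][0]
    return tokens


def toy_predict(tokens):
    """Deterministic stand-in for the model (hypothetical, for illustration)."""
    return [(i % 8, ((i * 37) % 101) / 101.0) for i in range(len(tokens))]


masked, positions = mask_tokens(list(range(16)), random.Random(0))
decoded = iterative_decode(toy_predict, length=16, steps=4)
```

In a real training loop the loss would be computed only at the masked positions, and `predict` would be a forward pass through the trained transformer.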

If you are only interested in the part of the code that comes from this paper, check out transformer.py.

Run the code

The code is ready for training both the VQGAN and the Bidirectional Transformer, and can also be used for inference.

python training_vqgan.py

python training_transformer.py

(Make sure to edit the path for the dataset etc.)

TODO

Implement the gamma functions
Implement functions for image editing tasks: inpainting, extrapolation, image manipulation
Tune hyperparameters
(Provide visual results)
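For the first TODO item, the gamma functions are the mask scheduling functions from the MaskGIT paper: they map decoding progress r in [0, 1] to the fraction of tokens that should remain masked, falling from 1 to 0. The sketch below shows a few candidate shapes (linear, cosine, square, cubic). The function names and the `mode` argument are hypothetical, and which schedule this repository will adopt is not stated here.

```python
import math


def gamma(r, mode="cosine"):
    """Fraction of tokens left masked at progress r, with 0 <= r <= 1.

    Every mode satisfies gamma(0) = 1 and gamma(1) = 0 (up to rounding)
    and decreases in between.
    """
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must lie in [0, 1]")
    if mode == "linear":
        return 1.0 - r
    if mode == "cosine":
        return math.cos(r * math.pi / 2)
    if mode == "square":
        return 1.0 - r ** 2
    if mode == "cubic":
        return 1.0 - r ** 3
    raise ValueError(f"unknown mode: {mode}")


def num_masked(step, total_steps, num_tokens, mode="cosine"):
    """Number of tokens to keep masked after `step` of `total_steps` steps."""
    return math.floor(gamma(step / total_steps, mode) * num_tokens)


# Masked-token counts over 8 decoding steps for a 16x16 token grid.
schedule = [num_masked(t, 8, 256) for t in range(9)]
```

The same function could also supply the training-time masking ratio by feeding it a uniformly sampled r, though that use is an assumption rather than something this README specifies.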


Owner

Dominic Rampas
