PyTorch Implementation of Vector Quantized Variational AutoEncoders.

Overview

A PyTorch implementation of the Vector Quantized Variational AutoEncoder (VQ-VAE) from "Neural Discrete Representation Learning" (van den Oord et al., 2017).

The VQ-VAE paper combines two tricks:

  1. Vector Quantization, which maps continuous encoder outputs onto a discrete codebook.
  2. The Straight-Through gradient estimator, which solves the problem of back-propagating through discrete latent variables, where gradients are otherwise undefined (a minimal sketch follows this list).
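
As a rough illustration of the second trick, here is a minimal PyTorch sketch of the straight-through estimator. The tensor names z_e and z_q are illustrative stand-ins for the encoder output and its quantized version, not code from this repository:

```python
import torch

z_e = torch.randn(8, 64, requires_grad=True)  # encoder output z_e(x)
z_q = torch.randn(8, 64)                      # nearest codebook embeddings z_q(x)

# Forward pass uses z_q; backward pass copies gradients from z_q
# straight to z_e, skipping the non-differentiable nearest-neighbour lookup.
z_q_st = z_e + (z_q - z_e).detach()

z_q_st.sum().backward()
print(z_e.grad.shape)  # torch.Size([8, 64]) -- gradients reach the encoder output
```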

Architecture

This model has a neural-network encoder and decoder, and a prior, just like the vanilla Variational AutoEncoder (VAE). But it also has a latent embedding space called the codebook (size: K x D), where K is the number of embeddings in the codebook and D is the dimension of each embedding vector e.

In a vanilla VAE, the encoder output z(x) parameterizes a Normal/Gaussian distribution, which is sampled from (via the 'reparameterization trick') to get a latent representation z of the input x; this latent representation is then passed to the decoder. In a VQ-VAE, however, the encoder output z_e(x) is used as a "key" for a nearest-neighbour lookup into the codebook, yielding z_q(x), the closest embedding in the space. This is the Vector Quantization (VQ) operation. z_q(x) is then passed to the decoder, which reconstructs the input x. The decoder can either parameterize p(x|z) as the mean of a Normal distribution using transposed convolution layers, as in a vanilla VAE, or autoregressively generate a categorical distribution over the pixel values [0, 255], as in PixelCNN. This project uses the first approach.
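
A minimal PyTorch sketch of the VQ operation described above (the function name, shapes, and sizes are illustrative; the repository's actual implementation may differ):

```python
import torch

def vector_quantize(z_e, codebook):
    """Nearest-neighbour lookup of encoder outputs in the codebook.

    z_e:      (N, D) encoder outputs z_e(x)
    codebook: (K, D) embedding table
    returns:  (N, D) quantized vectors z_q(x)
    """
    distances = torch.cdist(z_e, codebook) ** 2  # (N, K) squared distances
    indices = distances.argmin(dim=1)            # index of the closest embedding
    return codebook[indices]

codebook = torch.randn(512, 64)  # K = 512 embeddings of dimension D = 64
z_e = torch.randn(8, 64)
z_q = vector_quantize(z_e, codebook)
print(z_q.shape)  # torch.Size([8, 64])
```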

The loss function combines three components:

  1. Reconstruction loss
  2. Vector Quantization loss
  3. Commitment loss

The Vector Quantization loss, ||sg[z_e(x)] - e||^2, encourages the codebook embeddings to move closer to the encoder output, and the Commitment loss, ||z_e(x) - sg[e]||^2, encourages the encoder output to stay close to the embedding it picked, i.e. to commit to its codebook embedding. The commitment loss is multiplied by a constant beta, which is 1.0 in this project. Here, sg means "stop-gradient": gradients are not propagated through that term.
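
The three terms can be combined as in the following sketch, where .detach() plays the role of the stop-gradient sg (the function name and signature are illustrative, not this repository's API):

```python
import torch.nn.functional as F

def vqvae_loss(x, x_recon, z_e, z_q, beta=1.0):
    """Combined VQ-VAE objective; tensor names are illustrative."""
    recon_loss = F.mse_loss(x_recon, x)              # reconstruction term
    vq_loss = F.mse_loss(z_q, z_e.detach())          # ||sg[z_e(x)] - e||^2
    commitment_loss = F.mse_loss(z_e, z_q.detach())  # ||z_e(x) - sg[e]||^2
    return recon_loss + vq_loss + beta * commitment_loss
```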

Results:

The model is trained on the MNIST and CIFAR10 datasets.

[Target 👉 reconstructed image GIFs for MNIST and CIFAR10]

Details:

  1. Trained models for MNIST and CIFAR10 are in the Trained models directory.
  2. The hidden size of the bottleneck z is 128 for MNIST and 256 for CIFAR10.