Clockwork Variational Autoencoders (CW-VAE)

Vaibhav Saxena, Jimmy Ba, Danijar Hafner

If you find this code useful, please reference in your paper:

@article{saxena2021clockworkvae,
  title={Clockwork Variational Autoencoders}, 
  author={Saxena, Vaibhav and Ba, Jimmy and Hafner, Danijar},
  journal={arXiv preprint arXiv:2102.09532},
  year={2021},
}

Method

Clockwork VAEs are deep generative model that learn long-term dependencies in video by leveraging hierarchies of representations that progress at different clock speeds. In contrast to prior video prediction methods that typically focus on predicting sharp but short sequences in the future, Clockwork VAEs can accurately predict high-level content, such as object positions and identities, for 1000 frames.

Clockwork VAEs build upon the Recurrent State Space Model (RSSM), so each state contains a deterministic component for long-term memory and a stochastic component for sampling diverse plausible futures. Clockwork VAEs are trained end-to-end to optimize the evidence lower bound (ELBO) that consists of a reconstruction term for each image and a KL regularizer for each stochastic variable in the model.

More information:

Instructions

This repository contains the code for training the Clockwork VAE model on the datasets minerl, mazes, and mmnist.

The datasets will automatically be downloaded into the --datadir directory.

python3 train.py --logdir /path/to/logdir --datadir /path/to/datasets --config configs/<dataset>.yml

The evaluation script writes open-loop video predictions in both PNG and NPZ format and plots of PSNR and SSIM to the data directory.

python3 eval.py --logdir /path/to/logdir

Clockwork Variational Autoencoder

Related tags

Overview

Clockwork Variational Autoencoders (CW-VAE)

Method

Instructions

Owner

Vaibhav Saxena

Official code for Score-Based Generative Modeling through Stochastic Differential Equations

Python/Rust implementations and notes from Proofs Arguments and Zero Knowledge

Multi-query Video Retreival

Jupyter notebooks for using & learning Keras

A whale detector design for the Kaggle whale-detector challenge!

Research on Tabular Deep Learning (Python package & papers)

Sound Event Detection with FilterAugment

QuickAI is a Python library that makes it extremely easy to experiment with state-of-the-art Machine Learning models.

Official PyTorch implementation of "Rapid Neural Architecture Search by Learning to Generate Graphs from Datasets" (ICLR 2021)

Space Ship Simulator using python

Image-retrieval-baseline - MUGE Multimodal Retrieval Baseline

Romanian Automatic Speech Recognition from the ROBIN project

Model Zoo for MindSpore

ArcaneGAN by Alex Spirin

Boundary IoU API (Beta version)

Live Hand Tracking Using Python

The repo for the paper "I3CL: Intra- and Inter-Instance Collaborative Learning for Arbitrary-shaped Scene Text Detection".

Must-read Papers on Physics-Informed Neural Networks.

CondLaneNet: a Top-to-down Lane Detection Framework Based on Conditional Convolution

Federated_learning codes used for the the paper "Evaluation of Federated Learning Aggregation Algorithms" and "A Federated Learning Aggregation Algorithm for Pervasive Computing: Evaluation and Comparison"