Official implementation for NIPS'17 paper: PredRNN: Recurrent Neural Networks for Predictive Learning Using Spatiotemporal LSTMs.

Last update: Dec 26, 2022

Overview

PredRNN: A Recurrent Neural Network for Spatiotemporal Predictive Learning

The predictive learning of spatiotemporal sequences aims to generate future images by learning from the historical context, where the visual dynamics are believed to have modular structures that can be learned with compositional subsystems.

First version at NeurIPS 2017

This repo contains a PyTorch implementation of PredRNN (2017) [paper], a recurrent network with a pair of memory cells that operate in nearly independent transition manners and finally form unified representations of the complex environment.

Concretely, besides the original memory cell of LSTM, this network features a zigzag memory flow that propagates in both bottom-up and top-down directions across all layers, enabling the visual dynamics learned at different levels of the RNN to communicate.
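
The zigzag route can be traced with a small, framework-free sketch. It assumes the flow described in the PredRNN paper: within one time step the spatiotemporal memory climbs from the bottom layer to the top layer, and the top-layer memory is then handed to the bottom layer at the next time step. The function name `zigzag_memory_route` and the (time step, layer) tuples are illustrative and do not come from this repo.

```python
def zigzag_memory_route(num_steps, num_layers):
    """Return, for each (t, layer) cell, the cell that supplies its memory M.

    Assumed flow: M moves bottom-up inside one time step, then the top
    layer at step t feeds the bottom layer at step t + 1.
    """
    route = []
    previous = None  # the very first cell has no predecessor
    for t in range(num_steps):
        for layer in range(num_layers):
            route.append(((t, layer), previous))
            previous = (t, layer)
    return route


if __name__ == "__main__":
    for cell, source in zigzag_memory_route(num_steps=2, num_layers=3):
        print(f"M at {cell} comes from {source}")
```

With two time steps and three layers, the bottom cell of the second step receives its memory from the top cell of the first step, which is the top-down leg of the zigzag.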

New in PredRNN-V2 (2021)

This repo also includes the implementation of PredRNN-V2 (2021) [paper], which improves PredRNN in the following two aspects.

1. Memory Decoupling

We find that the pair of memory cells in PredRNN contains undesirable, redundant features, and thus propose a memory decoupling loss that encourages the two cells to learn modular structures of visual dynamics.
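
As a rough illustration of what such a loss can measure, the sketch below computes the mean absolute cosine similarity, channel by channel, between the increments of the two memory cells. It is based on the description in the PredRNN-V2 paper and is not the repo's code: it leaves out the shared convolution applied before the comparison, and the names `abs_cosine`, `decoupling_loss`, `delta_c`, and `delta_m` are placeholders.

```python
import math


def abs_cosine(u, v, eps=1e-12):
    """Absolute cosine similarity between two flat vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return abs(dot) / max(norm_u * norm_v, eps)


def decoupling_loss(delta_c, delta_m):
    """Mean absolute cosine similarity over channels.

    delta_c, delta_m: lists of channels; each channel is a flat list holding
    the flattened spatial increment of one memory cell at one time step.
    """
    per_channel = [abs_cosine(c, m) for c, m in zip(delta_c, delta_m)]
    return sum(per_channel) / len(per_channel)


if __name__ == "__main__":
    # Orthogonal increments are fully decoupled; parallel ones are redundant.
    print(decoupling_loss([[1.0, 0.0]], [[0.0, 1.0]]))
    print(decoupling_loss([[1.0, 2.0]], [[2.0, 4.0]]))
```

The value is near zero when the two increments are orthogonal and near one when they point in the same or opposite direction, so minimizing it pushes the cells toward different features.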

2. Reverse Scheduled Sampling

Reverse scheduled sampling is a new curriculum learning strategy for seq-to-seq RNNs. As opposed to scheduled sampling, it gradually changes the training process of the PredRNN encoder from using the previously generated frame to using the previous ground truth. Benefits:
(1) It makes training converge quickly by reducing the encoder-forecaster training gap.
(2) It forces the model to learn more from the long-term input context.
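
A minimal sketch of the idea follows, assuming an exponential schedule in which the probability of feeding the ground-truth frame to the encoder rises from a start value toward an end value as training proceeds. The function names, the default `start`, `end`, and `alpha` values, and the list-based frame representation are all hypothetical and are not taken from this repo.

```python
import math
import random


def reverse_sampling_prob(step, start=0.5, end=1.0, alpha=5000.0):
    """Probability of feeding the ground-truth frame to the encoder.

    Rises exponentially from `start` toward `end`; the default values are
    placeholders, not the repo's settings.
    """
    return end - (end - start) * math.exp(-step / alpha)


def choose_encoder_inputs(truth, generated, step, rng):
    """For each time step, pick the true frame with probability eps,
    otherwise the frame the model generated itself."""
    eps = reverse_sampling_prob(step)
    return [t if rng.random() < eps else g for t, g in zip(truth, generated)]


if __name__ == "__main__":
    rng = random.Random(0)
    truth = ["x1", "x2", "x3", "x4"]
    generated = ["g1", "g2", "g3", "g4"]
    for step in (0, 5000, 50000):
        print(step, choose_encoder_inputs(truth, generated, step, rng))
```

Early in training the encoder often sees its own predictions; late in training it sees almost only ground truth, which is the reverse of the usual scheduled sampling direction.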

Evaluation in LPIPS

LPIPS is more sensitive to human perceptual judgments. Lower is better.

Model	Moving MNIST	KTH action
PredRNN	0.109	0.204
PredRNN-V2	0.071	0.139
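
To put the table in relative terms, the short calculation below derives how much lower the PredRNN-V2 scores are than the PredRNN scores. It uses only the four numbers reported above; the helper name `relative_reduction` is illustrative. The reductions come out at roughly 35% on Moving MNIST and 32% on KTH action.

```python
lpips = {
    "Moving MNIST": {"PredRNN": 0.109, "PredRNN-V2": 0.071},
    "KTH action": {"PredRNN": 0.204, "PredRNN-V2": 0.139},
}


def relative_reduction(old, new):
    """Fraction by which `new` is lower than `old`."""
    return (old - new) / old


for dataset, scores in lpips.items():
    drop = relative_reduction(scores["PredRNN"], scores["PredRNN-V2"])
    print(f"{dataset}: {drop:.1%} lower LPIPS with PredRNN-V2")
```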

Get Started

1. Install Python 3.7, PyTorch 1.3, and OpenCV 3.4.
2. Download data. This repo contains code for two datasets: the Moving MNIST dataset and the KTH action dataset.
3. Train the model. You can use the following bash scripts to train the model. The learned model will be saved in the --save_dir folder, and the generated future frames will be saved in the --gen_frm_dir folder.

# Moving MNIST
cd mnist_script/
sh predrnn_mnist_train.sh
sh predrnn_v2_mnist_train.sh

# KTH action
cd kth_script/
sh predrnn_kth_train.sh
sh predrnn_v2_kth_train.sh

4. You can get pretrained models from here.
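
Before launching a training script, it can help to confirm which versions of the dependencies are actually installed. The sketch below is a hypothetical helper, not part of this repo. It assumes the packages are importable as `torch` and `cv2` and that each exposes a `__version__` attribute.

```python
import importlib
import sys


def installed_versions(modules=("torch", "cv2")):
    """Map each module name to its __version__, or None if it cannot be imported."""
    versions = {}
    for name in modules:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
        else:
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    for name, version in installed_versions().items():
        print(name, version if version is not None else "not installed")
```

Compare the printed versions with the ones listed in step 1 before training.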

Citation

If you find this repo useful, please cite the following papers.

@inproceedings{wang2017predrnn,
  title={{PredRNN}: Recurrent Neural Networks for Predictive Learning Using Spatiotemporal {LSTM}s},
  author={Wang, Yunbo and Long, Mingsheng and Wang, Jianmin and Gao, Zhifeng and Yu, Philip S},
  booktitle={Advances in Neural Information Processing Systems},
  pages={879--888},
  year={2017}
}

@misc{wang2021predrnn,
      title={{PredRNN}: A Recurrent Neural Network for Spatiotemporal Predictive Learning}, 
      author={Wang, Yunbo and Wu, Haixu and Zhang, Jianjin and Gao, Zhifeng and Wang, Jianmin and Yu, Philip S and Long, Mingsheng},
      year={2021},
      eprint={2103.09504},
      archivePrefix={arXiv},
}

Owner

THUML: Machine Learning Group @ THSS
