Official implementation of DreamerPro: Reconstruction-Free Model-Based Reinforcement Learning with Prototypical Representations in TensorFlow 2

Overview

DreamerPro

Official implementation of DreamerPro: Reconstruction-Free Model-Based Reinforcement Learning with Prototypical Representations in TensorFlow 2. A re-implementation of Temporal Predictive Coding for Model-Based Planning in Latent Space is also included.

DreamerPro makes large performance gains on the DeepMind Control suite both in the standard setting and when there are complex background distractions. This is achieved by combining Dreamer with prototypical representations that free the world model from reconstructing visual details.

Setup

Dependencies

First clone the repository, and then set up a conda environment with all required dependencies using the requirements.txt file:

git clone https://github.com/fdeng18/dreamer-pro.git
cd dreamer-pro
conda create --name dreamer-pro python=3.8 conda-forge::cudatoolkit conda-forge::cudnn
conda activate dreamer-pro
pip install --upgrade pip
pip install -r requirements.txt

DreamerPro has not been tested on Atari, but if you would like to try, the Atari ROMs can be imported by following these instructions.

Natural background videos

Our natural background setting follows TPC. For convenience, we have included their code to download the background videos. Simply run:

python download_videos.py

This will download the background videos into kinetics400/videos.

Training

DreamerPro

For standard DMC, run:

cd DreamerPro
python dreamerv2/train.py --logdir log/dmc_{task}/dreamer_pro/{run} --task dmc_{task} --configs defaults dmc norm_off

Here, {task} should be replaced by the actual task, and {run} should be assigned an integer indicating the independent runs of the same model on the same task. For example, to start the first run on walker_run:

cd DreamerPro
python dreamerv2/train.py --logdir log/dmc_walker_run/dreamer_pro/1 --task dmc_walker_run --configs defaults dmc norm_off

For natural background DMC, run:

cd DreamerPro
python dreamerv2/train.py --logdir log/nat_{task}/dreamer_pro/{run} --task nat_{task} --configs defaults dmc reward_1000

TPC

DreamerPro is based on a newer version of Dreamer. For fair comparison, we re-implement TPC based on the same version. Our re-implementation obtains better results in the natural background setting than reported in the original TPC paper.

For standard DMC, run:

cd TPC
python dreamerv2/train.py --logdir log/dmc_{task}/tpc/{run} --task dmc_{task} --configs defaults dmc

For natural background DMC, run:

cd TPC
python dreamerv2/train.py --logdir log/nat_{task}/tpc/{run} --task nat_{task} --configs defaults dmc reward_1000

Dreamer

For standard DMC, run:

cd Dreamer
python dreamerv2/train.py --logdir log/dmc_{task}/dreamer/{run} --task dmc_{task} --configs defaults dmc

For natural background DMC, run:

cd Dreamer
python dreamerv2/train.py --logdir log/nat_{task}/dreamer/{run} --task nat_{task} --configs defaults dmc reward_1000 --precision 32

We find it necessary to use --precision 32 in the natural background setting for numerical stability.

Outputs

The training process can be monitored via TensorBoard. We have also included performance curves in plots. Note that these curves may appear different from what is shown in TensorBoard. This is because the evaluation return in the performance curves is averaged over 10 episodes, while TensorBoard only shows the evaluation return of the last episode.

Acknowledgments

This repository is largely based on the TensorFlow 2 implementation of Dreamer. We would like to thank Danijar Hafner for releasing and updating his clean implementation. In addition, we also greatly appreciate the help from Tung Nguyen in implementing TPC.

Single-step adversarial training (AT) has received wide attention as it proved to be both efficient and robust.

Subspace Adversarial Training Single-step adversarial training (AT) has received wide attention as it proved to be both efficient and robust. However,

15 Sep 02, 2022
Unofficial pytorch implementation of the paper "Dynamic High-Pass Filtering and Multi-Spectral Attention for Image Super-Resolution"

DFSA Unofficial pytorch implementation of the ICCV 2021 paper "Dynamic High-Pass Filtering and Multi-Spectral Attention for Image Super-Resolution" (p

2 Nov 15, 2021
Structure Information is the Key: Self-Attention RoI Feature Extractor in 3D Object Detection

Structure Information is the Key: Self-Attention RoI Feature Extractor in 3D Object Detection abstract:Unlike 2D object detection where all RoI featur

DK. Zhang 2 Oct 07, 2022
Code for the submitted paper Surrogate-based cross-correlation for particle image velocimetry

Surrogate-based cross-correlation (SBCC) This repository contains code for the submitted paper Surrogate-based cross-correlation for particle image ve

5 Jun 30, 2022
Trading Gym is an open source project for the development of reinforcement learning algorithms in the context of trading.

Trading Gym Trading Gym is an open-source project for the development of reinforcement learning algorithms in the context of trading. It is currently

Dimitry Foures 535 Nov 15, 2022
🐦 Quickly annotate data from the comfort of your Jupyter notebook

🐦 pigeon - Quickly annotate data on Jupyter Pigeon is a simple widget that lets you quickly annotate a dataset of unlabeled examples from the comfort

Anastasis Germanidis 647 Jan 05, 2023
Repository for the paper "Online Domain Adaptation for Occupancy Mapping", RSS 2020

RSS 2020 - Online Domain Adaptation for Occupancy Mapping Repository for the paper "Online Domain Adaptation for Occupancy Mapping", Robotics: Science

Anthony 26 Sep 22, 2022
Pytorch implementation of XRD spectral identification from COD database

XRDidentifier Pytorch implementation of XRD spectral identification from COD database. Details will be explained in the paper to be submitted to NeurI

Masaki Adachi 4 Jan 07, 2023
TraSw for FairMOT - A Single-Target Attack example (Attack ID: 19; Screener ID: 24):

TraSw for FairMOT A Single-Target Attack example (Attack ID: 19; Screener ID: 24): Fig.1 Original Fig.2 Attacked By perturbing only two frames in this

Derry Lin 21 Dec 21, 2022
This is an early in-development version of training CLIP models with hivemind.

A transformer that does not hog your GPU memory This is an early in-development codebase: if you want a stable and documented hivemind codebase, look

<a href=[email protected]"> 4 Nov 06, 2022
End-to-end speech secognition toolkit

End-to-end speech secognition toolkit This is an E2E ASR toolkit modified from Espnet1 (version 0.9.9). This is the official implementation of paper:

Jinchuan Tian 147 Dec 28, 2022
PyTorch version repo for CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes

Study-CSRNet-pytorch This is the PyTorch version repo for CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes

0 Mar 01, 2022
Implementing SYNTHESIZER: Rethinking Self-Attention in Transformer Models using Pytorch

Implementing SYNTHESIZER: Rethinking Self-Attention in Transformer Models using Pytorch Reference Paper URL Author: Yi Tay, Dara Bahri, Donald Metzler

Myeongjun Kim 66 Nov 30, 2022
Reading list for research topics in Masked Image Modeling

awesome-MIM Reading list for research topics in Masked Image Modeling(MIM). We list the most popular methods for MIM, if I missed something, please su

ligang 231 Dec 07, 2022
Hamiltonian Dynamics with Non-Newtonian Momentum for Rapid Sampling

Hamiltonian Dynamics with Non-Newtonian Momentum for Rapid Sampling Code for the paper: Greg Ver Steeg and Aram Galstyan. "Hamiltonian Dynamics with N

Greg Ver Steeg 25 Mar 14, 2022
PyTorch for Semantic Segmentation

PyTorch for Semantic Segmentation This repository contains some models for semantic segmentation and the pipeline of training and testing models, impl

Zijun Deng 1.7k Jan 06, 2023
B2EA: An Evolutionary Algorithm Assisted by Two Bayesian Optimization Modules for Neural Architecture Search

B2EA: An Evolutionary Algorithm Assisted by Two Bayesian Optimization Modules for Neural Architecture Search This is the offical implementation of the

SNU ADSL 0 Feb 07, 2022
KIDA: Knowledge Inheritance in Data Aggregation

KIDA: Knowledge Inheritance in Data Aggregation This project releases our 1st place solution on NeurIPS2021 ML4CO Dual Task. Slide and model weights a

24 Sep 08, 2022
Implementation and replication of ProGen, Language Modeling for Protein Generation, in Jax

ProGen - (wip) Implementation and replication of ProGen, Language Modeling for Protein Generation, in Pytorch and Jax (the weights will be made easily

Phil Wang 71 Dec 01, 2022
Official implementation for paper Knowledge Bridging for Empathetic Dialogue Generation (AAAI 2021).

Knowledge Bridging for Empathetic Dialogue Generation This is the official implementation for paper Knowledge Bridging for Empathetic Dialogue Generat

Qintong Li 50 Dec 20, 2022