PyTorch implementation of ExORL: Exploratory Data for Offline Reinforcement Learning

Last update: Jan 01, 2023

Overview

ExORL: Exploratory Data for Offline Reinforcement Learning

This is an original PyTorch implementation of the ExORL framework from

Don't Change the Algorithm, Change the Data: Exploratory Data for Offline Reinforcement Learning by

Denis Yarats*, David Brandfonbrener*, Hao Liu, Misha Laskin, Pieter Abbeel, Alessandro Lazaric, and Lerrel Pinto.

*Equal contribution.

Prerequisites

Install MuJoCo if it is not already installed:

Download MuJoCo binaries here.
Unzip the downloaded archive into ~/.mujoco/.
Append the path of the MuJoCo bin subdirectory to the LD_LIBRARY_PATH environment variable.
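
As a quick sanity check for the steps above, the following sketch looks for bin directories under ~/.mujoco/ and reports any that are not listed in LD_LIBRARY_PATH. The helper name `mujoco_bin_dirs_missing_from_path` is hypothetical, and the layout `~/.mujoco/<release>/bin` is an assumption based on the unzip step; adjust it to match your installation.

```python
import os
from pathlib import Path


def mujoco_bin_dirs_missing_from_path(mujoco_root, ld_library_path):
    """Return MuJoCo bin directories that are not listed in ld_library_path.

    Assumes the archive was unzipped to <mujoco_root>/<release>/bin.
    """
    # Normalize every non-empty entry of the search path.
    listed = set()
    for entry in ld_library_path.split(os.pathsep):
        if entry:
            listed.add(Path(entry).expanduser().resolve())

    root = Path(mujoco_root).expanduser()
    missing = []
    for bin_dir in sorted(root.glob("*/bin")):
        if bin_dir.is_dir() and bin_dir.resolve() not in listed:
            missing.append(bin_dir)
    return missing


if __name__ == "__main__":
    current = os.environ.get("LD_LIBRARY_PATH", "")
    for path in mujoco_bin_dirs_missing_from_path("~/.mujoco", current):
        print(f"Not in LD_LIBRARY_PATH: {path}")
```

If the script prints nothing, every bin directory it found is already on the library path (or none was found under ~/.mujoco/).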

Install the following libraries:

sudo apt update
sudo apt install libosmesa6-dev libgl1-mesa-glx libglfw3 unzip

Install dependencies:

conda env create -f conda_env.yml
conda activate exorl

Datasets

We provide exploratory datasets for 6 DeepMind Control Suite domains:

Domain	Dataset name	Available task names
Cartpole	`cartpole`	`cartpole_balance`, `cartpole_balance_sparse`, `cartpole_swingup`, `cartpole_swingup_sparse`
Cheetah	`cheetah`	`cheetah_run`, `cheetah_run_backward`
Jaco Arm	`jaco`	`jaco_reach_top_left`, `jaco_reach_top_right`, `jaco_reach_bottom_left`, `jaco_reach_bottom_right`
Point Mass Maze	`point_mass_maze`	`point_mass_maze_reach_top_left`, `point_mass_maze_reach_top_right`, `point_mass_maze_reach_bottom_left`, `point_mass_maze_reach_bottom_right`
Quadruped	`quadruped`	`quadruped_walk`, `quadruped_run`
Walker	`walker`	`walker_stand`, `walker_walk`, `walker_run`
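
Datasets are named by domain while training runs are named by task, so it can help to map one to the other. The sketch below does this by prefix matching, using only the dataset names from the table above; the helper `dataset_for_task` is hypothetical and not part of the repository.

```python
# Dataset names copied from the table above.
DATASET_NAMES = (
    "cartpole",
    "cheetah",
    "jaco",
    "point_mass_maze",
    "quadruped",
    "walker",
)


def dataset_for_task(task):
    """Return the dataset name that the given task name starts with."""
    for name in DATASET_NAMES:
        if task.startswith(name + "_"):
            return name
    raise ValueError(f"no dataset found for task: {task}")


if __name__ == "__main__":
    print(dataset_for_task("point_mass_maze_reach_top_left"))
```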

For each domain, we collected datasets by running 9 unsupervised RL algorithms from URLB for a total of 10M steps. The algorithms are listed below:

Unsupervised RL method	Name	Paper
APS	`aps`	paper
APT(ICM)	`icm_apt`	paper
DIAYN	`diayn`	paper
Disagreement	`disagreement`	paper
ICM	`icm`	paper
ProtoRL	`proto`	paper
Random	`random`	N/A
RND	`rnd`	paper
SMM	`smm`	paper

You can download a dataset by running ./download.sh. For example, to download the ProtoRL dataset for Walker, run:

./download.sh walker proto

The script will download the dataset from S3 and store it under datasets/walker/proto/, where you can find episodes (under buffer) and episode videos (under video).
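
To check what a download produced, a minimal sketch like the one below counts the files under the buffer and video subdirectories described above. The function name `summarize_dataset` is hypothetical, and the sketch makes no assumption about the episode file format; it only counts regular files.

```python
from pathlib import Path


def summarize_dataset(root, domain, algorithm):
    """Count files under <root>/<domain>/<algorithm>/{buffer,video}."""
    dataset_dir = Path(root) / domain / algorithm
    summary = {}
    for sub in ("buffer", "video"):
        sub_dir = dataset_dir / sub
        if sub_dir.is_dir():
            summary[sub] = sum(1 for p in sub_dir.iterdir() if p.is_file())
        else:
            # Subdirectory is missing, e.g. the download has not been run.
            summary[sub] = None
    return summary


if __name__ == "__main__":
    print(summarize_dataset("datasets", "walker", "proto"))
```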

Offline RL training

We also provide implementations of 5 offline RL algorithms for evaluating the datasets:

Offline RL method	Name	Paper
Behavior Cloning	`bc`	paper
CQL	`cql`	paper
CRR	`crr`	paper
TD3+BC	`td3_bc`	paper
TD3	`td3`	paper

After downloading the required datasets, you can evaluate them using an offline RL method on a specific task. For example, to evaluate the dataset collected by ProtoRL on Walker for the walking task using TD3+BC, run:

python train_offline.py agent=td3_bc expl_agent=proto task=walker_walk
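
To evaluate one dataset with several offline RL methods, the command above can be generated in a loop. The sketch below only builds the command lines from the names listed in the tables of this README; the helper `build_command` and its validation are illustrative assumptions, not part of the repository.

```python
import shlex

# Names copied from the tables in this README.
OFFLINE_AGENTS = ("bc", "cql", "crr", "td3_bc", "td3")
EXPL_AGENTS = (
    "aps", "icm_apt", "diayn", "disagreement", "icm",
    "proto", "random", "rnd", "smm",
)


def build_command(agent, expl_agent, task):
    """Return the train_offline.py invocation as a list of arguments."""
    if agent not in OFFLINE_AGENTS:
        raise ValueError(f"unknown offline RL agent: {agent}")
    if expl_agent not in EXPL_AGENTS:
        raise ValueError(f"unknown exploration agent: {expl_agent}")
    return [
        "python", "train_offline.py",
        f"agent={agent}", f"expl_agent={expl_agent}", f"task={task}",
    ]


if __name__ == "__main__":
    # Print one command per offline RL method for the ProtoRL Walker dataset.
    for agent in OFFLINE_AGENTS:
        print(shlex.join(build_command(agent, "proto", "walker_walk")))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root to launch the run instead of printing it.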

Logs are stored in the output folder. To launch TensorBoard, run:

tensorboard --logdir output

Citation

If you use this repo in your research, please consider citing the paper as follows:

@article{yarats2022exorl,
  title={Don't Change the Algorithm, Change the Data: Exploratory Data for Offline Reinforcement Learning},
  author={Denis Yarats and David Brandfonbrener and Hao Liu and Michael Laskin and Pieter Abbeel and Alessandro Lazaric and Lerrel Pinto},
  journal={arXiv preprint arXiv:2201.13425},
  year={2022}
}

License

The majority of ExORL is licensed under the MIT license; however, portions of the project are available under separate license terms: DeepMind is licensed under the Apache 2.0 license.
