RLDS stands for Reinforcement Learning Datasets

RLDS

RLDS stands for Reinforcement Learning Datasets. It is an ecosystem of tools to store, retrieve and manipulate episodic data in the context of Sequential Decision Making, including Reinforcement Learning (RL), Learning from Demonstrations, Offline RL and Imitation Learning.

This repository includes a library for manipulating RLDS-compliant datasets. For other parts of the pipeline, please refer to:

  • EnvLogger to create synthetic datasets
  • RLDS Creator to create datasets where a human interacts with an environment.
  • TFDS for existing RL datasets.

QuickStart & Colabs

See how to use RLDS in this tutorial.

You can find more examples in the following colabs:

Dataset Format

The dataset is retrieved as a tf.data.Dataset of Episodes where each episode contains a tf.data.Dataset of steps.

  • Episode: dictionary that contains a tf.data.Dataset of Steps, and metadata.

  • Step: dictionary that contains:

    • observation: current observation
    • action: action taken in the current observation
    • reward: reward obtained after applying the action to the current observation
    • is_terminal: if this is a terminal step
    • is_first: if this is the first step of an episode that contains the initial state.
    • is_last: if this is the last step of an episode, which contains the last observation. When true, action, reward, discount and any other custom fields subsequent to the observation are considered invalid.
    • discount: discount factor at this step.
    • extra metadata

    When is_terminal = True, the observation corresponds to a final state, so reward, discount and action are meaningless. Depending on the environment, the final observation may also be meaningless.

    If an episode ends in a step where is_terminal = False, it means that this episode has been truncated. In this case, depending on the environment, the action, reward and discount might be empty as well.
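
The flag semantics above can be illustrated with a small sketch. It uses plain Python dictionaries instead of tf.data so that it runs without TensorFlow, and the helpers `make_step` and `episode_end_kind` are hypothetical names introduced here, not part of the RLDS library. The dummy action, reward and discount values are placeholders only.

```python
from typing import Any, Dict, List

Step = Dict[str, Any]


def make_step(observation: Any, is_first: bool = False, is_last: bool = False,
              is_terminal: bool = False) -> Step:
    """Builds a step dictionary with the keys listed above (dummy values)."""
    return {
        "observation": observation,
        "action": 0,       # placeholder; considered invalid when is_last is True
        "reward": 0.0,     # placeholder; considered invalid when is_last is True
        "discount": 1.0,   # placeholder; considered invalid when is_last is True
        "is_first": is_first,
        "is_last": is_last,
        "is_terminal": is_terminal,
    }


def episode_end_kind(steps: List[Step]) -> str:
    """Returns 'terminated' or 'truncated' based on the flags of the last step."""
    last = steps[-1]
    if not last["is_last"]:
        raise ValueError("the final step of an episode must have is_last=True")
    return "terminated" if last["is_terminal"] else "truncated"


terminated_episode = [
    make_step("o_0", is_first=True),
    make_step("o_1"),
    make_step("o_2", is_last=True, is_terminal=True),
]
truncated_episode = [
    make_step("o_0", is_first=True),
    make_step("o_1", is_last=True),  # ends without reaching a terminal state
]

print(episode_end_kind(terminated_episode))  # terminated
print(episode_end_kind(truncated_episode))   # truncated
```

In an actual RLDS dataset the same check would be applied to the last element of the episode's nested steps dataset.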

How to create a dataset

Although you can read datasets with the RLDS format even if they were not created with our tools (for example, by adding them to TFDS), we recommend the use of EnvLogger and RLDS Creator as they ensure that the data is stored in a lossless fashion and compatible with RLDS.

Synthetic datasets

EnvLogger provides a dm_env Environment class wrapper that records interactions between a real environment and an agent.

import envlogger

# `environment` is an existing dm_env.Environment instance.
env = envlogger.EnvLogger(
      environment,
      data_directory='/tmp/mydataset')

In addition, two callbacks can be passed to the EnvLogger constructor to store per-step metadata and per-episode metadata. See the EnvLogger documentation for more details.

Note that per-session metadata can be stored but is currently ignored when loading the dataset.

EnvLogger follows the dm_env convention. Consider the following notation:

  • o_i: observation at step i
  • a_i: action applied to o_i
  • r_i: reward obtained when applying a_i in o_i
  • d_i: discount for reward r_i
  • m_i: metadata for step i

Data is generated and stored as:

    (o_0, _, _, _, m_0) → (o_1, a_0, r_0, d_0, m_1)  → (o_2, a_1, r_1, d_1, m_2) ⇢ ...

But loaded with RLDS as:

    (o_0, a_0, r_0, d_0, m_0) → (o_1, a_1, r_1, d_1, m_1)  → (o_2, a_2, r_2, d_2, m_2) ⇢ ...
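
The re-alignment between the two layouts can be sketched in plain Python. This is only an illustration of the index shift, not the code RLDS uses internally: the function name `shift_to_rlds` is hypothetical, and filling the last step's action, reward and discount with `None` is an assumption made here to mark them as invalid.

```python
from typing import Any, List, Optional, Tuple

# (observation, action, reward, discount, metadata)
Record = Tuple[Any, Optional[Any], Optional[float], Optional[float], Any]


def shift_to_rlds(stored: List[Record]) -> List[Record]:
    """Pairs each observation with the action, reward and discount stored in the next record."""
    aligned: List[Record] = []
    for i, (observation, _, _, _, metadata) in enumerate(stored):
        if i + 1 < len(stored):
            # Under the dm_env convention, record i+1 holds a_i, r_i and d_i.
            _, action, reward, discount, _ = stored[i + 1]
        else:
            # Last step: no subsequent record, so these fields are invalid.
            action = reward = discount = None
        aligned.append((observation, action, reward, discount, metadata))
    return aligned


stored = [
    ("o_0", None, None, None, "m_0"),
    ("o_1", "a_0", 0.0, 1.0, "m_1"),
    ("o_2", "a_1", 1.0, 0.0, "m_2"),
]
aligned = shift_to_rlds(stored)
for record in aligned:
    print(record)
```

Observations and metadata keep their original index, while actions, rewards and discounts move one step earlier.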

Human datasets

If you want to collect data generated by a human interacting with an environment, check the RLDS Creator.

How to load a dataset

RL datasets can be loaded with TFDS and they are retrieved with the canonical RLDS dataset format.

See this section for instructions on how to add an RLDS dataset to TFDS.

Load with TFDS

Datasets in the TFDS catalog

These datasets can be loaded directly with:

tfds.load('dataset_name')['train']

This is how we load the datasets in the tutorial.

See the full documentation and the catalog on the TFDS site.

Datasets in your own repository

Datasets can be implemented with TFDS both inside and outside of the TFDS repository. See examples here.

How to add your dataset to TFDS

Adding a dataset to TFDS involves two steps:

  • Implement a Python class that provides a dataset builder with the specs of the data (e.g., the shape of the observations, actions, etc.) and how to read your dataset files.

  • Run a download_and_prepare pipeline that converts the data to the TFDS intermediate format.

You can add your dataset directly to TFDS following the instructions at https://www.tensorflow.org/datasets.

  • If your data has been generated with EnvLogger or the RLDS Creator, you can just use the rlds helpers in TFDS (see an example here).
  • Otherwise, make sure your generate_examples implementation provides the same structure and keys as RLDS loaders if you want your dataset to be compatible with RLDS pipelines (example).

Note that you can follow the same steps to add the data to your own repository (see more details in the TFDS documentation).
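
As a rough sketch of the second case above (data not produced by EnvLogger or the RLDS Creator), the generator below shows the shape of the examples a builder could yield. It is written as a standalone function so it runs without TFDS; in a real builder this logic would live in the `_generate_examples` method of a `tfds.core.GeneratorBasedBuilder` subclass. The input format (`raw_episodes` as lists of dictionaries) and the key naming `episode_<index>` are assumptions made for illustration.

```python
from typing import Any, Dict, Iterator, List, Tuple

STEP_KEYS = ("observation", "action", "reward", "discount",
             "is_first", "is_last", "is_terminal")


def generate_examples(
    raw_episodes: List[List[Dict[str, Any]]],
) -> Iterator[Tuple[str, Dict[str, Any]]]:
    """Yields (key, episode) pairs where each episode holds RLDS-style steps."""
    for index, raw_steps in enumerate(raw_episodes):
        steps = []
        for position, raw in enumerate(raw_steps):
            steps.append({
                "observation": raw["observation"],
                "action": raw["action"],
                "reward": raw["reward"],
                "discount": raw["discount"],
                "is_first": position == 0,
                "is_last": position == len(raw_steps) - 1,
                # Assumed default: a step is not terminal unless the raw data says so.
                "is_terminal": raw.get("is_terminal", False),
            })
        yield f"episode_{index}", {"steps": steps}


raw_episodes = [[
    {"observation": 0, "action": 1, "reward": 0.0, "discount": 1.0},
    {"observation": 1, "action": 0, "reward": 1.0, "discount": 0.0, "is_terminal": True},
]]
examples = list(generate_examples(raw_episodes))
print(examples[0][0], len(examples[0][1]["steps"]))
```

The builder's feature specification would need to declare the same keys and nesting for the data to load through RLDS pipelines.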

Performance best practices

As RLDS exposes RL datasets in the form of TensorFlow's tf.data, many of TensorFlow's performance hints apply to RLDS as well. It is important to note, however, that RLDS datasets are very specific and not all general speed-up methods work out of the box, so generic advice on improving performance might not produce the expected outcome. To get a better understanding of how to use RLDS datasets effectively, we recommend going through this colab.

Citation

If you use RLDS, please cite the RLDS paper as:

@misc{ramos2021rlds,
      title={RLDS: an Ecosystem to Generate, Share and Use Datasets in Reinforcement Learning},
      author={Sabela Ramos and Sertan Girgin and Léonard Hussenot and Damien Vincent and Hanna Yakubovich and Daniel Toyama and Anita Gergely and Piotr Stanczyk and Raphael Marinier and Jeremiah Harmsen and Olivier Pietquin and Nikola Momchev},
      year={2021},
      eprint={2111.02767},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

Acknowledgements

We greatly appreciate all the support from the TF-Agents team in setting up building and testing for EnvLogger.

Disclaimer

This is not an officially supported Google product.
