MazeRL is an application oriented Deep Reinforcement Learning (RL) framework

Last update: Dec 24, 2022

Overview

Applied Reinforcement Learning with Python

MazeRL is an application-oriented Deep Reinforcement Learning (RL) framework that addresses real-world decision problems. Our vision is to cover the complete development life cycle of RL applications, ranging from simulation engineering to agent development, training and deployment.

This is a preliminary, non-stable release of Maze. It is not yet complete and not all of our interfaces have settled yet. Hence, there might be some breaking changes on our way towards the first stable release.

Spotlight Features

Below we list a few selected Maze features.

Design and visualize your policy and value networks with the Perception Module. It is based on PyTorch and provides a large variety of neural network building blocks and model styles. Quickly compose powerful representation learners from building blocks such as dense, convolution, graph convolution and attention, recurrent architectures, action and observation masking, and self-attention.
Maze supports best practices such as pre-processing and normalizing your observations, so you can create the conditions for efficient RL training without writing boilerplate code.
Maze supports advanced environment structures reflecting the requirements of real-world industrial decision problems, such as multi-step and multi-agent scenarios. You can of course also work with existing Gym-compatible environments.
Use the provided Maze trainers (A2C, PPO, Impala, SAC, Evolution Strategies), which support dictionary action and observation spaces as well as multi-step training (auto-regressive policies). Or stick to your favorite tools and trainers by combining Maze with other RL frameworks.
Out-of-the-box support for advanced training workflows such as imitation learning from teacher policies and policy fine-tuning.
Keep even complex application and experiment configuration manageable with the Hydra Config System.
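
To make the observation normalization mentioned above more concrete, here is a minimal sketch in plain Python. It is not Maze's implementation: the class name `RunningNormalizer` and the observation key `"pos"` are hypothetical, and it assumes that every observation is a dictionary of equally shaped lists of floats containing the same keys.

```python
import math
from typing import Dict, List

Observation = Dict[str, List[float]]


class RunningNormalizer:
    """Hypothetical helper: per-key mean/std normalization of dict observations."""

    def __init__(self, eps: float = 1e-8) -> None:
        self.eps = eps  # avoids division by zero for constant features
        self.count = 0
        self.mean: Dict[str, List[float]] = {}
        self.m2: Dict[str, List[float]] = {}  # sum of squared deviations

    def update(self, obs: Observation) -> None:
        """Fold one observation into the statistics (Welford's online update)."""
        self.count += 1
        for key, values in obs.items():
            mean = self.mean.setdefault(key, [0.0] * len(values))
            m2 = self.m2.setdefault(key, [0.0] * len(values))
            for i, x in enumerate(values):
                delta = x - mean[i]
                mean[i] += delta / self.count
                m2[i] += delta * (x - mean[i])

    def normalize(self, obs: Observation) -> Observation:
        """Shift and scale each feature using the population standard deviation."""
        out: Observation = {}
        for key, values in obs.items():
            out[key] = [
                (x - self.mean[key][i])
                / (math.sqrt(self.m2[key][i] / self.count) + self.eps)
                for i, x in enumerate(values)
            ]
        return out


normalizer = RunningNormalizer()
for sample in ({"pos": [0.0, 10.0]}, {"pos": [2.0, 30.0]}):
    normalizer.update(sample)
normalized = normalizer.normalize({"pos": [2.0, 30.0]})
```

In practice the statistics would typically be estimated once from sampled observations and then kept fixed during training, so that the agent always sees inputs on the same scale.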

Get Started

Make sure PyTorch is installed and then get the latest released version of Maze as follows:
```
pip install -U maze-rl

# optionally install RLlib if you want to use it in combination with Maze
# (quoting the extra keeps shells such as zsh from expanding the brackets)
pip install "ray[rllib]" tensorflow
```
Read more about other options like the installation of the latest development version.
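
A quick way to confirm that the installation succeeded is to check whether the relevant modules can be found by the interpreter. The sketch below uses only the standard library; the helper name `missing_modules` is hypothetical, and the import names `torch` and `maze` are assumed to be the top-level modules provided by the PyTorch and maze-rl packages.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found on sys.path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed import names: "torch" for PyTorch, "maze" for maze-rl.
    missing = missing_modules(["torch", "maze"])
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("PyTorch and Maze are importable.")
```

Because `find_spec` does not import the module, the check is fast and has no side effects.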

⚡ We encourage you to start with Python 3.7, as many popular environments like Atari or Box2D cannot easily be installed in newer Python environments. Maze itself supports newer Python versions, but for Python 3.9 you might have to install additional binary dependencies manually.
To see Maze in action, check out a first example.
For a more applied introduction, visit the step by step tutorial.

Learn more about Maze

The documentation is the starting point to learn more about the underlying concepts, but most importantly also provides code snippets and minimum working examples to get you started quickly.

The Workflow section guides you through typical tasks in an RL project.
Policy and Value Networks introduces you to the Perception Module, shows how to customize action spaces and the underlying action probability distributions, and presents two styles of policy and value network construction:
- Template models are composed directly from an environment's observation and action space, allowing you to train with suitable agent networks on a new environment within minutes.
- Custom models give you the full flexibility of application-specific models, either with the provided Maze building blocks or directly with PyTorch.
Learn more about core concepts and structures such as the Maze environment hierarchy and the Maze event system, which provides a convenient way to collect statistics and KPIs, enables flexible reward formulation and supports offline analysis.
Structured Environments and Action Masking introduces you to a general concept that can greatly improve the performance of trained agents in practical RL problems.
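
The idea behind action masking can be sketched in a few lines of plain Python. This is a generic illustration, not Maze's implementation: the function names `masked_softmax` and `sample_action` are hypothetical, and the policy output is assumed to be a flat list of logits for a discrete action space, with a boolean mask marking the actions that are valid in the current state.

```python
import math
import random
from typing import List, Sequence


def masked_softmax(logits: Sequence[float], mask: Sequence[bool]) -> List[float]:
    """Softmax over the valid actions only; invalid actions get probability 0."""
    if not any(mask):
        raise ValueError("at least one action must be valid")
    # Subtract the largest valid logit for numerical stability.
    max_logit = max(l for l, valid in zip(logits, mask) if valid)
    exps = [math.exp(l - max_logit) if valid else 0.0 for l, valid in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(logits: Sequence[float], mask: Sequence[bool], rng: random.Random) -> int:
    """Sample an action index; masked-out actions are never selected."""
    probs = masked_softmax(logits, mask)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


rng = random.Random(0)
probs = masked_softmax([1.0, 2.0, 3.0], [True, False, True])
actions = [sample_action([1.0, 2.0, 3.0], [True, False, True], rng) for _ in range(100)]
```

Because invalid actions receive zero probability, the agent does not have to learn from experience which actions are impossible, which is one reason masking can speed up training.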

License

Maze is freely available for research and non-commercial use. A commercial license is also available; if you are interested, please contact us via our company website or by email.

We believe in Open Source principles and aim at transitioning Maze to a commercial Open Source project, releasing larger parts of the framework under a permissive license in the near future.

Comments

Configuration problems in the step-by-step tutorial
I've just been trying out maze and tried out the step-by-step tutorial.

In Step 5 (5. Training the MazeEnv) the instructions are incomplete or wrong.

I was able to get it running in the end, but it took (us) quite some time. I'm not sure if this is a bug in maze or hydra, or if some newer version of either library just changes the behavior a little bit. But you should update the documentation so that it works out of the box for new users of the library.

The setup (under Ubuntu 20.04):

>> mkdir maze5 && cd maze5
>> pyenv local 3.8.8
>> python -m venv .venv
>> source .venv/bin/activate
>> pip install maze-rl torch
>> pip list
Package                 Version
----------------------- -----------
hydra-core              1.1.0
hydra-nevergrad-sweeper 1.1.5
maze-rl                 0.1.7
torch                   1.9.0
...

Then I copy-pasted the files from the https://github.com/enlite-ai/maze-examples/tree/main/tutorial_maze_env/part03_maze_env repo and adjusted the _target_ paths in the config YAMLs (e.g. from _target_: tutorial_maze_env.part03_maze_env.env.maze_env.maze_env_factory to _target_: env.maze_env.maze_env_factory).

Problem 1:

When you run the suggested training command, Hydra will just complain that it can't find the configuration files.

>> maze-run -cn conf_train env=tutorial_cutting_2d_basic wrappers=tutorial_cutting_2d_basic \
     model=tutorial_cutting_2d_basic algorithm=ppo
In 'conf_train': Could not find 'model/tutorial_cutting_2d_basic'
Available options in 'model':
    flatten_concat
    flatten_concat_shared_embedding
    pixel_obs
    pixel_obs_rnn
    rllib
    vector_obs
    vector_obs_rnn
Config search path:
    provider=hydra, path=pkg://hydra.conf
    provider=main, path=pkg://maze.conf
    provider=schema, path=structured://

Fix:

You can just define the config directory for hydra with maze-run -cd conf -cn conf_train .... Then Hydra will find the 3 config files and load them correctly.

Problem 2:

After loading the config files, Hydra tries to load the modules defined in the _target_ fields. That fails immediately with:

...
  File "***/maze5-uWAZh5bh/lib/python3.8/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 104, in _resolve_target
    return _locate(target)
  File "***/maze5-uWAZh5bh/lib/python3.8/site-packages/hydra/_internal/utils.py", line 563, in _locate
    raise ImportError(f"Error loading module '{path}'") from e
ImportError: Error loading module 'env.maze_env.maze_env_factory'

Fix:

For some reason Hydra doesn't know the path to the directory from where we call maze-run. And therefore it doesn't find the env directory containing the maze_env file.

This is fixable by just setting the environment variable: export PYTHONPATH="$PYTHONPATH:$PWD/".
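
A likely explanation, offered here as an assumption rather than a confirmed diagnosis: when Python runs an installed console script such as maze-run, the first entry of sys.path is the directory containing that script, not the current working directory, so a local package is only importable if its parent directory is added explicitly (for example via PYTHONPATH). The sketch below reproduces the effect with the standard library only. The `locate` function is a simplified stand-in for resolving a dotted `_target_` string and is not Hydra's actual code; the package name `demo_env_pkg` is hypothetical.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def locate(target: str):
    """Resolve a dotted path such as 'pkg.module.attr' to a Python object."""
    module_path, _, attr = target.rpartition(".")
    module = importlib.import_module(module_path)
    return getattr(module, attr)


def demo():
    """Show that a local package resolves only once its parent dir is on sys.path."""
    target = "demo_env_pkg.maze_env.maze_env_factory"  # hypothetical names
    with tempfile.TemporaryDirectory() as project_dir:
        pkg = Path(project_dir) / "demo_env_pkg"
        pkg.mkdir()
        (pkg / "__init__.py").write_text("")
        (pkg / "maze_env.py").write_text(
            "def maze_env_factory():\n    return 'env created'\n"
        )

        # 1) The project directory is not on sys.path: resolution fails.
        try:
            locate(target)
            failed_before = False
        except ImportError:
            failed_before = True

        # 2) Equivalent of extending PYTHONPATH with the project directory.
        sys.path.append(project_dir)
        importlib.invalidate_caches()
        try:
            result = locate(target)()
        finally:
            # Clean up so the demo leaves the interpreter state untouched.
            sys.path.remove(project_dir)
            for name in [m for m in sys.modules if m.startswith("demo_env_pkg")]:
                del sys.modules[name]
    return failed_before, result


failed_before, result = demo()
```

An alternative to setting PYTHONPATH is to install the project into the virtual environment (for example with an editable install), which makes the package importable regardless of the working directory.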
bug documentation
opened by jakobkogler 2
Hello from Hydra :)

Thanks for using Hydra! I see that you are using Hydra 1.1 already which is great. One thing that is really recent is the ability to configure the config searchpath from the primary config. You can learn about it here.

This can probably eliminate the need of your users to even know what a ConfigSearchpathPlugin is.

Feel free to jump into the Hydra chat if you have any questions.

opened by omry 2
Version 0.1.7
Adds Soft Actor-Critic (SAC) Trainer (supporting Dictionary Observations and Actions)

Simplifies the reward aggregation interface (now also supports multi-agent training)

Extends PPO and A2C to multi-agent capable actor-critic trainers (individual agents vs. centralized critic)

Adds option for custom rollout evaluators

Adds option for shared weights in actor-critic settings

Adds experiment and multi-run support for RunContext Python API
opened by enliteai 0
Version 0.1.6
Changes

made Maze compatible with RLlib 1.4

updated to the recently released hydra 1.1.0

Simplified API (RunContext): Experiment and evaluation support

Fixed support of the nevergrad sweeper: made the LocalLauncher hydra plugin part of the wheel

Replaced the (policy id, actor id) tuple with an ActorID class

Other

various documentation improvements

added ready-to-go Docker containers

contribution guidelines, pull request templates etc. on GitHub
opened by md-enlite 0
Version 0.1.5
Features:

Adds documentation for run_context

Changes of simulated environment interfaces step_without_observation -> fast_step

Adds seeding to environments, models and trainers

Initial commit of the Maze Python API

Adds an ExportGifWrapper

Adds network architecture visualizations to Tensorboard Images

adds incremental min/max stats

adds categorical (support-based) value networks

added value transformations
opened by md-enlite 0
Towards Version 0.1.5
Adds seeding to environments, models and trainers

Initial commit of the Maze Python API

Adds an ExportGifWrapper

Adds network architecture visualizations to Tensorboard Images
opened by md-enlite 0
Release Version 0.1.4
improved docs

switch to RLlib version 1.3.0.

full structured env support

policy interface now selects policy based on actor_id

added testing dependencies to main package
opened by enliteai 0
Dev
adds PointNetFeatureBlock to perception module

adds Tensorboard hyperparameter visualization for hydra multiruns

merges parallel and sequential dataset into a single InMemoryDataset
opened by md-enlite 0
Version 0.1.3
Improvements:

Enable event collection from within the Wrapper stack

Aligned StepSkipWrapper with the event system

MonitoringWrapper: Logging of observations, actions and rewards throughout the wrapper stack, useful for diagnosis

Make _recursive_ in Hydra config files compatible with Maze object instantiation
opened by enliteai 0
Version 0.1.2
Features:

Imitation Learning:

Added Evaluation Rollouts

Unified dataset structures (InMemoryDataset)

GlobalPoolingBlock: now supports sum and max pooling

ObservationNormalizationWrapper: Adds observation and observation distribution visualization to Tensorboard logging.

Distribution: Introduced VectorEnv, refactored the single and multi process parallelization wrappers.
opened by enliteai 0
Dev
Features:

hyper parameter optimization via grid search and Nevergrad

plain python training example

local hydra job launcher

extend attention/transformer perception blocks

Fixes:

cumulative stats logging
opened by md-enlite 0

Releases(v0.2.0)

v0.2.0(Nov 21, 2022)
New graph neural network building blocks (message passing based on torch-scatter in addition to existing graph convolutions)

Support for action recording, replay from pre-computed action records and feature collection.

Improved wrapper hierarchy semantics: Previously values were assigned to the outermost wrapper. Now values are assigned to existing attributes by traversing the wrapper hierarchy.
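
One plausible reading of this change can be sketched in plain Python. This is an illustration of the described semantics under stated assumptions, not Maze's actual wrapper code: the classes `Env` and `Wrapper` and the attribute names `difficulty` and `new_flag` are hypothetical.

```python
class Env:
    """Hypothetical innermost environment holding one attribute."""

    def __init__(self) -> None:
        self.difficulty = 1


class Wrapper:
    """Hypothetical wrapper that forwards attribute reads and writes inward."""

    def __init__(self, env) -> None:
        # Store the wrapped env without going through our own __setattr__.
        object.__setattr__(self, "env", env)

    def __getattr__(self, name):
        # Called only when normal lookup fails: delegate to the wrapped env.
        if name == "env":
            raise AttributeError(name)
        return getattr(self.env, name)

    def __setattr__(self, name, value) -> None:
        if name in self.__dict__ or hasattr(type(self), name):
            # The attribute already lives on this wrapper.
            object.__setattr__(self, name, value)
        elif hasattr(self.env, name):
            # The attribute exists further down: assign it where it lives.
            setattr(self.env, name, value)
        else:
            # Unknown attribute: fall back to creating it on this wrapper.
            object.__setattr__(self, name, value)


env = Env()
wrapped = Wrapper(Wrapper(env))
wrapped.difficulty = 5   # travels down to the innermost env
wrapped.new_flag = True  # does not exist anywhere, stays on the outer wrapper
```

With the previous behavior described above, `wrapped.difficulty = 5` would have created a new attribute on the outermost wrapper and left the inner environment's value unchanged.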

Removal of deprecated modules (APIContext and Maze models for RLlib)

Reflecting changes in upstream dependencies (Gym version pinned to <0.23)

v0.1.8(Dec 13, 2021)
New Features

Agent Deployment Workflow

Soft Actor Critic from Demonstrations (SACfD)

Locally Distributed ES Runner

SpacesRecordingWrapper: Records and dumps processed trajectories to pickle files

Fixes event logging for environment resets and policy events

v0.1.7(Jun 24, 2021)
Adds Soft Actor-Critic (SAC) Trainer (supporting Dictionary Observations and Actions)

Simplifies the reward aggregation interface (now also supports multi-agent training)

Extends PPO and A2C to multi-agent capable actor-critic trainers (individual agents vs. centralized critic)

Adds option for custom rollout evaluators

Adds option for shared weights in actor-critic settings

Adds experiment and multi-run support for RunContext Python API

Compatibility with PyTorch 1.9

v0.1.6(Jun 14, 2021)
Changes

made Maze compatible with RLlib 1.4

updated to the recently released hydra 1.1.0

Simplified API (RunContext): Experiment and evaluation support

Fixed support of the nevergrad sweeper: made the LocalLauncher hydra plugin part of the wheel

Replaced the (policy id, actor id) tuple with an ActorID class

Other

various documentation improvements

added ready-to-go Docker containers

contribution guidelines, pull request templates etc. on GitHub

v0.1.5(May 20, 2021)
Features:

adds RunContext (Maze Python API)

adds seeding to environments, models and trainers

changes of simulated environment interfaces step_without_observation -> fast_step

Improvements:

adds an ExportGifWrapper

adds network architecture visualizations to Tensorboard Images

adds incremental min/max stats

adds categorical (support-based) value networks

adds value transformations

v0.1.4(Apr 29, 2021)
switch to RLlib version 1.3.0.

full structured env support

policy interface now selects policy based on actor_id

interfaces support collaborative multi-agent actor critic

improved docs

added testing dependencies to main package

v0.1.3(Apr 1, 2021)
Improvements:

Enable event collection from within the Wrapper stack

Aligned StepSkipWrapper with the event system

MonitoringWrapper: Logging of observations, actions and rewards throughout the wrapper stack, useful for diagnosis

Make _recursive_ in Hydra config files compatible with Maze object instantiation

v0.1.2(Mar 25, 2021)
Features:

Imitation Learning:

Added Evaluation Rollouts

Unified dataset structures (InMemoryDataset)

GlobalPoolingBlock: now supports sum and max pooling

ObservationNormalizationWrapper: Adds observation and observation distribution visualization to Tensorboard logging.

Distribution: Introduced VectorEnv, refactored the single and multi process parallelization wrappers.

v0.1.1(Mar 18, 2021)
Features:

hyper parameter optimization via grid search and Nevergrad

plain python training example

local hydra job launcher

extend attention/transformer perception blocks

adds MazeEnvMonitoringWrapper as a default to wrapper stacks

Fixes:

cumulative stats logging

v0.1.0(Mar 11, 2021)
Documentation updates:

Integrating existing Gym environments

Factory documentation

Experiments workflow, ...

Updated to Hydra 1.1.0:

Using Hydra.instantiate instead of custom registry implementation

Added Rollout evaluator

Owner

EnliteAI GmbH

enliteAI is a machine learning company, developing the Reinforcement Learning framework Maze.

GitHub Repository https://maze-rl.readthedocs.io/
