MazeRL is an application oriented Deep Reinforcement Learning (RL) framework

Overview

Banner

Applied Reinforcement Learning with Python

MazeRL is an application oriented Deep Reinforcement Learning (RL) framework, addressing real-world decision problems. Our vision is to cover the complete development life cycle of RL applications ranging from simulation engineering up to agent development, training and deployment.

This is a preliminary, non-stable release of Maze. It is not yet complete and not all of our interfaces have settled yet. Hence, there might be some breaking changes on our way towards the first stable release.

Spotlight Features

Below we list a few selected Maze features.

  • Design and visualize your policy and value networks with the Perception Module. It is based on PyTorch and provides a large variety of neural network building blocks and model styles. Quickly compose powerful representation learners from building blocks such as: dense, convolution, graph convolution and attention, recurrent architectures, action- and observation masking, self-attention etc.
  • Create the conditions for efficient RL training without writing boiler plate code, e.g. by supporting best practices like pre-processing and normalizing your observations.
  • Maze supports advanced environment structures reflecting the requirements of real-world industrial decision problems such as multi-step and multi-agent scenarios. You can of course work with existing Gym-compatible environments.
  • Use the provided Maze trainers (A2C, PPO, Impala, SAC, Evolution Strategies), which are supporting dictionary action and observation spaces as well as multi-step (auto-regressive policies) training. Or stick to your favorite tools and trainers by combining Maze with other RL frameworks.
  • Out of the box support for advanced training workflows such as imitation learning from teacher policies and policy fine-tuning.
  • Keep even complex application and experiment configuration manageable with the Hydra Config System.

Get Started

  • Make sure PyTorch is installed and then get the latest released version of Maze as follows

    pip install -U maze-rl
    
    # optionally install RLLib if you want to use it in combination with Maze
    pip install ray[rllib] tensorflow  
    

    Read more about other options like the installation of the latest development version.

    We encourage you to start with Python 3.7, as many popular environments like Atari or Box2D can not easily be installed in newer Python environments. Maze itself supports newer Python versions, but for Python 3.9 you might have to install additional binary dependencies manually

  • To see Maze in action check out a first example.

  • For a more applied introduction visit the step by step tutorial.

Pip
Installation
First Example
First Example
Tutorial
Step by Step Tutorial
Documentation
Documentation

Learn more about Maze

The documentation is the starting point to learn more about the underlying concepts, but most importantly also provides code snippets and minimum working examples to get you started quickly.

License

Maze is freely available for research and non-commercial use. A commercial license is available, if interested please contact us on our company website or write us an email.

We believe in Open Source principles and aim at transitioning Maze to a commercial Open Source project, releasing larger parts of the framework under a permissive license in the near future.

Comments
  • Configuration problems in the step-by-step tutorial

    Configuration problems in the step-by-step tutorial

    I've just been trying out maze and tried out the step-by-step tutorial.

    In Step 5 (5. Training the MazeEnv) the instructions are incomplete or wrong.

    I was able to get it running in the end, but it took (us) quite some time. I'm not sure if this is a bug in maze or hydra, of if just some newer version of either library changes the behavior a little bit. But you should update the documentation such that it works out of the box for new users of the library.


    The setup (under Ubuntu 2020.04):

    >> mkdir maze5 && cd maze5
    >> pyenv local 3.8.8
    >> python -m venv .venv
    >> source .venv/bin/activate
    >> pip install maze-rl torch
    >> pip list
    Package                 Version
    ----------------------- -----------
    hydra-core              1.1.0
    hydra-nevergrad-sweeper 1.1.5
    maze-rl                 0.1.7
    torch                   1.9.0
    ...
    

    Then just copy-pasted the files from the https://github.com/enlite-ai/maze-examples/tree/main/tutorial_maze_env/part03_maze_env repo and adjusted the _target paths in the config yamls (e.g. from _target_: tutorial_maze_env.part03_maze_env.env.maze_env.maze_env_factory to _target_: env.maze_env.maze_env_factory).

    Problem 1:

    When you run the suggested training command, Hydra will just complain that it can't find the configuration files.

    >> maze-run -cn conf_train env=tutorial_cutting_2d_basic wrappers=tutorial_cutting_2d_basic \
        model=tutorial_cutting_2d_basic algorithm=ppo
    In 'conf_train': Could not find 'model/tutorial_cutting_2d_basic'
    
    Available options in 'model':
            flatten_concat
            flatten_concat_shared_embedding
            pixel_obs
            pixel_obs_rnn
            rllib
            vector_obs
            vector_obs_rnn
    Config search path:
            provider=hydra, path=pkg://hydra.conf
            provider=main, path=pkg://maze.conf
            provider=schema, path=structured://
    

    Fix:

    You can just define the config directory for hydra with maze-run -cd conf -cn conf_train .... Then Hydra will find the 3 config files and load them correctly.

    Problem 2:

    After loading the config files, hydra tries to load the modules defined in the _target fields. And that fails immediatly with:

      ...
      File "***/maze5-uWAZh5bh/lib/python3.8/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 104, in _resolve_target
        return _locate(target)
      File "***/maze5-uWAZh5bh/lib/python3.8/site-packages/hydra/_internal/utils.py", line 563, in _locate
        raise ImportError(f"Error loading module '{path}'") from e
    
    ImportError: Error loading module 'env.maze_env.maze_env_factory'
    

    Fix:

    For some reason Hydra doesn't know the path to the directory from where we call maze-run. And therefore it doesn't find the env directory containing the maze_env file.

    This is fixable by just setting the environment variable: export PYTHONPATH="$PYTHONPATH:$PWD/".

    bug documentation 
    opened by jakobkogler 2
  • Hello from Hydra :)

    Hello from Hydra :)

    Thanks for using Hydra! I see that you are using Hydra 1.1 already which is great. One thing that is really recent is the ability to configure the config searchpath from the primary config. You can learn about it here.

    This can probably eliminate the need of your users to even know what a ConfigSearchpathPlugin is.

    Feel free to jump into the Hydra chat if you have any questions.

    opened by omry 2
  • Version 0.1.7

    Version 0.1.7

    • Adds Soft Actor-Critic (SAC) Trainer (supporting Dictionary Observations and Actions)
    • Simplifies the reward aggregation interface (now also supports multi-agent training)
    • Extends PPO and A2C to multi-agent capable actor-critic trainers (individual agents vs. centralized critic)
    • Adds option for custom rollout evaluators
    • Adds option for shared weights in actor-critic settings
    • Adds experiment and multi-run support for RunContext Python API
    opened by enliteai 0
  • Version 0.1.6

    Version 0.1.6

    Changes

    • made Maze compatible to Rllib 1.4
    • updated to the recently released hydra 1.1.0
    • Simpified API (RunContext): Experiment and evaluation support
    • Fixed support of the nevergrad sweeper: made the LocalLauncher hydra plugin part of the wheel
    • Replaced the (policy id, actor id) tuple with an ActorID class

    Other

    • various documentation improvements
    • added ready-to-go Docker containers
    • contribution guidelines, pull request templates etc. on GitHub
    opened by md-enlite 0
  • Version 0.1.5

    Version 0.1.5

    Features:

    • Adds documentation for run_context
    • Changes of simulated environment interfaces step_without_observation -> fast_step
    • Adds seeding to environments, models and trainers
    • Initial commit of the Maze Python API
    • Adds an ExportGifWrapper
    • Adds network architecture visualizations to Tensorboard Images
    • adds incremental min/max stats
    • adds categorical (support-based) value networks
    • added value transformations
    opened by md-enlite 0
  • Towards Version 0.1.5

    Towards Version 0.1.5

    • Adds seeding to environments, models and trainers
    • Initial commit of the Maze Python API
    • Adds an ExportGifWrapper
    • Adds network architecture visualizations to Tensorboard Images
    opened by md-enlite 0
  • Release Version 0.1.4

    Release Version 0.1.4

    • improved docs
    • switch to RLlib version 1.3.0.
    • full structured env support
      • policy interface now selects policy based on actor_id
    • added testing dependencies to main package
    opened by enliteai 0
  • Dev

    Dev

    • adds PointNetFeatureBlock to perception module
    • adds Tensorboard hyper paramter visualization for hydra multiruns
    • merges parallel and sequential dataset into a single InMemoryDataset
    opened by md-enlite 0
  • Version 0.1.3

    Version 0.1.3

    Improvements:

    • Enable event collection from within the Wrapper stack
    • Aligned StepSkipWrapper with the event system
    • MonitoringWrapper: Logging of observations, actions and rewards throughout the wrapper stack, useful for diagnosis
    • Make _recursive_ in Hydra config files compatible with Maze object instantiation
    opened by enliteai 0
  • Version 0.1.2

    Version 0.1.2

    Features:

    • Imitation Learning:
      • Added Evaluation Rollouts
      • Unified dataset structures (InMemoryDataset)
    • GlobalPoolingBlock: now supports sum and max pooling
    • ObservationNormalizationWrapper: Adds observation and observation distribution visualization to Tensorboard logging.
    • Distribution: Introduced VectorEnv, refactored the single and multi process parallelization wrappers.
    opened by enliteai 0
  • Dev

    Dev

    Features:

    • hyper parameter optimization via grid search and Nevergrad
    • plain python training example
    • local hydra job launcher
    • extend attention/transformer perception blocks

    Fixes:

    • cumulative stats logging
    opened by md-enlite 0
Releases(v0.2.0)
  • v0.2.0(Nov 21, 2022)

    • New graph neural network building blocks (message passing based on torch-scatter in addition to existing graph convolutions)
    • Support for action recording, replay from pre-computed action records and feature collection.
    • Improved wrapper hierarchy semantics: Previously values were assigned to the outermost wrapper. Now values are assigned to existing attributes by traversing the wrapper hierarchy.
    • Removal of deprecated modules (APIContext and Maze models for RLlib)
    • Reflecting changes in upstream dependencies (Gym version pinned to <0.23)
    Source code(tar.gz)
    Source code(zip)
  • v0.1.8(Dec 13, 2021)

  • v0.1.7(Jun 24, 2021)

    • Adds Soft Actor-Critic (SAC) Trainer (supporting Dictionary Observations and Actions)
    • Simplifies the reward aggregation interface (now also supports multi-agent training)
    • Extends PPO and A2C to multi-agent capable actor-critic trainers (individual agents vs. centralized critic)
    • Adds option for custom rollout evaluators
    • Adds option for shared weights in actor-critic settings
    • Adds experiment and multi-run support for RunContext Python API
    • Compatibility with PyTorch 1.9
    Source code(tar.gz)
    Source code(zip)
  • v0.1.6(Jun 14, 2021)

    Changes

    • made Maze compatible to Rllib 1.4
    • updated to the recently released hydra 1.1.0
    • Simplified API (RunContext): Experiment and evaluation support
    • Fixed support of the nevergrad sweeper: made the LocalLauncher hydra plugin part of the wheel
    • Replaced the (policy id, actor id) tuple with an ActorID class

    Other

    • various documentation improvements
    • added ready-to-go Docker containers
    • contribution guidelines, pull request templates etc. on GitHub
    Source code(tar.gz)
    Source code(zip)
  • v0.1.5(May 20, 2021)

    Features:

    • adds RunContext (Maze Python API)
    • adds seeding to environments, models and trainers
    • changes of simulated environment interfaces step_without_observation -> fast_step

    Improvements:

    • adds an ExportGifWrapper
    • adds network architecture visualizations to Tensorboard Images
    • adds incremental min/max stats
    • adds categorical (support-based) value networks
    • adds value transformations
    Source code(tar.gz)
    Source code(zip)
  • v0.1.4(Apr 29, 2021)

    • switch to RLlib version 1.3.0.
    • full structured env support
      • policy interface now selects policy based on actor_id
      • interfaces support collaborative multi-agent actor critic
    • improved docs
    • added testing dependencies to main package
    Source code(tar.gz)
    Source code(zip)
  • v0.1.3(Apr 1, 2021)

    Improvements:

    • Enable event collection from within the Wrapper stack
    • Aligned StepSkipWrapper with the event system
    • MonitoringWrapper: Logging of observations, actions and rewards throughout the wrapper stack, useful for diagnosis
    • Make _recursive_ in Hydra config files compatible with Maze object instantiation
    Source code(tar.gz)
    Source code(zip)
  • v0.1.2(Mar 25, 2021)

    Features:

    • Imitation Learning:
      • Added Evaluation Rollouts
      • Unified dataset structures (InMemoryDataset)
    • GlobalPoolingBlock: now supports sum and max pooling
    • ObservationNormalizationWrapper: Adds observation and observation distribution visualization to Tensorboard logging.
    • Distribution: Introduced VectorEnv, refactored the single and multi process parallelization wrappers.
    Source code(tar.gz)
    Source code(zip)
  • v0.1.1(Mar 18, 2021)

    Features:

    • hyper parameter optimization via grid search and Nevergrad
    • plain python training example
    • local hydra job launcher
    • extend attention/transformer perception blocks
    • adds MazeEnvMonitoringWrapper as a default to wrapper stacks

    Fixes:

    • cumulative stats logging
    Source code(tar.gz)
    Source code(zip)
  • v0.1.0(Mar 11, 2021)

    Documentation updates:

    • Integrating existing Gym environments
    • Factory documentation
    • Experiments workflow, ...

    Updated to Hydra 1.1.0:

    • Using Hydra.instantiate instead of custom registry implementation

    Added Rollout evaluator

    Source code(tar.gz)
    Source code(zip)
Owner
EnliteAI GmbH
enliteAI is a machine learning company, developing the Reinforcement Learning framework Maze.
EnliteAI GmbH
Code for ACM MM 2020 paper "NOH-NMS: Improving Pedestrian Detection by Nearby Objects Hallucination"

NOH-NMS: Improving Pedestrian Detection by Nearby Objects Hallucination The offical implementation for the "NOH-NMS: Improving Pedestrian Detection by

Tencent YouTu Research 64 Nov 11, 2022
Distilling Motion Planner Augmented Policies into Visual Control Policies for Robot Manipulation (CoRL 2021)

Distilling Motion Planner Augmented Policies into Visual Control Policies for Robot Manipulation [Project website] [Paper] This project is a PyTorch i

Cognitive Learning for Vision and Robotics (CLVR) lab @ USC 6 Feb 28, 2022
FluidNet re-written with ATen tensor lib

fluidnet_cxx: Accelerating Fluid Simulation with Convolutional Neural Networks. A PyTorch/ATen Implementation. This repository is based on the paper,

JoliBrain 50 Jun 07, 2022
SpeechNAS Better Trade off between Latency and Accuracy for Large Scale Speaker Verification

SpeechNAS Better Trade off between Latency and Accuracy for Large Scale Speaker Verification

Wentao Zhu 24 May 20, 2022
Self-Supervised Collision Handling via Generative 3D Garment Models for Virtual Try-On

Self-Supervised Collision Handling via Generative 3D Garment Models for Virtual Try-On [Project website] [Dataset] [Video] Abstract We propose a new g

71 Dec 24, 2022
SuMa++: Efficient LiDAR-based Semantic SLAM (Chen et al IROS 2019)

SuMa++: Efficient LiDAR-based Semantic SLAM This repository contains the implementation of SuMa++, which generates semantic maps only using three-dime

Photogrammetry & Robotics Bonn 701 Dec 30, 2022
GPU Programming with Julia - course at the Swiss National Supercomputing Centre (CSCS), ETH Zurich

Course Description The programming language Julia is being more and more adopted in High Performance Computing (HPC) due to its unique way to combine

Samuel Omlin 192 Jan 03, 2023
Fast and accurate optimisation for registration with little learningconvexadam

convexAdam Learn2Reg 2021 Submission Fast and accurate optimisation for registration with little learning Excellent results on Learn2Reg 2021 challeng

17 Dec 06, 2022
Rethinking the Importance of Implementation Tricks in Multi-Agent Reinforcement Learning

RIIT Our open-source code for RIIT: Rethinking the Importance of Implementation Tricks in Multi-AgentReinforcement Learning. We implement and standard

405 Jan 06, 2023
[TIP2020] Adaptive Graph Representation Learning for Video Person Re-identification

Introduction This is the PyTorch implementation for Adaptive Graph Representation Learning for Video Person Re-identification. Get started git clone h

WuYiming 41 Dec 12, 2022
Self-Guided Contrastive Learning for BERT Sentence Representations

Self-Guided Contrastive Learning for BERT Sentence Representations This repository is dedicated for releasing the implementation of the models utilize

Taeuk Kim 16 Dec 04, 2022
Byte-based multilingual transformer TTS for low-resource/few-shot language adaptation.

One model to speak them all 🌎 Audio Language Text ▷ Chinese 人人生而自由,在尊严和权利上一律平等。 ▷ English All human beings are born free and equal in dignity and rig

Mutian He 60 Nov 14, 2022
Like Dirt-Samples, but cleaned up

Clean-Samples Like Dirt-Samples, but cleaned up, with clear provenance and license info (generally a permissive creative commons licence but check the

TidalCycles 39 Nov 30, 2022
PyTorch implementation of "Contrast to Divide: self-supervised pre-training for learning with noisy labels"

Contrast to Divide: self-supervised pre-training for learning with noisy labels This is an official implementation of "Contrast to Divide: self-superv

55 Nov 23, 2022
smc.covid is an R package related to the paper A sequential Monte Carlo approach to estimate a time varying reproduction number in infectious disease models: the COVID-19 case by Storvik et al

smc.covid smc.covid is an R package related to the paper A sequential Monte Carlo approach to estimate a time varying reproduction number in infectiou

0 Oct 15, 2021
Disease Informed Neural Networks (DINNs) — neural networks capable of learning how diseases spread, forecasting their progression, and finding their unique parameters (e.g. death rate).

DINN We introduce Disease Informed Neural Networks (DINNs) — neural networks capable of learning how diseases spread, forecasting their progression, a

19 Dec 10, 2022
A bare-bones Python library for quality diversity optimization.

pyribs Website Source PyPI Conda CI/CD Docs Docs Status Twitter pyribs.org GitHub docs.pyribs.org A bare-bones Python library for quality diversity op

ICAROS 127 Jan 06, 2023
A library for efficient similarity search and clustering of dense vectors.

Faiss Faiss is a library for efficient similarity search and clustering of dense vectors. It contains algorithms that search in sets of vectors of any

Meta Research 18.8k Jan 08, 2023
This repo is customed for VisDrone.

Object Detection for VisDrone(无人机航拍图像目标检测) My environment 1、Windows10 (Linux available) 2、tensorflow = 1.12.0 3、python3.6 (anaconda) 4、cv2 5、ensemble

53 Jul 17, 2022
Negative Sample Matters: A Renaissance of Metric Learning for Temporal Grounding

2D-TAN (Optimized) Introduction This is an optimized re-implementation repository for AAAI'2020 paper: Learning 2D Temporal Localization Networks for

Joya Chen 112 Dec 31, 2022