Adaptable tools to make reinforcement learning and evolutionary computation algorithms.

Overview

pipeline status codecov codestyle

Pearl

The Parallel Evolutionary and Reinforcement Learning Library (Pearl) is a pytorch based package with the goal of being excellent for rapid prototyping of new adaptive decision making algorithms in the intersection between reinforcement learning (RL) and evolutionary computation (EC). As such, this is not intended to provide template pre-built algorithms as a baseline, but rather flexible tools to allow the user to quickly build and test their own implementations and ideas. A technical report can be found here.

Main Features

Features Pearl
RL algorithms (e.g. Actor Critic) ✔️
EC algorithms (e.g. Genetic Algorithm) ✔️
Hybrid algorithms (e.g. CEM-DDPG) ✔️
Multi-agent suppport ✔️
Tensorboard integration ✔️
Modular and extensible components ✔️
Opinionated module settings ✔️
Custom callbacks ✔️

User Guide

Installation

There are two options to install this package:

  1. pip install pearll
  2. git clone [email protected]:LondonNode/Pearl.git

Module Guide

  • agents: implementations of RL and EC agents where the other modular components are put together
  • buffers: these handle storing and sampling of trajectories
  • callbacks: inject logic for every step made in an environment (e.g. save model, early stopping)
  • common: common methods applicable to all other modules (e.g. enumerations) and a main utils.py file with some useful general logic
  • explorers: action explorers for enhanced exploration by adding noise to actions and random exploration for first n steps
  • models: neural network structures which are structured as encoder -> torso -> head
  • signal_processing: signal processing logic for extra modularity (e.g. TD returns, GAE)
  • updaters: update neural networks and adaptive/iterative algorithms
  • settings.py: settings objects for the above components, can be extended for custom components

Agent Templates

See pearll/agents/templates.py for the templates to create your own agents! For more examples, see specific agent implementations under pearll/agents.

Agent Performance

To see training performance, use the command tensorboard --logdir runs or tensorboard --logdir <tensorboard_log_path> defined in your algorithm class initialization.

Python Scripts

To run these you'll need to go to wherever the library is installed, cd pearll.

  • demo.py: script to run very basic demos of agents with pre-defined hyperparameters, run python3 -m pearll.demo -h for more info
  • plot.py: script to plot more complex plots that can't be obtained via Tensorboard (e.g. multiple subplots), run python3 -m pearll.plot -h for more info

Developer Guide

Scripts

Linux

  1. scripts/setup_dev.sh: setup your virtual environment
  2. scripts/run_tests.sh: run tests

Windows

  1. scripts/windows_setup_dev.bat: setup your virtual environment
  2. scripts/windows_run_tests.bat: run tests

Dependency Management

Pearl uses poetry for dependency management and build release instead of pip. As a quick guide:

  1. Run poetry add [package] to add more package dependencies.
  2. Poetry automatically handles the virtual environment used, check pyproject.toml for specifics on the virtual environment setup.
  3. If you want to run something in the poetry virtual environment, add poetry run as a prefix to the command you want to execute. For example, to run a python file: poetry run python3 script.py.

Credit

Citing Pearl

@misc{tangri2022pearl,
      title={Pearl: Parallel Evolutionary and Reinforcement Learning Library}, 
      author={Rohan Tangri and Danilo P. Mandic and Anthony G. Constantinides},
      year={2022},
      eprint={2201.09568},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

Acknowledgements

Pearl was inspired by Stable Baselines 3 and Tonic

You might also like...
BESS: Balanced Evolutionary Semi-Stacking for Disease Detection via Partially Labeled Imbalanced Tongue Data

Balanced-Evolutionary-Semi-Stacking Code for the paper ''BESS: Balanced Evolutionary Semi-Stacking for Disease Detection via Partially Labeled Imbalan

Systemic Evolutionary Chemical Space Exploration for Drug Discovery
Systemic Evolutionary Chemical Space Exploration for Drug Discovery

SECSE SECSE: Systemic Evolutionary Chemical Space Explorer Chemical space exploration is a major task of the hit-finding process during the pursuit of

Deep learning with dynamic computation graphs in TensorFlow
Deep learning with dynamic computation graphs in TensorFlow

TensorFlow Fold TensorFlow Fold is a library for creating TensorFlow models that consume structured data, where the structure of the computation graph

A toolkit for developing and comparing reinforcement learning algorithms.

Status: Maintenance (expect bug fixes and minor updates) OpenAI Gym OpenAI Gym is a toolkit for developing and comparing reinforcement learning algori

PyTorch implementations of deep reinforcement learning algorithms and environments
PyTorch implementations of deep reinforcement learning algorithms and environments

Deep Reinforcement Learning Algorithms with PyTorch This repository contains PyTorch implementations of deep reinforcement learning algorithms and env

Pytorch implementations of popular off-policy multi-agent reinforcement learning algorithms, including QMix, VDN, MADDPG, and MATD3.

Off-Policy Multi-Agent Reinforcement Learning (MARL) Algorithms This repository contains implementations of various off-policy multi-agent reinforceme

Reinforcement learning framework and algorithms implemented in PyTorch.

Reinforcement learning framework and algorithms implemented in PyTorch.

Independent and minimal implementations of some reinforcement learning algorithms using PyTorch (including PPO, A3C, A2C, ...).

PyTorch RL Minimal Implementations There are implementations of some reinforcement learning algorithms, whose characteristics are as follow: Less pack

PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.

PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.

Comments
  • Bump pillow from 9.0.0 to 9.0.1

    Bump pillow from 9.0.0 to 9.0.1

    Bumps pillow from 9.0.0 to 9.0.1.

    Release notes

    Sourced from pillow's releases.

    9.0.1

    https://pillow.readthedocs.io/en/stable/releasenotes/9.0.1.html

    Changes

    • In show_file, use os.remove to remove temporary images. CVE-2022-24303 #6010 [@​radarhere, @​hugovk]
    • Restrict builtins within lambdas for ImageMath.eval. CVE-2022-22817 #6009 [radarhere]
    Changelog

    Sourced from pillow's changelog.

    9.0.1 (2022-02-03)

    • In show_file, use os.remove to remove temporary images. CVE-2022-24303 #6010 [radarhere, hugovk]

    • Restrict builtins within lambdas for ImageMath.eval. CVE-2022-22817 #6009 [radarhere]

    Commits
    • 6deac9e 9.0.1 version bump
    • c04d812 Update CHANGES.rst [ci skip]
    • 4fabec3 Added release notes for 9.0.1
    • 02affaa Added delay after opening image with xdg-open
    • ca0b585 Updated formatting
    • 427221e In show_file, use os.remove to remove temporary images
    • c930be0 Restrict builtins within lambdas for ImageMath.eval
    • 75b69dd Dont need to pin for GHA
    • cd938a7 Autolink CWE numbers with sphinx-issues
    • 2e9c461 Add CVE IDs
    • See full diff in compare view

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 1
  • Feature/hybrid

    Feature/hybrid

    Overhaul models and base agent structure to accommodate RL, MARL, EC in optimizing static functions and RL environments and hybrid algorithms combining RL and EC.

    opened by 09tangriro 1
  • MORE AGENTS

    MORE AGENTS

    The more agents created the better proof that the tools underlying work as intended.

    Agents should be tested on particular environments to ensure performance.

    feature good first issue 
    opened by 09tangriro 0
Releases(v0.4.1)
A practical ML pipeline for data labeling with experiment tracking using DVC.

Auto Label Pipeline A practical ML pipeline for data labeling with experiment tracking using DVC Goals: Demonstrate reproducible ML Use DVC to build a

Todd Cook 4 Mar 08, 2022
Lua-parser-lark - An out-of-box Lua parser written in Lark

An out-of-box Lua parser written in Lark Such parser handles a relaxed version o

Taine Zhao 2 Jul 19, 2022
Codes for 'Dual Parameterization of Sparse Variational Gaussian Processes'

Dual Parameterization of Sparse Variational Gaussian Processes Documentation | Notebooks | API reference Introduction This repository is the official

AaltoML 7 Dec 23, 2022
Pytorch and Torch testing code of CartoonGAN

CartoonGAN-Test-Pytorch-Torch Pytorch and Torch testing code of CartoonGAN [Chen et al., CVPR18]. With the released pretrained models by the authors,

Yijun Li 642 Dec 27, 2022
A set of simple scripts to process the Imagenet-1K dataset as TFRecords and make index files for NVIDIA DALI.

Overview This is a set of simple scripts to process the Imagenet-1K dataset as TFRecords and make index files for NVIDIA DALI. Make TFRecords To run t

8 Nov 01, 2022
This is the repository for CVPR2021 Dynamic Metric Learning: Towards a Scalable Metric Space to Accommodate Multiple Semantic Scales

Intro This is the repository for CVPR2021 Dynamic Metric Learning: Towards a Scalable Metric Space to Accommodate Multiple Semantic Scales Vehicle Sam

39 Jul 21, 2022
A Transformer-Based Feature Segmentation and Region Alignment Method For UAV-View Geo-Localization

University1652-Baseline [Paper] [Slide] [Explore Drone-view Data] [Explore Satellite-view Data] [Explore Street-view Data] [Video Sample] [中文介绍] This

Zhedong Zheng 335 Jan 06, 2023
A collection of semantic image segmentation models implemented in TensorFlow

A collection of semantic image segmentation models implemented in TensorFlow. Contains data-loaders for the generic and medical benchmark datasets.

bobby 16 Dec 06, 2019
A diff tool for language models

LMdiff Qualitative comparison of large language models. Demo & Paper: http://lmdiff.net LMdiff is a MIT-IBM Watson AI Lab collaboration between: Hendr

Hendrik Strobelt 27 Dec 29, 2022
Autonomous Robots Kalman Filters

Autonomous Robots Kalman Filters The Kalman Filter is an easy topic. However, ma

20 Jul 18, 2022
PuppetGAN - Cross-Domain Feature Disentanglement and Manipulation just got way better! 🚀

Better Cross-Domain Feature Disentanglement and Manipulation with Improved PuppetGAN Quite cool... Right? Introduction This repo contains a TensorFlow

Giorgos Karantonis 5 Aug 25, 2022
Safe Policy Optimization with Local Features

Safe Policy Optimization with Local Feature (SPO-LF) This is the source-code for implementing the algorithms in the paper "Safe Policy Optimization wi

Akifumi Wachi 6 Jun 05, 2022
Material for my PyConDE & PyData Berlin 2022 Talk "5 Steps to Speed Up Your Data-Analysis on a Single Core"

5 Steps to Speed Up Your Data-Analysis on a Single Core Material for my talk at the PyConDE & PyData Berlin 2022 Description Your data analysis pipeli

Jonathan Striebel 9 Dec 12, 2022
GazeScroller - Using Facial Movements to perform Hands-free Gesture on the system

GazeScroller Using Facial Movements to perform Hands-free Gesture on the system

2 Jan 05, 2022
Self-Supervised Monocular DepthEstimation with Internal Feature Fusion(arXiv), BMVC2021

DIFFNet This repo is for Self-Supervised Monocular Depth Estimation with Internal Feature Fusion(arXiv), BMVC2021 A new backbone for self-supervised d

Hang 94 Dec 25, 2022
Optimal Adaptive Allocation using Deep Reinforcement Learning in a Dose-Response Study

Optimal Adaptive Allocation using Deep Reinforcement Learning in a Dose-Response Study Supplementary Materials for Kentaro Matsuura, Junya Honda, Imad

Kentaro Matsuura 4 Nov 01, 2022
codebase for "A Theory of the Inductive Bias and Generalization of Kernel Regression and Wide Neural Networks"

Eigenlearning This repo contains code for replicating the experiments of the paper A Theory of the Inductive Bias and Generalization of Kernel Regress

Jamie Simon 45 Dec 02, 2022
A working implementation of the Categorical DQN (Distributional RL).

Categorical DQN. Implementation of the Categorical DQN as described in A distributional Perspective on Reinforcement Learning. Thanks to @tudor-berari

Florin Gogianu 98 Sep 20, 2022
PyTorch implementation of NIPS 2017 paper Dynamic Routing Between Capsules

Dynamic Routing Between Capsules - PyTorch implementation PyTorch implementation of NIPS 2017 paper Dynamic Routing Between Capsules from Sara Sabour,

Adam Bielski 475 Dec 24, 2022
Megaverse is a new 3D simulation platform for reinforcement learning and embodied AI research

Megaverse Megaverse is a new 3D simulation platform for reinforcement learning and embodied AI research. The efficient design of the engine enables ph

Aleksei Petrenko 191 Dec 23, 2022