PantheonRL is a package for training and testing multi-agent reinforcement learning environments.

Last update: Dec 28, 2022

Overview

PantheonRL

PantheonRL is a package for training and testing multi-agent reinforcement learning environments. The goal of PantheonRL is to provide a modular and extensible framework for training agent policies, fine-tuning agent policies, ad-hoc pairing of agents, and more. PantheonRL also provides a web user interface suitable for lightweight experimentation and prototyping.

PantheonRL is built on top of StableBaselines3 (SB3), allowing direct access to many of SB3's standard RL training algorithms such as PPO. PantheonRL currently follows a decentralized training paradigm: each agent is equipped with its own replay buffer and update algorithm. The agent objects are designed to be easy to manipulate. They can be saved, loaded, and plugged into different training procedures such as self-play, ad-hoc / cross-play, round-robin training, or fine-tuning.
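To make the decentralized paradigm concrete, here is a minimal toy sketch in plain Python. It does not use PantheonRL or SB3: the `ToyAgent` class, the `train` function, and the simple preference update are hypothetical stand-ins chosen for illustration. The point it shows is that each agent keeps a private buffer and applies its own update, so either agent can be reused with a different partner.

```python
import random
from dataclasses import dataclass, field

# Payoff for the first player in rock-paper-scissors (0=rock, 1=paper, 2=scissors).
PAYOFF = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]


@dataclass
class ToyAgent:
    """Hypothetical stand-in for an agent; not a PantheonRL class."""
    name: str
    rng: random.Random
    prefs: list = field(default_factory=lambda: [0.0, 0.0, 0.0])
    buffer: list = field(default_factory=list)  # this agent's private experience

    def act(self) -> int:
        # Epsilon-greedy over the agent's own action preferences.
        if self.rng.random() < 0.2:
            return self.rng.randrange(3)
        return max(range(3), key=lambda a: self.prefs[a])

    def store(self, action: int, reward: float) -> None:
        self.buffer.append((action, reward))

    def update(self, lr: float = 0.1) -> None:
        # Private update rule: move each preference toward the observed reward.
        for action, reward in self.buffer:
            self.prefs[action] += lr * (reward - self.prefs[action])
        self.buffer.clear()


def train(ego: ToyAgent, partner: ToyAgent, steps: int = 100, update_every: int = 10) -> None:
    """Both agents act simultaneously; each one learns only from its own buffer."""
    for step in range(1, steps + 1):
        a, b = ego.act(), partner.act()
        ego.store(a, PAYOFF[a][b])
        partner.store(b, -PAYOFF[a][b])  # zero-sum: the partner sees the negated payoff
        if step % update_every == 0:
            ego.update()
            partner.update()


ego = ToyAgent("ego", random.Random(0))
partner = ToyAgent("partner", random.Random(1))
train(ego, partner)

# The same ego object can then be paired with a different partner.
new_partner = ToyAgent("new_partner", random.Random(2))
train(ego, new_partner)
```

Because nothing is shared between the two agents, swapping the partner requires no change to the ego agent itself.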

This package will be presented as a demo at the AAAI-22 Demonstrations Program.

Demo Paper

Demo Video

"PantheonRL: A MARL Library for Dynamic Training Interactions"
Bidipta Sarkar*, Aditi Talati*, Andy Shih*, Dorsa Sadigh
In Proceedings of the 36th AAAI Conference on Artificial Intelligence (Demo Track), 2022

@inproceedings{sarkar2021pantheonRL,
  title={PantheonRL: A MARL Library for Dynamic Training Interactions},
  author={Sarkar, Bidipta and Talati, Aditi and Shih, Andy and Sadigh, Dorsa},
  booktitle = {Proceedings of the 36th AAAI Conference on Artificial Intelligence (Demo Track)},
  year={2022}
}

Installation

# Optionally create a conda environment
conda create -n PantheonRL python=3.7
conda activate PantheonRL

# Clone and install PantheonRL
git clone https://github.com/Stanford-ILIAD/PantheonRL.git
cd PantheonRL
pip install -e .

Overcooked Installation

# Optionally install Overcooked environment
git submodule update --init --recursive
pip install -e overcookedgym/human_aware_rl/overcooked_ai

PettingZoo Installation

# Optionally install PettingZoo environments
pip install pettingzoo

# To install a group of PettingZoo environments (here, the classic games)
pip install "pettingzoo[classic]"
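Since the Overcooked and PettingZoo environments are optional, it can help to check which packages are importable before launching a run. The sketch below uses only the standard library. The helper name `is_installed` is hypothetical, and the module names checked are the usual import names for Stable-Baselines3 and PettingZoo, not names taken from this README.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Import names assumed here: stable_baselines3 (SB3) and pettingzoo.
for name in ("stable_baselines3", "pettingzoo"):
    status = "found" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```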

Command Line Invocation

Example

python3 trainer.py LiarsDice-v0 PPO PPO --seed 10 --preset 1

# requires Overcooked installation (see above instructions)
python3 trainer.py OvercookedMultiEnv-v0 PPO PPO --env-config '{"layout_name":"simple"}' --seed 10 --preset 1
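When launching several runs from a script, the JSON passed to `--env-config` is easy to misquote by hand. The sketch below builds the same command as a list of arguments and serializes the config with `json.dumps`. It uses only the positional arguments and flags shown above; the helper name `build_trainer_command` is hypothetical and is not part of PantheonRL.

```python
import json
import shlex


def build_trainer_command(env, ego, partner, seed, preset, env_config=None):
    """Assemble the argument list for trainer.py from the flags shown above."""
    cmd = ["python3", "trainer.py", env, ego, partner]
    if env_config is not None:
        # Compact separators reproduce the JSON string used in the example.
        cmd += ["--env-config", json.dumps(env_config, separators=(",", ":"))]
    cmd += ["--seed", str(seed), "--preset", str(preset)]
    return cmd


cmd = build_trainer_command(
    "OvercookedMultiEnv-v0", "PPO", "PPO",
    seed=10, preset=1, env_config={"layout_name": "simple"},
)

# Shell-quoted form, for logging or copy-pasting into a terminal.
print(" ".join(shlex.quote(part) for part in cmd))

# To actually launch the run from the repository root:
# import subprocess; subprocess.run(cmd, check=True)
```

Passing the list form to `subprocess.run` avoids shell quoting altogether, which matters once the config contains spaces or nested quotes.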

For examples of round-robin training followed by partner adaptation, check out these instructions.

For more examples, check out the examples/ directory.

Web User Interface

The first time the web interface is run in a new location, the database must be initialized. After that, do not call the init-db command again, because it clears all user account data.

Set environment variables and (re)initialize the database

export FLASK_APP=website
export FLASK_ENV=development
flask init-db

Start the web user interface. Make sure that port 5000 (the web server) and port 5001 (used for TensorBoard) are not taken.

flask run --host=0.0.0.0 --port=5000
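One way to confirm that both ports are available before starting the server is to try binding to them. This is a minimal standard-library sketch; the helper name `port_is_free` is hypothetical. The check is conservative: a port that was released very recently may still be reported as taken.

```python
import socket


def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a TCP socket can currently bind to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


# 5000 is the web interface port and 5001 the TensorBoard port named above.
for port in (5000, 5001):
    state = "free" if port_is_free(port) else "in use"
    print(f"port {port}: {state}")
```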

Agent selection screen. Users can customize the ego and partner agents.

Training screen. Users can view basic information, or spawn a TensorBoard tab for full monitoring.

Features

General Features	PantheonRL
Documentation	✔️
Web user interface	✔️
Built on top of SB3	✔️
Supports PettingZoo Envs	✔️

Environment Features	PantheonRL
Frame stacking (recurrence)	✔️
Simultaneous multiagent envs	✔️
Turn-based multiagent envs	✔️
2-player envs	✔️
N-player envs	✔️
Custom environments	✔️

Training Features	PantheonRL
Self-play	✔️
Ad-hoc / cross-play	✔️
Round-robin training	✔️
Finetune / adapt to new partners	✔️
Custom policies	✔️
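The training modes listed above differ mainly in which agents are paired with each other. The sketch below illustrates one plausible reading of three of them as pairing schedules over agent names. The function names are hypothetical and the definitions are assumptions made for illustration, not PantheonRL's implementation.

```python
from itertools import cycle, islice


def self_play_schedule(ego, episodes):
    """Self-play: the ego agent is paired with itself in every episode."""
    return [(ego, ego)] * episodes


def round_robin_schedule(ego, partners, episodes):
    """Round-robin (as assumed here): the ego rotates through a fixed partner pool."""
    return [(ego, partner) for partner in islice(cycle(partners), episodes)]


def cross_play_pairs(agents):
    """Cross-play (as assumed here): every ordered pair of distinct agents."""
    return [(a, b) for a in agents for b in agents if a != b]


print(self_play_schedule("ego", 2))
print(round_robin_schedule("ego", ["p1", "p2"], 5))
print(cross_play_pairs(["a", "b", "c"]))
```

Under this reading, fine-tuning to a new partner amounts to continuing training on a schedule whose partner pool contains only the new partner.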

Current Environments

Name	Environment Type	Reward Type	Players	Visualization
Rock Paper Scissors	SimultaneousEnv	Competitive	2	❌
Liar's Dice	TurnBasedEnv	Competitive	2	❌
Block World [1]	TurnBasedEnv	Cooperative	2	✔️
Overcooked [2]	SimultaneousEnv	Cooperative	2	✔️
PettingZoo [3]	Mixed	Mixed	N	✔️

[1] Adapted from the block construction task from https://github.com/cogtoolslab/compositional-abstractions

[2] Adapted from the Human_Aware_Rl / Overcooked AI package from https://github.com/HumanCompatibleAI/human_aware_rl

[3] PettingZoo environments from https://github.com/Farama-Foundation/PettingZoo
