Evolutionary Population Curriculum for Scaling Multi-Agent Reinforcement Learning

Overview

This is the code implementing the MADDPG-based EPC algorithm presented in the paper: Evolutionary Population Curriculum for Scaling Multi-Agent Reinforcement Learning. It is configured to run in conjunction with the environments from mpe_local (https://github.com/qian18long/epciclr2020/tree/master/mpe_local). GIF results are shown at https://sites.google.com/view/epciclr2020/. Note: this codebase has been restructured since the original paper, and results may vary from those reported there.

Installation

  • Install TensorFlow 1.13.1
pip install tensorflow==1.13.1
  • Install OpenAI Gym
pip install gym==0.13.0
  • Install other dependencies
pip install joblib imageio
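
For reference, an end-to-end setup might look as follows (a sketch: the repository URL is inferred from the mpe_local link above, and a Python version compatible with TensorFlow 1.13.1 is assumed):

# clone the repository and install the dependencies listed above
git clone https://github.com/qian18long/epciclr2020.git
cd epciclr2020
pip install tensorflow==1.13.1 gym==0.13.0 joblib imageio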

Case study: Multi-Agent Particle Environments

We demonstrate here how the code can be used in conjunction with the mpe_local environments (https://github.com/qian18long/epciclr2020/tree/master/mpe_local), which are based on the OpenAI multi-agent particle environments (https://github.com/openai/multiagent-particle-envs).

Quick start

  • See train_grassland_epc.sh, train_adversarial_epc.sh and train_food_collect_epc.sh for running the EPC algorithm on the grassland, adversarial and food_collect scenarios in the example settings presented in our paper (an illustrative invocation follows).
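
For example, the grassland experiment can be launched like this (a minimal sketch, assuming the script is run from the repository root):

sh train_grassland_epc.sh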

Command-line options

Environment options

  • --scenario: defines which environment in the MPE is to be used (default: "grassland")

  • --map-size: the size of the environment; the scale factor is 1 for the "normal" map and 2 otherwise (default: "normal")

  • --sight: The agent's visibility radius. (default: 100)

  • --alpha: reward sharing weight (default: 0.0)

  • --max-episode-len: maximum length of each episode for the environment (default: 25)

  • --num-episodes: total number of training episodes (default: 200000)

  • --num-good: number of good agents in the scenario (default: 2)

  • --num-adversaries: number of adversaries in the environment (default: 2)

  • --num-food: number of food (resources) in the scenario (default: 4)

  • --good-policy: algorithm used for the 'good' (non-adversary) policies in the environment (default: "maddpg"; options: {"att-maddpg", "maddpg", "PC", "mean-field"})

  • --adv-policy: algorithm used for the adversary policies in the environment (default: "maddpg"; options: {"att-maddpg", "maddpg", "PC", "mean-field"})
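
The environment options above can be combined in a single invocation. A hypothetical example (the flag values are illustrative; train_epc.py is described under Example scripts below):

python maddpg_o/experiments/train_epc.py \
  --scenario grassland \
  --num-good 2 \
  --num-adversaries 2 \
  --num-food 4 \
  --good-policy att-maddpg \
  --adv-policy att-maddpg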

Core training parameters

  • --lr: learning rate (default: 1e-2)

  • --gamma: discount factor (default: 0.95)

  • --batch-size: batch size (default: 1024)

  • --num-units: number of units in the MLP (default: 64)

  • --good-num-units: number of units in the MLP of good agents; defaults to --num-units if not provided.

  • --adv-num-units: number of units in the MLP of adversarial agents; defaults to --num-units if not provided.

  • --n_cpu_per_agent: number of CPUs used per agent (default: 1)

  • --good-share-weights: if set, good agents share the weights of the agent encoder within the model.

  • --adv-share-weights: if set, adversarial agents share the weights of the agent encoder within the model.

  • --use-gpu: Use GPU for training (default: False)

  • --n-envs: number of environment instances run in parallel
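
A sketch of a run that overrides the core training parameters (the values shown are the documented defaults, except --n-envs, which is illustrative; train_normal.py is described under Example scripts below):

python maddpg_o/experiments/train_normal.py \
  --scenario grassland \
  --lr 1e-2 \
  --gamma 0.95 \
  --batch-size 1024 \
  --num-units 64 \
  --n-envs 4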

Checkpointing

  • --save-dir: directory where intermediate training results and model will be saved (default: "/test/")

  • --save-rate: model is saved every time this number of episodes has been completed (default: 1000)

  • --load-dir: directory where training state and model are loaded from (default: "test")

Evaluation

  • --restore: restores previous training state stored in load-dir (or in save-dir if no load-dir has been provided), and continues training (default: False)

  • --display: displays to the screen the trained policy stored in load-dir (or in save-dir if no load-dir has been provided), but does not continue training (default: False)

  • --save-gif-data: Save the gif examples to the save-dir (default: False)

  • --render-gif: Render the gif in the load-dir (default: False)
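
Putting the checkpointing and evaluation flags together, a hypothetical train-then-replay workflow might look like this (the directory name is illustrative):

# train, saving a checkpoint every 1000 episodes
python maddpg_o/experiments/train_normal.py --scenario grassland --save-dir ./result/grassland_run --save-rate 1000
# display the trained policy without further training
python maddpg_o/experiments/train_normal.py --scenario grassland --load-dir ./result/grassland_run --display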

EPC options

  • --initial-population: initial population size in the first stage

  • --num-selection: size of the population selected for reproduction

  • --num-stages: number of stages

  • --stage-num-episodes: number of training episodes in each stage

  • --stage-n-envs: number of environment instances run in parallel in each stage

  • --test-num-episodes: number of episodes used for the competition between models
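
An illustrative EPC configuration (all values are hypothetical; see the provided *_epc.sh scripts for the settings used in the paper, and note that the per-stage options may accept one value per stage):

python maddpg_o/experiments/train_epc.py \
  --scenario grassland \
  --initial-population 3 \
  --num-selection 2 \
  --num-stages 3 \
  --test-num-episodes 20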

Example scripts

  • ./maddpg_o/experiments/train_normal.py: applies train_helpers.py for MADDPG, Att-MADDPG and mean-field training

  • ./maddpg_o/experiments/train_x2.py: applies a single population-doubling training step

  • ./maddpg_o/experiments/train_mix_match.py: mixes and matches the good agents in --sheep-init-load-dirs with the adversarial agents in --wolf-init-load-dirs for model evaluation (see the sketch after this list)

  • ./maddpg_o/experiments/train_epc.py: trains the scheduled EPC algorithm.

  • ./maddpg_o/experiments/compete.py: evaluates different models by competition.
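
As referenced above, a mix-and-match evaluation might be invoked along these lines (a sketch only: the directory names are illustrative, and the exact argument format of the two list-valued flags is an assumption, not documented here):

python maddpg_o/experiments/train_mix_match.py \
  --sheep-init-load-dirs ./result/good_run_a ./result/good_run_b \
  --wolf-init-load-dirs ./result/adv_run_a ./result/adv_run_b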

Paper citation

@inproceedings{epciclr2020,
  author = {Qian Long and Zihan Zhou and Abhinav Gupta and Fei Fang and Yi Wu and Xiaolong Wang},
  title = {Evolutionary Population Curriculum for Scaling Multi-Agent Reinforcement Learning},
  booktitle = {International Conference on Learning Representations},
  year = {2020}
}