Offline Multi-Agent Reinforcement Learning Implementations: Solving Overcooked Game with Data-Driven Method

Last update: Sep 16, 2022

Overview

Overcooked-AI

We aim to apply conventional offline reinforcement learning techniques to multi-agent algorithms.
This repository implements behavior cloning (BC), offline MADDPG, MADDPG+REM (MADDPG w/ REM), and MADDPG+BCQ (MADDPG w/ BCQ) in PyTorch. BCQ is currently a work in progress and is not fully implemented.

We collected a 0.5M multi-agent offline RL dataset and ran experiments with each of the compared methods. The data was collected with online MADDPG agents and includes exploration trajectories generated with Ornstein-Uhlenbeck (OU) noise. The experiments were run on the Asymmetric Advantages layout of the Overcooked environment.
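
For readers unfamiliar with OU noise, the following is a minimal sketch of an Ornstein-Uhlenbeck process as it is commonly used for exploration in DDPG-style agents. It is not taken from this repository: the class name `OUNoise` and the defaults `theta=0.15`, `sigma=0.2`, and `dt=1.0` are illustrative assumptions, and the repository's own noise implementation and hyperparameters may differ.

```python
import random


class OUNoise:
    """Hypothetical OU noise generator (illustrative, not the repository's code)."""

    def __init__(self, size, mu=0.0, theta=0.15, sigma=0.2, dt=1.0, seed=None):
        self.mu = mu          # long-run mean the noise reverts to
        self.theta = theta    # strength of the pull back toward mu
        self.sigma = sigma    # scale of the random perturbation
        self.dt = dt          # step size of the discretization
        self.rng = random.Random(seed)
        self.state = [mu] * size

    def reset(self):
        # Restart the process at its mean, e.g. at the start of an episode.
        self.state = [self.mu] * len(self.state)

    def sample(self):
        # Euler-Maruyama step: x += theta * (mu - x) * dt + sigma * sqrt(dt) * N(0, 1)
        self.state = [
            x
            + self.theta * (self.mu - x) * self.dt
            + self.sigma * (self.dt ** 0.5) * self.rng.gauss(0.0, 1.0)
            for x in self.state
        ]
        return list(self.state)
```

Because each sample depends on the previous one, consecutive perturbations are correlated in time, which tends to give smoother exploration than independent Gaussian noise. In a typical setup the sample is added to the actor's output before the action is sent to the environment.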

We look forward to your contributions!

How to Run

Collect Offline Data

python train_online.py agent=maddpg save_replay_buffer=true

While the agents train for 0.5M steps, the trajectory replay buffer is dumped to the experiment/{date}/{time}_maddpg_{exp_name}/buffer folder.
Afterwards, set the path in config/data/local.yaml to this output directory.

Download Dataset

Alternatively, you can use our pre-collected dataset, available at this link.
We provide 0.5M trajectories in the Asymmetric Advantages layout.
Download the dataset to your local machine and set the path in config/data/local.yaml to its location.

Train Offline Models

Behavior Cloning

python train_bc.py agent=bc data=local
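
As a rough illustration of what behavior cloning optimizes, the sketch below computes the mean negative log-likelihood of the dataset actions under a softmax policy, summed over agents. It is an assumption-laden example rather than the code in train_bc.py: the function names `softmax` and `bc_loss` are hypothetical, discrete actions are assumed, and the repository's PyTorch implementation may use a different loss or action representation.

```python
import math


def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def bc_loss(policy_logits, dataset_actions):
    """Mean negative log-likelihood of the dataset actions.

    policy_logits:   one entry per sample, each a list (one per agent)
                     of action logits produced by that agent's policy.
    dataset_actions: one entry per sample, each a list (one per agent)
                     of the integer action stored in the offline data.
    """
    total, count = 0.0, 0
    for sample_logits, sample_actions in zip(policy_logits, dataset_actions):
        for logits, action in zip(sample_logits, sample_actions):
            total -= math.log(softmax(logits)[action])
            count += 1
    return total / count
```

Minimizing this quantity pushes each agent's policy toward the actions recorded in the dataset, without using rewards at all. That is why BC serves as a natural baseline for the offline MADDPG variants.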

Offline MADDPG (Vanilla)

python train_offline.py agent=maddpg data=local

Offline MADDPG (w/ REM)

python train_offline.py agent=rem_maddpg data=local
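
REM (Random Ensemble Mixture) trains several Q-value heads and, at each update, forms the TD error from a random convex combination of them. The sketch below shows that idea on scalar values. It is illustrative only: the helper names are hypothetical, and how this repository wires REM into the MADDPG critic (number of heads, target computation) is not described here and may differ.

```python
import random


def sample_convex_weights(num_heads, rng):
    # Draw positive values in (0, 1] and normalize them so they sum to 1.
    raw = [1.0 - rng.random() for _ in range(num_heads)]
    total = sum(raw)
    return [r / total for r in raw]


def mix(head_values, weights):
    # Convex combination of the per-head Q estimates.
    return sum(w * q for w, q in zip(weights, head_values))


def rem_td_error(q_heads, next_q_heads, reward, gamma, done, rng):
    """TD error of the mixed Q estimate for a single transition.

    q_heads:      per-head Q(s, a) estimates from the current critic.
    next_q_heads: per-head Q(s', a') estimates from the target critic.
    The same random weights are applied to both sides of the TD error.
    """
    weights = sample_convex_weights(len(q_heads), rng)
    not_done = 0.0 if done else 1.0
    target = reward + gamma * not_done * mix(next_q_heads, weights)
    return mix(q_heads, weights) - target
```

Resampling the weights at every update means each step trains a different mixture of the heads, which is the mechanism REM relies on to regularize value estimates learned from a fixed dataset.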

Offline MADDPG (w/ BCQ) (WIP)

python train_offline.py agent=bcq_maddpg data=local

Result

Graph

[Graphs: Online, Offline (0.5M Data), Offline (0.25M Data)]

Video

[Videos: Online, BC, Offline w/ REM]

Acknowledgement

Owner

Baek In-Chang
