Official Implementation of 'UPDeT: Universal Multi-agent Reinforcement Learning via Policy Decoupling with Transformers' ICLR 2021(spotlight)

Related tags

Deep LearningUPDeT
Overview

UPDeT

Official Implementation of UPDeT: Universal Multi-agent Reinforcement Learning via Policy Decoupling with Transformers (ICLR 2021 spotlight)

The framework is inherited from PyMARL. UPDeT is written in pytorch and uses SMAC as its environment.

Installation instructions

Installing dependencies:

pip install -r requirements.txt

Download SC2 into the 3rdparty/ folder and copy the maps necessary to run over.

bash install_sc2.sh

Run an experiment

Before training your own transformer-based multi-agent model, there are a list of things to note.

  • Currently, this repository supports marine-based battle scenarios. e.g. 3m, 8m, 5m_vs_6m.
  • If you are interested in training a different unit type, carefully modify the Transformer Parameters block at src/config/default.yaml and revise the _build_input_transformer function in basic_controller.python.
  • Before running the experiment, check the agent type in Agent Parameters block at src/config/default.yaml.
  • This repository contains two new transformer-based agents from the UPDeT paper including
    • Standard UPDeT
    • Aggregation Transformer

Training script

python3 src/main.py --config=vdn --env-config=sc2 with env_args.map_name=5m_vs_6m

All results will be stored in the Results/ folder.

Performance

Single battle scenario

Surpass the GRU baseline on hard 5m_vs_6m with:

Multiple battle scenarios

Zero-shot generalize to different tasks:

  • Result on 7m-5m-3m transfer learning.

Note: Only UPDeT can be deployed to other scenarios without changing the model's architecture.

More details please refer to UPDeT paper.

Bibtex

@article{hu2021updet,
  title={UPDeT: Universal Multi-agent Reinforcement Learning via Policy Decoupling with Transformers},
  author={Hu, Siyi and Zhu, Fengda and Chang, Xiaojun and Liang, Xiaodan},
  journal={arXiv preprint arXiv:2101.08001},
  year={2021}
}

License

The MIT License

Owner
hhhusiyi
hhhusiyi
Official Implementation of "Learning Disentangled Behavior Embeddings"

DBE: Disentangled-Behavior-Embedding Official implementation of Learning Disentangled Behavior Embeddings (NeurIPS 2021). Environment requirement The

Mishne Lab 12 Sep 28, 2022
Official pytorch implementation of the AAAI 2021 paper Semantic Grouping Network for Video Captioning

Semantic Grouping Network for Video Captioning Hobin Ryu, Sunghun Kang, Haeyong Kang, and Chang D. Yoo. AAAI 2021. [arxiv] Environment Ubuntu 16.04 CU

Hobin Ryu 43 Nov 25, 2022
TorchIO is a Medical image preprocessing and augmentation toolkit for deep learning. Part of the PyTorch Ecosystem.

Medical image preprocessing and augmentation toolkit for deep learning. Part of the PyTorch Ecosystem.

Fernando Pérez-García 1.6k Jan 06, 2023
Language-Agnostic Website Embedding and Classification

Homepage2Vec Language-Agnostic Website Embedding and Classification based on Curlie labels https://arxiv.org/pdf/2201.03677.pdf Homepage2Vec is a pre-

25 Dec 27, 2022
Customer Segmentation using RFM

Customer-Segmentation-using-RFM İş Problemi Bir e-ticaret şirketi müşterilerini segmentlere ayırıp bu segmentlere göre pazarlama stratejileri belirlem

Nazli Sener 7 Dec 26, 2021
Kaggle DSTL Satellite Imagery Feature Detection

Kaggle DSTL Satellite Imagery Feature Detection

Konstantin Lopuhin 206 Oct 29, 2022
Frequency Spectrum Augmentation Consistency for Domain Adaptive Object Detection

Frequency Spectrum Augmentation Consistency for Domain Adaptive Object Detection Main requirements torch = 1.0 torchvision = 0.2.0 Python 3 Environm

15 Apr 04, 2022
RobustART: Benchmarking Robustness on Architecture Design and Training Techniques

The first comprehensive Robustness investigation benchmark on large-scale dataset ImageNet regarding ARchitecture design and Training techniques towards diverse noises.

132 Dec 23, 2022
AI创造营 :Metaverse启动机之重构现世,结合PaddlePaddle 和 Wechaty 创造自己的聊天机器人

paddle-wechaty-Zodiac AI创造营 :Metaverse启动机之重构现世,结合PaddlePaddle 和 Wechaty 创造自己的聊天机器人 12星座若穿越科幻剧,会拥有什么超能力呢?快来迎接你的专属超能力吧! 现在很多年轻人都喜欢看科幻剧,像是复仇者系列,里面有很多英雄、超

105 Dec 22, 2022
A diff tool for language models

LMdiff Qualitative comparison of large language models. Demo & Paper: http://lmdiff.net LMdiff is a MIT-IBM Watson AI Lab collaboration between: Hendr

Hendrik Strobelt 27 Dec 29, 2022
Data visualization app for H&M competition in kaggle

handm_data_visualize_app Data visualization app by streamlit for H&M competition in kaggle. competition page: https://www.kaggle.com/competitions/h-an

Kyohei Uto 12 Apr 30, 2022
IMBENS: class-imbalanced ensemble learning in Python.

IMBENS: class-imbalanced ensemble learning in Python. Links: [Documentation] [Gallery] [PyPI] [Changelog] [Source] [Download] [知乎/Zhihu] [中文README] [a

Zhining Liu 176 Jan 04, 2023
Data-Driven Operational Space Control for Adaptive and Robust Robot Manipulation

OSCAR Project Page | Paper This repository contains the codebase used in OSCAR: Data-Driven Operational Space Control for Adaptive and Robust Robot Ma

NVIDIA Research Projects 74 Dec 22, 2022
Official Pytorch implementation of MixMo framework

MixMo: Mixing Multiple Inputs for Multiple Outputs via Deep Subnetworks Official PyTorch implementation of the MixMo framework | paper | docs Alexandr

79 Nov 07, 2022
TensorFlow-based implementation of "Pyramid Scene Parsing Network".

PSPNet_tensorflow Important Code is fine for inference. However, the training code is just for reference and might be only used for fine-tuning. If yo

HsuanKung Yang 323 Dec 20, 2022
PyTorch code for DriveGAN: Towards a Controllable High-Quality Neural Simulation

PyTorch code for DriveGAN: Towards a Controllable High-Quality Neural Simulation

76 Dec 24, 2022
Data and Code for ACL 2021 Paper "Inter-GPS: Interpretable Geometry Problem Solving with Formal Language and Symbolic Reasoning"

Introduction Code and data for ACL 2021 Paper "Inter-GPS: Interpretable Geometry Problem Solving with Formal Language and Symbolic Reasoning". We cons

Pan Lu 81 Dec 27, 2022
SysWhispers Shellcode Loader

Shhhloader Shhhloader is a SysWhispers Shellcode Loader that is currently a Work in Progress. It takes raw shellcode as input and compiles a C++ stub

icyguider 630 Jan 03, 2023
Intel® Neural Compressor is an open-source Python library running on Intel CPUs and GPUs

Intel® Neural Compressor targeting to provide unified APIs for network compression technologies, such as low precision quantization, sparsity, pruning, knowledge distillation, across different deep l

Intel Corporation 846 Jan 04, 2023
Lightweight stereo matching network based on MobileNetV1 and MobileNetV2

MobileStereoNet: Towards Lightweight Deep Networks for Stereo Matching

Cognitive Systems Research Group 139 Nov 30, 2022