The mini-AlphaStar (mini-AS, or mAS) - mini-scale version (non-official) of the AlphaStar (AS)

Overview

mini-AlphaStar

Introduction

The mini-AlphaStar (mini-AS, or mAS) project is a mini-scale version (non-official) of the AlphaStar (AS). AlphaStar is the intelligent AI proposed by DeepMind to play StarCraft II.

The "mini-scale" means making the original AS's hyper-parameters adjustable so that mini-AS can be trained and running on a small scale. E.g., we can train this model in a single commercial server machine.

We referred to the "Occam's Razor Principle" when designing the mini-AS": simple is sound. Therefore, we build mini-AS from scratch. Unless the function significantly impacts speed and performance, we shall omit it.

Meanwhile, we also try not to use too many dependency packages so that mini-AS should only depend on the PyTorch. In this way, we simplify the learning cost of the mini-AS and make the architecture of mini-AS relatively easy.

The Chinese shows a simple readme in Chinese.

Below 4 GIFs are mini-AS' trained performance on Simple64, supervised learning on 50 expert replays.

Left: At the start of the game. Right: In the middle period of the game.

Left: The agent's 1st attack. Right: The agent's 2nd Attack.

Update

This release is the "v_1.07" version. In this version, we give an agent which grows from 0.016 to 0.5667 win rate against the level-2 built-in bot training by reinforcement learning. Other improvements are shown below:

  • Use mimic_forward to replace forward in "rl_unroll", which increase the training accuracy;
  • Make RL training supports multi-GPU now;
  • Make RL training supports multi-process training based on multi-GPU now;
  • Use new architecture for RL loss, which reduces 86% GPU memory;
  • Use new architecture for RL to increase the sampling speed by 6x faster;
  • Validate UPGO and V-trace loss again;
  • By a "multi-process plus multi-thread" training, increase the sampling speed more by 197%;
  • Fix the GPU memory leak and reduce the CPU memory leak;
  • Increase the RL training win rate (without units loss) on level-2 to 0.57!

Hints

Warning: SC2 is extremely difficult, and AlphaStar is also very complex. Even our project is a mini-AlphaStar, it has almost the similar technologies as AS, and the training resource also costs very high. We can hardly train mini-AS on a laptop. The recommended way is to use a commercial server with a GPU card and enough large memory and disk space. For someone interested in this project for the first time, we recommend you collect (star) this project and devolve deeply into researching it when you have enough free time and training resources.

Location

We store the codes and show videos in two places.

Codes location Result video location Usage
Github Youtube for global users
Gitee Bilibili for users in China

Contents

The table below shows the corresponding packages in the project.

Packages Content
alphastarmini.core.arch deep neural architecture
alphastarmini.core.sl supervised learning
alphastarmini.core.rl reinforcement learning
alphastarmini.core.ma multi-agent league traning
alphastarmini.lib lib functions
alphastarmini.third third party functions

Requirements

PyTorch >= 1.5, others please see requirements.txt.

Install

The SCRIPT Guide gives some commands to install PyTorch by conda (this will automatically install CUDA and cudnn, which is convenient).

E.g., like (to install PyTorch 1.5 with accompanied CUDA and cudnn):

conda create -n th_1_5 python=3.7 pytorch=1.5 -c pytorch

Next, activate the conda environment, like:

conda activate th_1_5

Then you can install other python packages by pip, e.g., the command in the below line:

pip install -r requirements.txt

Usage

After you have done all requirements, run the below python file to run the program:

python run.py

You may use comments and uncomments in "run.py" to select the training process you want.

The USAGE Guide provides answers to some problems and questions.

You should follow the following instructions to get results similar and/or better than the provided gifs on the main page.

The processing sequences can be summarised as the following:

  1. Transform replays: download the replays for training, then use the script in mAS to transform the replays to trainable data;
  2. Supervised learning: use the trainable data to supervise learning an initial model;
  3. Evaluate SL model: the trained SL model should be evaluated on the RL environment to make sure it behaves right;
  4. Reinforcement learning: use the trained SL model to do reinforcement learning in the SC environment, seeing the win rate starts growing.

We give detailed descriptions below.

Transofrm replays

In supervised learning, you first need to download SC2 replays.

The REPLAY Guide shows a guide to download these SC2 replays.

The ZHIHU Guide provides Chinese users who are not convenient to use Battle.net (outside China) a guide to download replays.

After downloading replays, you should move the replays to "./data/Replays/filtered_replays_1" (you can change the name in transform_replay_data.py).

Then use transform_replay_data.py to transform these replays to pickles or tensors (you can change the output type in the code of that file).

You don't need to run the transform_replay_data.py directly. Only run "run.py" is OK. Make the run.py has the following code

    # from alphastarmini.core.sl import transform_replay_data
    # transform_replay_data.test(on_server=P.on_server)

uncommented. Then you can directly run "run.py".

Note: To get the effect of the trained agent in the gifs, use the replays in Useful-Big-Resources. These replays are generatedy by our experts, to get an agent having the ability to win the built-in bot.

Supervised learning

After getting the trainable data (we use tensor data). Make the run.py has the following code

    # from alphastarmini.core.sl import sl_train_by_tensor
    # sl_train_by_tensor.test(on_server=P.on_server)

uncommented. Then you can directly run "run.py" to do supervised learning.

The default learning rate is 1e-4, and the training epochs should best be 10 (more epochs may cause the training effect overfitting).

From the v_1.05 version, we start to support multi-GPU supervised learning training for mini-AS, improving the training speed. The way to use multi-GPU training is straightforward, as follows:

python run_multi-gpu.py

Multi-GPU training has some unstable factors (caused because of PyTorch). If you find your multi-GPU training has training instability errors, please switch to the single-GPU training.

We currently support four types of supervised training, which all reside in the "alphastarmini.core.sl" package.

File Content
sl_train_by_pickle.py pickle (data not preprocessed) training: Slow, but need small disk space.
sl_train_by_tensor.py tensor (data preprocessed) training: Fast, but cost colossal disk space.
sl_multi_gpu_by_pickle.py multi-GPU, pickle training: It has a requirement need for large shared memory.
sl_multi_gpu_by_tensor.py multi-GPU, tensor training: It needs both large memory and large shared memory.

You can use the load_pickle.py to transform the generated pickles (in "./data/replay_data") to tensors (in "./data/replay_data_tensor").

Note: from v_1.06, we still recommend using single-GPU training. We provide the new training ways in the single-GPU type. This is due to multi-GPU training cost so much memory.

Evaluate SL model

After getting the supervised learning model. We should test the performance of the model in the SC2 environment this is due to there is domain shift from SL data and RL environment.

Make the run.py has the following code

    # from alphastarmini.core.rl import rl_eval_sl
    # rl_eval_sl.test(on_server=P.on_server)

uncommented. Then you can directly run "run.py" to do an evaluation of the SL model.

The evaluation is similar to RL training but the learning is closed and the running is single-thread and single-process, to make the randomness due to multi-thread not affect the evaluation.

Reinforcement learning

After making sure the supervised learning model is OK and suitable for RL training. We do RL training based on the learned supervised learning model.

Make the run.py has the following code

    # from alphastarmini.core.rl import rl_vs_inner_bot_mp
    # rl_vs_inner_bot_mp.test(on_server=P.on_server, replay_path=P.replay_path)

uncommented. Then you can directly run "run.py" to do reinforcement learning.

Note, this training will use a multi-process plus multi-thread RL training (to accelerate the learning speed), so make sure to run this codes on a high-performance computer.

E.g., we run 15 processes, and each process has 2 actor threads and 1 learner thread in a commercial server. If your computer is not strong as that, reduce the parallel and thread nums.

The learning rate should be very small (below 1e-5, because you are training on an initially trained model), and the training iterations should be as long as best (more training iterations can reduce the unstable of RL training).

If you find the training is not as like as you imagine, please open an issue to ask us or discuss with us (though we can not make sure to respond to it in time or there is a solution to every problem).

Results

Here are some illustration figures of the SL training process below:

SL training process

We can see the loss (one primary loss and six argument losses) fall quickly.

The trained behavior of the agents can be seen in the gifs on this page.

A more detailed illustration of the experiments (such as the effects of the different hyper-parameters) will be provided in our later paper.

History

The HISTORY is the historical introduction of the previous versions of mini-AS.

Citing

If you find our repository useful, please cite our project or the below technical report:

@misc{liu2021mAS,
  author = {Ruo{-}Ze Liu and Wenhai Wang and Yang Yu and Tong Lu},
  title = {mini-AlphaStar},
  year = {2021},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/liuruoze/mini-AlphaStar}},
}

The An Introduction of mini-AlphaStar is a technical report introducing the mini-AS (not full version).

@article{liu2021mASreport,
  author    = {Ruo{-}Ze Liu and
               Wenhai Wang and
               Yanjie Shen and
               Zhiqi Li and
               Yang Yu and
               Tong Lu},
  title     = {An Introduction of mini-AlphaStar},
  journal   = {CoRR},
  volume    = {abs/2104.06890},
  year      = {2021},
}

Rethinking

The Rethinking of AlphaStar is our thinking of the advantages and disadvantages of AlphaStar.

Paper

We will give a paper (which is now under peer-review) that may be available in the future, presenting detailed experiments and evaluations using the mini-AS.

Owner
Ruo-Ze Liu
Think deep, work hard.
Ruo-Ze Liu
This app is a simple example of using Strealit to create a financial data web app.

Streamlit Demo: Finance Chart This app is a simple example of using Streamlit to create a financial data web app. This demo use streamlit, pandas and

91 Jan 02, 2023
Model-based 3D Hand Reconstruction via Self-Supervised Learning, CVPR2021

S2HAND: Model-based 3D Hand Reconstruction via Self-Supervised Learning S2HAND presents a self-supervised 3D hand reconstruction network that can join

Yujin Chen 72 Dec 12, 2022
[ICCV 2021] Official Pytorch implementation for Discriminative Region-based Multi-Label Zero-Shot Learning SOTA results on NUS-WIDE and OpenImages

Discriminative Region-based Multi-Label Zero-Shot Learning (ICCV 2021) [arXiv][Project page coming soon] Sanath Narayan*, Akshita Gupta*, Salman Kh

Akshita Gupta 54 Nov 21, 2022
KIDA: Knowledge Inheritance in Data Aggregation

KIDA: Knowledge Inheritance in Data Aggregation This project releases our 1st place solution on NeurIPS2021 ML4CO Dual Task. Slide and model weights a

24 Sep 08, 2022
Fast sparse deep learning on CPUs

SPARSEDNN **If you want to use this repo, please send me an email: [email pro

Ziheng Wang 44 Nov 30, 2022
Source code and dataset for ACL2021 paper: "ERICA: Improving Entity and Relation Understanding for Pre-trained Language Models via Contrastive Learning".

ERICA Source code and dataset for ACL2021 paper: "ERICA: Improving Entity and Relation Understanding for Pre-trained Language Models via Contrastive L

THUNLP 75 Nov 02, 2022
The final project of "Applying AI to 3D Medical Imaging Data" from "AI for Healthcare" nanodegree - Udacity.

Quantifying Hippocampus Volume for Alzheimer's Progression Background Alzheimer's disease (AD) is a progressive neurodegenerative disorder that result

Omar Laham 1 Jan 14, 2022
working repo for my xumx-sliCQ submissions to the ISMIR 2021 MDX

Music Demixing Challenge - xumx-sliCQ This repository is the GitHub mirror of my working submission repository for the AICrowd ISMIR 2021 Music Demixi

4 Aug 25, 2021
Code for 2021 NeurIPS --- Towards Multi-Grained Explainability for Graph Neural Networks

ReFine: Multi-Grained Explainability for GNNs This is the official code for Towards Multi-Grained Explainability for Graph Neural Networks (NeurIPS 20

Shirley (Ying-Xin) Wu 47 Dec 16, 2022
This is the official repository for our paper: ''Pruning Self-attentions into Convolutional Layers in Single Path''.

Pruning Self-attentions into Convolutional Layers in Single Path This is the official repository for our paper: Pruning Self-attentions into Convoluti

Zhuang AI Group 77 Dec 26, 2022
Learning to Reconstruct 3D Non-Cuboid Room Layout from a Single RGB Image

NonCuboidRoom Paper Learning to Reconstruct 3D Non-Cuboid Room Layout from a Single RGB Image Cheng Yang*, Jia Zheng*, Xili Dai, Rui Tang, Yi Ma, Xiao

67 Dec 15, 2022
Pyramid Pooling Transformer for Scene Understanding

Pyramid Pooling Transformer for Scene Understanding Requirements: torch 1.6+ torchvision 0.7.0 timm==0.3.2 Validated on torch 1.6.0, torchvision 0.7.0

Yu-Huan Wu 119 Dec 29, 2022
Liquid Warping GAN with Attention: A Unified Framework for Human Image Synthesis

Liquid Warping GAN with Attention: A Unified Framework for Human Image Synthesis, including human motion imitation, appearance transfer, and novel view synthesis. Currently the paper is under review

2.3k Jan 05, 2023
Official PyTorch implementation of CAPTRA: CAtegory-level Pose Tracking for Rigid and Articulated Objects from Point Clouds

CAPTRA: CAtegory-level Pose Tracking for Rigid and Articulated Objects from Point Clouds Introduction This is the official PyTorch implementation of o

Yijia Weng 96 Dec 07, 2022
PyTorch implementation of ENet

PyTorch-ENet PyTorch (v1.1.0) implementation of ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation, ported from the lua-torc

David Silva 333 Dec 29, 2022
Fine-tune pretrained Convolutional Neural Networks with PyTorch

Fine-tune pretrained Convolutional Neural Networks with PyTorch. Features Gives access to the most popular CNN architectures pretrained on ImageNet. A

Alex Parinov 694 Nov 23, 2022
Pytorch implementation of "Geometrically Adaptive Dictionary Attack on Face Recognition" (WACV 2022)

Geometrically Adaptive Dictionary Attack on Face Recognition This is the Pytorch code of our paper "Geometrically Adaptive Dictionary Attack on Face R

6 Nov 21, 2022
This is a repository of our model for weakly-supervised video dense anticipation.

Introduction This is a repository of our model for weakly-supervised video dense anticipation. More results on GTEA, Epic-Kitchens etc. will come soon

2 Apr 09, 2022
The official implementation of Theme Transformer

Theme Transformer This is the official implementation of Theme Transformer. Checkout our demo and paper : Demo | arXiv Environment: using python versi

Ian Shih 85 Dec 08, 2022
Official Implementation for HyperStyle: StyleGAN Inversion with HyperNetworks for Real Image Editing

HyperStyle: StyleGAN Inversion with HyperNetworks for Real Image Editing Yuval Alaluf*, Omer Tov*, Ron Mokady, Rinon Gal, Amit H. Bermano *Denotes equ

885 Jan 06, 2023