Multi-Agent Reinforcement Learning (MARL) method to learn scalable control polices for multi-agent target tracking.

Overview

scalableMARL

Scalable Reinforcement Learning Policies for Multi-Agent Control

CD. Hsu, H. Jeong, GJ. Pappas, P. Chaudhari. "Scalable Reinforcement Learning Policies for Multi-Agent Control". IEEE International Conference on Intelligent Robots and Systems (IROS), Prague, Czech Republic, 2021.

Multi-Agent Reinforcement Learning method to learn scalable control polices for multi-agent target tracking.

  • Author: Christopher Hsu
  • Email: [email protected]
  • Affiliation:
    • Department of Electrical and Systems Engineering
    • GRASP Laboratory
    • @ University of Pennsylvania

Currently supports Python3.8 and is developed in Ubuntu 20.04

scalableMARL file structure

Within scalableMARL (highlighting the important files):

scalableMARL
    |___algos
        |___maTT                          #RL alg folder for the target tracking environment
            |___core                      #Self-Attention-based Model Architecture
            |___core_behavior             #Used for further evaluation (Ablation D.2.)
            |___dql                       #Soft Double Q-Learning
            |___evaluation                #Evaluation for Main Results
            |___evaluation_behavior       #Used for further evaluation (Ablation D.2.)
            |___modules                   #Self-Attention blocks
            |___replay_buffer             #RL replay buffer for sets
            |___run_script                #**Main run script to do training and evaluation
    |___envs
        |___maTTenv                       #multi-agent target tracking
            |___env
                |___setTracking_v0        #Standard environment (i.e. 4a4t tasks)
                |___setTracking_vGreedy   #Baseline Greedy Heuristic
                |___setTracking_vGru      #Experiment with Gru (Ablation D.3)
                |___setTracking_vkGreedy  #Experiment with Scalability and Heuristic Mask k=4 (Ablation D.1)
        |___run_ma_tracking               #Example scipt to run environment
    |___setup                             #set PYTHONPATH ($source setup)
  • To setup scalableMARL, follow the instruction below.

Set up python environment for the scalableMARL repository

Install python3.8 (if it is not already installed)

#to check python version
python3 -V

sudo apt-get update
sudo apt-get install python3.8-dev

Set up virtualenv

Python virtual environments are used to isolate package installation from the system

Replace 'virtualenv name' with your choice of folder name

sudo apt-get install python3-venv 

python3 -m venv --system-site-packages ./'virtualenv name'
# Activate the environment for use, any commands now will be done within this venv
source ./'virtualenv name'/bin/activate

# To deactivate (in terminal, exit out of venv), do not use during setup
deactivate

Now that the virtualenv is activated, you can install packages that are isolated from your system

When the venv is activated, you can now install packages and run scripts

Install isolated packages in your venv

sudo apt-get install -y eog python3-tk python3-yaml python3-pip ssh git

#This command will auto install packages from requirements.txt
pip3 install --trusted-host pypi.python.org -r requirements.txt

Current workflow

Setup repos

# activate virtualenv
source ./'virtualenv name'/bin/activate
# change directory to scalableMARL
cd ./scalableMARL
# setup repo  ***important in order to set PYTHONPATH***
source setup

scalableMARL repo is ready to go

Running an algorithm (for example maPredPrey)

# its best to run from the scalableMARL folder so that logging and saving is consistent
cd ./scalableMARL
# run the alg
python3 algos/maTT/run_script.py

# you can run the alg with different argument parameters. See within run_script for more options.
# for example
python3 algos/maTT/run_script.py --seed 0 --logdir ./results/maPredPrey --epochs 40

To test, evaluate, and render()

# for a general example 
python3 algos/maTT/run_script.py --mode test --render 1 --log_dir ./results/maTT/setTracking-v0_123456789/seed_0/ --nb_test_eps 50
# for a saved policy in saved_results
python3 algos/maTT/run_script.py --mode test --render 1 --log_dir ./saved_results/maTT/setTracking-v0_123456789/seed_0/

To see training curves

tensorboard --logdir ./results/maTT/setTracking-v0_123456789/

Citing scalableMARL

If you reference or use scalableMARL in your research, please cite:

@misc{hsu2021scalable,
      title={Scalable Reinforcement Learning Policies for Multi-Agent Control}, 
      author={Christopher D. Hsu and Heejin Jeong and George J. Pappas and Pratik Chaudhari},
      year={2021},
      eprint={2011.08055},
      archivePrefix={arXiv},
      primaryClass={cs.MA}
}

Owner
Christopher Hsu
Christopher Hsu
Confidence Propagation Cluster aims to replace NMS-based methods as a better box fusion framework in 2D/3D Object detection

CP-Cluster Confidence Propagation Cluster aims to replace NMS-based methods as a better box fusion framework in 2D/3D Object detection, Instance Segme

Yichun Shen 41 Dec 08, 2022
Human Detection - Pedestrian Detection using OpenCV Python

Pedestrian Detection using OpenCV Python Follow us on Instagram for Machine Lear

Hrishikesh Dutta 1 Jan 23, 2022
Conditional Generative Adversarial Networks (CGAN) for Mobility Data Fusion

This code implements the paper, Kim et al. (2021). Imputing Qualitative Attributes for Trip Chains Extracted from Smart Card Data Using a Conditional Generative Adversarial Network. Transportation Re

Eui-Jin Kim 2 Feb 03, 2022
Anomaly Detection Based on Hierarchical Clustering of Mobile Robot Data

We proposed a new approach to detect anomalies of mobile robot data. We investigate each data seperately with two clustering method hierarchical and k-means. There are two sub-method that we used for

Zekeriyya Demirci 1 Jan 09, 2022
Seeing Dynamic Scene in the Dark: High-Quality Video Dataset with Mechatronic Alignment (ICCV2021)

Seeing Dynamic Scene in the Dark: High-Quality Video Dataset with Mechatronic Alignment This is a pytorch project for the paper Seeing Dynamic Scene i

DV Lab 21 Nov 28, 2022
E2e music remastering system - End-to-end Music Remastering System Using Self-supervised and Adversarial Training

End-to-end Music Remastering System This repository includes source code and pre

Junghyun (Tony) Koo 37 Dec 15, 2022
SporeAgent: Reinforced Scene-level Plausibility for Object Pose Refinement

SporeAgent: Reinforced Scene-level Plausibility for Object Pose Refinement This repository implements the approach described in SporeAgent: Reinforced

Dominik Bauer 5 Jan 02, 2023
This repository implements variational graph auto encoder by Thomas Kipf.

Variational Graph Auto-encoder in Pytorch This repository implements variational graph auto-encoder by Thomas Kipf. For details of the model, refer to

DaehanKim 215 Jan 02, 2023
Multi-task Self-supervised Object Detection via Recycling of Bounding Box Annotations (CVPR, 2019)

Multi-task Self-supervised Object Detection via Recycling of Bounding Box Annotations (CVPR 2019) To make better use of given limited labels, we propo

126 Sep 13, 2022
Code for ICLR 2020 paper "VL-BERT: Pre-training of Generic Visual-Linguistic Representations".

VL-BERT By Weijie Su, Xizhou Zhu, Yue Cao, Bin Li, Lewei Lu, Furu Wei, Jifeng Dai. This repository is an official implementation of the paper VL-BERT:

Weijie Su 698 Dec 18, 2022
nfelo: a power ranking, prediction, and betting model for the NFL

nfelo nfelo is a power ranking, prediction, and betting model for the NFL. Nfelo take's 538's Elo framework and further adapts it for the NFL, hence t

6 Nov 22, 2022
Learning Pixel-level Semantic Affinity with Image-level Supervision for Weakly Supervised Semantic Segmentation, CVPR 2018

Learning Pixel-level Semantic Affinity with Image-level Supervision This code is deprecated. Please see https://github.com/jiwoon-ahn/irn instead. Int

Jiwoon Ahn 337 Dec 15, 2022
GPT, but made only out of gMLPs

GPT - gMLP This repository will attempt to crack long context autoregressive language modeling (GPT) using variations of gMLPs. Specifically, it will

Phil Wang 80 Dec 01, 2022
InDuDoNet+: A Model-Driven Interpretable Dual Domain Network for Metal Artifact Reduction in CT Images

InDuDoNet+: A Model-Driven Interpretable Dual Domain Network for Metal Artifact Reduction in CT Images Hong Wang, Yuexiang Li, Haimiao Zhang, Deyu Men

Hong Wang 4 Dec 27, 2022
Template repository for managing machine learning research projects built with PyTorch-Lightning

Tutorial Repository with a minimal example for showing how to deploy training across various compute infrastructure.

Sidd Karamcheti 3 Feb 11, 2022
Tello Drone Trajectory Tracking

With this library you can track the trajectory of your tello drone or swarm of drones in real time.

Kamran Asgarov 2 Oct 12, 2022
Mapping Conditional Distributions for Domain Adaptation Under Generalized Target Shift

This repository contains the official code of OSTAR in "Mapping Conditional Distributions for Domain Adaptation Under Generalized Target Shift" (ICLR 2022).

Matthieu Kirchmeyer 5 Dec 06, 2022
DeLighT: Very Deep and Light-Weight Transformers

DeLighT: Very Deep and Light-weight Transformers This repository contains the source code of our work on building efficient sequence models: DeFINE (I

Sachin Mehta 440 Dec 18, 2022
Supervised Contrastive Learning for Product Matching

Contrastive Product Matching This repository contains the code and data download links to reproduce the experiments of the paper "Supervised Contrasti

Web-based Systems Group @ University of Mannheim 18 Dec 10, 2022
ROCKET: Exceptionally fast and accurate time series classification using random convolutional kernels

ROCKET + MINIROCKET ROCKET: Exceptionally fast and accurate time series classification using random convolutional kernels. Data Mining and Knowledge D

298 Dec 26, 2022