PyTorch implementation of Decoupling Value and Policy for Generalization in Reinforcement Learning

Related tags

Deep Learningidaac
Overview

IDAAC: Invariant Decoupled Advantage Actor-Critic

This is a PyTorch implementation of the methods proposed in

Decoupling Value and Policy for Generalization in Reinforcement Learning by

Roberta Raileanu and Rob Fergus.

Citation

If you use this code in your own work, please cite our paper:

@article{Raileanu2021DecouplingVA,
  title={Decoupling Value and Policy for Generalization in Reinforcement Learning},
  author={Roberta Raileanu and R. Fergus},
  journal={ArXiv},
  year={2021},
  volume={abs/2102.10330}
}

Requirements

To install all the required dependencies:

conda create -n idaac python=3.7
conda activate idaac

cd idaac
pip install -r requirements.txt

pip install procgen

git clone https://github.com/openai/baselines.git
cd baselines 
python setup.py install 

Instructions

This repo provides instructions for training IDAAC, DAAC, and PPO on the Procgen benchmark.

Train IDAAC on CoinRun

python train.py --env_name coinrun --algo idaac

Train DAAC on CoinRun

python train.py --env_name coinrun --algo daac

Train PPO on CoinRun

python train.py --env_name coinrun --algo ppo --ppo_epoch 3

Note: The default code uses the same set of hyperparameters (HPs) for all environments, which are the best ones overall. In our studies, we've found some of the games can further benefit from slightly different HPs, so we provide those as well. To use the best hyperparameters for each environment, use the flag --use_best_hps.

Overview of DAAC and IDAAC

IDAAC Overview

Procgen Results

IDAAC achieves state-of-the-art performance on the Procgen benchmark (easy mode), significantly improving the agent's generalization ability over standard RL methods such as PPO.

Test Results on Procgen

Procgen Test Results

Acknowledgements

This code was based on an open sourced PyTorch implementation of PPO.

A Decentralized Omnidirectional Visual-Inertial-UWB State Estimation System for Aerial Swar.

Omni-swarm A Decentralized Omnidirectional Visual-Inertial-UWB State Estimation System for Aerial Swarm Introduction Omni-swarm is a decentralized omn

HKUST Aerial Robotics Group 99 Dec 23, 2022
Paddle-Adversarial-Toolbox (PAT) is a Python library for Deep Learning Security based on PaddlePaddle.

Paddle-Adversarial-Toolbox Paddle-Adversarial-Toolbox (PAT) is a Python library for Deep Learning Security based on PaddlePaddle. Model Zoo Common FGS

AgentMaker 17 Nov 08, 2022
Mail classification with tensorflow and MS Exchange Server (ham or spam).

Mail classification with tensorflow and MS Exchange Server (ham or spam).

Metin Karatas 1 Sep 11, 2021
Implementation of ICCV19 Paper "Learning Two-View Correspondences and Geometry Using Order-Aware Network"

OANet implementation Pytorch implementation of OANet for ICCV'19 paper "Learning Two-View Correspondences and Geometry Using Order-Aware Network", by

Jiahui Zhang 225 Dec 05, 2022
Unofficial PyTorch implementation of "RTM3D: Real-time Monocular 3D Detection from Object Keypoints for Autonomous Driving" (ECCV 2020)

RTM3D-PyTorch The PyTorch Implementation of the paper: RTM3D: Real-time Monocular 3D Detection from Object Keypoints for Autonomous Driving (ECCV 2020

Nguyen Mau Dzung 271 Nov 29, 2022
Camera ready code repo for the NeuRIPS 2021 paper: "Impression learning: Online representation learning with synaptic plasticity".

Impression-Learning-Camera-Ready Camera ready code repo for the NeuRIPS 2021 paper: "Impression learning: Online representation learning with synaptic

2 Feb 09, 2022
Automated Evidence Collection for Fake News Detection

Automated Evidence Collection for Fake News Detection This is the code repo for the Automated Evidence Collection for Fake News Detection paper accept

Mrinal Rawat 2 Apr 12, 2022
Code for the KDD 2021 paper 'Filtration Curves for Graph Representation'

Filtration Curves for Graph Representation This repository provides the code from the KDD'21 paper Filtration Curves for Graph Representation. Depende

Machine Learning and Computational Biology Lab 16 Oct 16, 2022
Who calls the shots? Rethinking Few-Shot Learning for Audio (WASPAA 2021)

rethink-audio-fsl This repo contains the source code for the paper "Who calls the shots? Rethinking Few-Shot Learning for Audio." (WASPAA 2021) Table

Yu Wang 34 Dec 24, 2022
The code for our paper "AutoSF: Searching Scoring Functions for Knowledge Graph Embedding"

AutoSF The code for our paper "AutoSF: Searching Scoring Functions for Knowledge Graph Embedding" and this paper has been accepted by ICDE2020. News:

AutoML Research 64 Dec 17, 2022
Collection of generative models, e.g. GAN, VAE in Pytorch and Tensorflow.

Generative Models Collection of generative models, e.g. GAN, VAE in Pytorch and Tensorflow. Also present here are RBM and Helmholtz Machine. Note: Gen

Agustinus Kristiadi 7k Jan 02, 2023
[CVPR 2022 Oral] MixFormer: End-to-End Tracking with Iterative Mixed Attention

MixFormer The official implementation of the CVPR 2022 paper MixFormer: End-to-End Tracking with Iterative Mixed Attention [Models and Raw results] (G

Multimedia Computing Group, Nanjing University 235 Jan 03, 2023
Keras implementation of Real-Time Semantic Segmentation on High-Resolution Images

Keras-ICNet [paper] Keras implementation of Real-Time Semantic Segmentation on High-Resolution Images. Training in progress! Requisites Python 3.6.3 K

Aitor Ruano 87 Dec 16, 2022
Grow Function: Generate 3D Stacked Bifurcating Double Deep Cellular Automata based organisms which differentiate using a Genetic Algorithm...

Grow Function: A 3D Stacked Bifurcating Double Deep Cellular Automata which differentiates using a Genetic Algorithm... TLDR;High Def Trees that you can mint as NFTs on Solana

Nathaniel Gibson 4 Oct 08, 2022
Walk with fastai

Shield: This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. Walk with fastai What is this p

Walk with fastai 124 Dec 10, 2022
Inverse Optimal Control Adapted to the Noise Characteristics of the Human Sensorimotor System

Inverse Optimal Control Adapted to the Noise Characteristics of the Human Sensorimotor System This repository contains code for the paper Schultheis,

2 Oct 28, 2022
Project NII pytorch scripts

project-NII-pytorch-scripts By Xin Wang, National Institute of Informatics, since 2021 I am a new pytorch user. If you have any suggestions or questio

Yamagishi and Echizen Laboratories, National Institute of Informatics 184 Dec 23, 2022
A PyTorch implementation of deep-learning-based registration

DiffuseMorph Implementation A PyTorch implementation of deep-learning-based registration. Requirements OS : Ubuntu / Windows Python 3.6 PyTorch 1.4.0

24 Jan 03, 2023
Realtime_Multi-Person_Pose_Estimation

Introduction Multi Person PoseEstimation By PyTorch Results Require Pytorch Installation git submodule init && git submodule update Demo Download conv

tensorboy 1.3k Jan 05, 2023
List of awesome things around semantic segmentation 🎉

Awesome Semantic Segmentation List of awesome things around semantic segmentation 🎉 Semantic segmentation is a computer vision task in which we label

Dam Minh Tien 18 Nov 26, 2022