PyTorch implementation of the Asynchronous Advantage Actor-Critic (A3C) algorithm

Last update: Dec 08, 2022

Overview

Asynchronous Advantage Actor-Critic (A3C) in PyTorch

@inproceedings{mnih2016asynchronous,
  title={Asynchronous methods for deep reinforcement learning},
  author={Mnih, Volodymyr and Badia, Adria Puigdomenech and Mirza, Mehdi and Graves, Alex and Lillicrap, Timothy P and Harley, Tim and Silver, David and Kavukcuoglu, Koray},
  booktitle={International Conference on Machine Learning},
  year={2016}}

This repository contains a PyTorch implementation of Asynchronous Advantage Actor-Critic (A3C), based on the original paper cited above and on the PyTorch implementation by Ilya Kostrikov.

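A3C was a state-of-the-art deep reinforcement learning method at the time of its publication: several worker threads interact with their own copies of the environment in parallel and asynchronously update a shared actor-critic model.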
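As a rough illustration of the "advantage" in A3C, the sketch below computes the n-step discounted returns and the advantages used by the actor's loss, following the formulation in the original paper. It is not code from this repository: the function names `n_step_returns` and `compute_advantages`, their parameters, and the default `gamma` are assumptions made for the example, and the repository itself may use a different estimator.

```python
def n_step_returns(rewards, bootstrap_value, gamma=0.99):
    """Discounted returns R_t = r_t + gamma * R_{t+1}.

    `bootstrap_value` is the critic's estimate V(s) for the state that
    follows the last reward (use 0.0 if the episode ended there).
    Hypothetical helper written for this example only.
    """
    returns = []
    running = bootstrap_value
    # Walk the rollout backwards so each step reuses the return after it.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def compute_advantages(rewards, values, bootstrap_value, gamma=0.99):
    """Advantage A_t = R_t - V(s_t) for every step of the rollout."""
    returns = n_step_returns(rewards, bootstrap_value, gamma)
    return [ret - val for ret, val in zip(returns, values)]


if __name__ == "__main__":
    rollout_rewards = [0.0, 0.0, 1.0]
    critic_values = [0.25, 0.0, 0.5]
    print(compute_advantages(rollout_rewards, critic_values, 0.0, gamma=0.5))
```

In an actor-critic update, the actor's loss weights the log-probability of each chosen action by its advantage, while the critic is regressed towards the returns.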

Dependencies

Python 2.7
PyTorch
gym (OpenAI)
universe (OpenAI)
opencv (for env state processing)
visdom (for visualization)

Training

./train_lstm.sh

Test with the trained weights after 169000 updates for PongDeterministic-v3.

./test_lstm.sh 169000

A test result video is available.

Check the loss curves of all threads in visdom at http://localhost:8097

Owner

LEI TAI
