TensorFlow implementation of Deep Reinforcement Learning papers

Last update: Jan 03, 2023

Overview

Deep Reinforcement Learning in TensorFlow

TensorFlow implementation of Deep Reinforcement Learning papers. This implementation contains:

[1] Playing Atari with Deep Reinforcement Learning
[2] Human-Level Control through Deep Reinforcement Learning
[3] Deep Reinforcement Learning with Double Q-learning
[4] Dueling Network Architectures for Deep Reinforcement Learning
[5] Prioritized Experience Replay (in progress)
[6] Deep Exploration via Bootstrapped DQN (in progress)
[7] Asynchronous Methods for Deep Reinforcement Learning (in progress)
[8] Continuous Deep q-Learning with Model-based Acceleration (in progress)

Requirements

Usage

First, install prerequisites with:

$ pip install -U 'gym[all]' tqdm scipy

Don't forget to also install the latest TensorFlow. Also note that you need to install the dependences of doom-py which is required by gym[all]

Train with DQN model described in [1] without gpu:

$ python main.py --network_header_type=nips --env_name=Breakout-v0 --use_gpu=False

Train with DQN model described in [2]:

$ python main.py --network_header_type=nature --env_name=Breakout-v0

Train with Double DQN model described in [3]:

$ python main.py --double_q=True --env_name=Breakout-v0

Train with Deuling network with Double Q-learning described in [4]:

$ python main.py --double_q=True --network_output_type=dueling --env_name=Breakout-v0

Train with MLP model described in [4] with corridor environment (useful for debugging):

$ python main.py --network_header_type=mlp --network_output_type=normal --observation_dims='[16]' --env_name=CorridorSmall-v5 --t_learn_start=0.1 --learning_rate_decay_step=0.1 --history_length=1 --n_action_repeat=1 --t_ep_end=10 --display=True --learning_rate=0.025 --learning_rate_minimum=0.0025
$ python main.py --network_header_type=mlp --network_output_type=normal --double_q=True --observation_dims='[16]' --env_name=CorridorSmall-v5 --t_learn_start=0.1 --learning_rate_decay_step=0.1 --history_length=1 --n_action_repeat=1 --t_ep_end=10 --display=True --learning_rate=0.025 --learning_rate_minimum=0.0025
$ python main.py --network_header_type=mlp --network_output_type=dueling --observation_dims='[16]' --env_name=CorridorSmall-v5 --t_learn_start=0.1 --learning_rate_decay_step=0.1 --history_length=1 --n_action_repeat=1 --t_ep_end=10 --display=True --learning_rate=0.025 --learning_rate_minimum=0.0025
$ python main.py --network_header_type=mlp --network_output_type=dueling --double_q=True --observation_dims='[16]' --env_name=CorridorSmall-v5 --t_learn_start=0.1 --learning_rate_decay_step=0.1 --history_length=1 --n_action_repeat=1 --t_ep_end=10 --display=True --learning_rate=0.025 --learning_rate_minimum=0.0025

Results

Result of Corridor-v5 in [4] for DQN (purple), DDQN (red), Dueling DQN (green), Dueling DDQN (blue).

Result of `Breakout-v0' for DQN without frame-skip (white-blue), DQN with frame-skip (light purple), Dueling DDQN (dark blue).

The hyperparameters and gradient clipping are not implemented as it is as [4].

References

Author

Taehoon Kim / @carpedm20

TensorFlow implementation of Deep Reinforcement Learning papers

Related tags

Overview

Deep Reinforcement Learning in TensorFlow

Requirements

Usage

Results

References

Author

Owner

Taehoon Kim

Repository features UNet inspired architecture used for segmenting lungs on chest X-Ray images

Official implementation of "Refiner: Refining Self-attention for Vision Transformers".

Simple Dynamic Batching Inference

DatasetGAN: Efficient Labeled Data Factory with Minimal Human Effort

Tooling for converting STAC metadata to ODC data model

Contains source code for the winning solution of the xView3 challenge

FADNet++: Real-Time and Accurate Disparity Estimation with Configurable Networks

Official implementation of "Learning Forward Dynamics Model and Informed Trajectory Sampler for Safe Quadruped Navigation" (RSS 2022)

A simple rest api serving a deep learning model that classifies human gender based on their faces. (vgg16 transfare learning)

Jupyter notebooks for the code samples of the book "Deep Learning with Python"

AFL binary instrumentation

Label-Free Model Evaluation with Semi-Structured Dataset Representations

Code for ACL 2019 Paper: "COMET: Commonsense Transformers for Automatic Knowledge Graph Construction"

QICK: Quantum Instrumentation Control Kit

Pyramid Pooling Transformer for Scene Understanding

Convert game ISO and archives to CD CHD for emulation on Linux.

DI-smartcross - Decision Intelligence Platform for Traffic Crossing Signal Control

Neural Scene Flow Prior (NeurIPS 2021 spotlight)

Synthesize photos from PhotoDNA using machine learning 🌱

ZSL-KG is a general-purpose zero-shot learning framework with a novel transformer graph convolutional network (TrGCN) to learn class representation from common sense knowledge graphs.