TensorFlow implementation of Human-Level Control through Deep Reinforcement Learning

Last update: Dec 26, 2022

Overview

Human-Level Control through Deep Reinforcement Learning

TensorFlow implementation of Human-Level Control through Deep Reinforcement Learning.

This implementation contains:

Deep Q-network and Q-learning
Experience replay memory
- to reduce the correlations between consecutive updates
A target network for Q-learning targets, held fixed for intervals
- to reduce the correlations between target and predicted Q-values
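The three components above can be illustrated with a minimal, framework-free sketch. This is not the repository's code: `ReplayMemory`, `q_learning_targets`, `sync_target`, and the `target_q` callable are hypothetical names chosen for illustration, and parameters are modelled as a plain dict rather than TensorFlow variables.

```python
import random
from collections import deque


class ReplayMemory:
    """Fixed-size buffer of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity):
        # A bounded deque drops the oldest transition once capacity is reached.
        self.buffer = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling breaks up the correlation between consecutive steps.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def q_learning_targets(batch, target_q, gamma=0.99):
    """Compute y = r + gamma * max_a' Q_target(s', a'), or y = r at episode end.

    `target_q` is a hypothetical callable mapping a state to a list of
    Q-values; it stands in for the periodically frozen target network.
    """
    targets = []
    for _state, _action, reward, next_state, done in batch:
        if done:
            targets.append(reward)
        else:
            targets.append(reward + gamma * max(target_q(next_state)))
    return targets


def sync_target(online_params, target_params, step, interval):
    """Copy the online parameters into the target copy every `interval` steps."""
    if step % interval == 0:
        target_params.clear()
        target_params.update(online_params)
```

Between synchronisation points the target parameters do not move, so the regression targets stay stable while the online network is updated.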

Requirements

Python 2.7 or Python 3.3+
gym
tqdm
SciPy or OpenCV2
TensorFlow 0.12.0

Usage

First, install prerequisites with:

$ pip install tqdm gym[all]

To train a model for Breakout:

$ python main.py --env_name=Breakout-v0 --is_train=True
$ python main.py --env_name=Breakout-v0 --is_train=True --display=True

To test and record the screen with gym:

$ python main.py --is_train=False
$ python main.py --is_train=False --display=True

Results

Result of training for 24 hours using a GTX 980 Ti.

Simple Results

Details of Breakout with model m2 (red) for 30 hours using a GTX 980 Ti.

Details of Breakout with model m3 (red) for 30 hours using a GTX 980 Ti.

Detailed Results

[1] Action-repeat (frame-skip) of 1, 2, and 4 without learning rate decay

[2] Action-repeat (frame-skip) of 1, 2, and 4 with learning rate decay

[1] & [2]

[3] Action-repeat of 4 for DQN (dark blue), Dueling DQN (dark green), DDQN (brown), and Dueling DDQN (turquoise)

The current hyperparameters and gradient clipping differ from those used in the paper.
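For context, the paper describes clipping the error term of the update to lie between -1 and 1. One common reading of this is a Huber-style loss, sketched below in plain Python. The function names and the `delta` parameter are assumptions made for illustration; this is not the clipping used in this repository.

```python
def clipped_error_loss(td_error, delta=1.0):
    """Huber-style loss: quadratic within +/-delta, linear outside."""
    abs_err = abs(td_error)
    if abs_err <= delta:
        return 0.5 * td_error ** 2
    return delta * (abs_err - 0.5 * delta)


def clipped_error_gradient(td_error, delta=1.0):
    """Derivative of the loss above with respect to the error.

    This equals the error itself clipped to [-delta, delta].
    """
    return max(-delta, min(delta, td_error))
```

With `delta=1.0`, large temporal-difference errors contribute a gradient of magnitude at most 1, which limits the size of any single update.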

[4] Distributed action-repeat (frame-skip) of 1 without learning rate decay

[5] Distributed action-repeat (frame-skip) of 4 without learning rate decay
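Action-repeat (frame-skip), the setting varied across the experiments above, can be sketched as follows. `CountingEnv` is a hypothetical stand-in environment and `step_with_action_repeat` is an illustrative helper; neither is part of gym or of this repository.

```python
class CountingEnv:
    """Hypothetical environment: reward 1 per step, ends after `length` steps."""

    def __init__(self, length):
        self.length = length
        self.t = 0

    def step(self, action):
        self.t += 1
        done = self.t >= self.length
        return self.t, 1.0, done  # (observation, reward, done)


def step_with_action_repeat(env, action, repeat):
    """Apply `action` for up to `repeat` frames and sum the rewards.

    Stops early if the episode ends before `repeat` frames have elapsed.
    """
    total_reward = 0.0
    obs, done = None, False
    for _ in range(repeat):
        obs, reward, done = env.step(action)
        total_reward += reward
        if done:
            break
    return obs, total_reward, done
```

An action-repeat of 1 reduces to an ordinary environment step, while an action-repeat of 4 lets the agent choose an action only on every fourth frame.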

License

MIT License.

Owner

Devsisters Corp.
