PyTorch implementations of deep reinforcement learning algorithms and environments

Overview

Deep Reinforcement Learning Algorithms with PyTorch

Travis CI contributions welcome

RL PyTorch

This repository contains PyTorch implementations of deep reinforcement learning algorithms and environments.

(To help you remember things you learn about machine learning in general write them in Save All and try out the public deck there about Fast AI's machine learning textbook.)

Algorithms Implemented

  1. Deep Q Learning (DQN) (Mnih et al. 2013)
  2. DQN with Fixed Q Targets (Mnih et al. 2013)
  3. Double DQN (DDQN) (Hado van Hasselt et al. 2015)
  4. DDQN with Prioritised Experience Replay (Schaul et al. 2016)
  5. Dueling DDQN (Wang et al. 2016)
  6. REINFORCE (Williams et al. 1992)
  7. Deep Deterministic Policy Gradients (DDPG) (Lillicrap et al. 2016 )
  8. Twin Delayed Deep Deterministic Policy Gradients (TD3) (Fujimoto et al. 2018)
  9. Soft Actor-Critic (SAC) (Haarnoja et al. 2018)
  10. Soft Actor-Critic for Discrete Actions (SAC-Discrete) (Christodoulou 2019)
  11. Asynchronous Advantage Actor Critic (A3C) (Mnih et al. 2016)
  12. Syncrhonous Advantage Actor Critic (A2C)
  13. Proximal Policy Optimisation (PPO) (Schulman et al. 2017)
  14. DQN with Hindsight Experience Replay (DQN-HER) (Andrychowicz et al. 2018)
  15. DDPG with Hindsight Experience Replay (DDPG-HER) (Andrychowicz et al. 2018 )
  16. Hierarchical-DQN (h-DQN) (Kulkarni et al. 2016)
  17. Stochastic NNs for Hierarchical Reinforcement Learning (SNN-HRL) (Florensa et al. 2017)
  18. Diversity Is All You Need (DIAYN) (Eyensbach et al. 2018)

All implementations are able to quickly solve Cart Pole (discrete actions), Mountain Car Continuous (continuous actions), Bit Flipping (discrete actions with dynamic goals) or Fetch Reach (continuous actions with dynamic goals). I plan to add more hierarchical RL algorithms soon.

Environments Implemented

  1. Bit Flipping Game (as described in Andrychowicz et al. 2018)
  2. Four Rooms Game (as described in Sutton et al. 1998)
  3. Long Corridor Game (as described in Kulkarni et al. 2016)
  4. Ant-{Maze, Push, Fall} (as desribed in Nachum et al. 2018 and their accompanying code)

Results

1. Cart Pole and Mountain Car

Below shows various RL algorithms successfully learning discrete action game Cart Pole or continuous action game Mountain Car. The mean result from running the algorithms with 3 random seeds is shown with the shaded area representing plus and minus 1 standard deviation. Hyperparameters used can be found in files results/Cart_Pole.py and results/Mountain_Car.py.

Cart Pole and Mountain Car Results

2. Hindsight Experience Replay (HER) Experiements

Below shows the performance of DQN and DDPG with and without Hindsight Experience Replay (HER) in the Bit Flipping (14 bits) and Fetch Reach environments described in the papers Hindsight Experience Replay 2018 and Multi-Goal Reinforcement Learning 2018. The results replicate the results found in the papers and show how adding HER can allow an agent to solve problems that it otherwise would not be able to solve at all. Note that the same hyperparameters were used within each pair of agents and so the only difference between them was whether hindsight was used or not.

HER Experiment Results

3. Hierarchical Reinforcement Learning Experiments

The results on the left below show the performance of DQN and the algorithm hierarchical-DQN from Kulkarni et al. 2016 on the Long Corridor environment also explained in Kulkarni et al. 2016. The environment requires the agent to go to the end of a corridor before coming back in order to receive a larger reward. This delayed gratification and the aliasing of states makes it a somewhat impossible game for DQN to learn but if we introduce a meta-controller (as in h-DQN) which directs a lower-level controller how to behave we are able to make more progress. This aligns with the results found in the paper.

The results on the right show the performance of DDQN and algorithm Stochastic NNs for Hierarchical Reinforcement Learning (SNN-HRL) from Florensa et al. 2017. DDQN is used as the comparison because the implementation of SSN-HRL uses 2 DDQN algorithms within it. Note that the first 300 episodes of training for SNN-HRL were used for pre-training which is why there is no reward for those episodes.

Long Corridor and Four Rooms

Usage

The repository's high-level structure is:

├── agents                    
    ├── actor_critic_agents   
    ├── DQN_agents         
    ├── policy_gradient_agents
    └── stochastic_policy_search_agents 
├── environments   
├── results             
    └── data_and_graphs        
├── tests
├── utilities             
    └── data structures            

i) To watch the agents learn the above games

To watch all the different agents learn Cart Pole follow these steps:

git clone https://github.com/p-christ/Deep_RL_Implementations.git
cd Deep_RL_Implementations

conda create --name myenvname
y
conda activate myenvname

pip3 install -r requirements.txt

python results/Cart_Pole.py

For other games change the last line to one of the other files in the Results folder.

ii) To train the agents on another game

Most Open AI gym environments should work. All you would need to do is change the config.environment field (look at Results/Cart_Pole.py for an example of this).

You can also play with your own custom game if you create a separate class that inherits from gym.Env. See Environments/Four_Rooms_Environment.py for an example of a custom environment and then see the script Results/Four_Rooms.py to see how to have agents play the environment.

Owner
Petros Christodoulou
Petros Christodoulou
Confidence Propagation Cluster aims to replace NMS-based methods as a better box fusion framework in 2D/3D Object detection

CP-Cluster Confidence Propagation Cluster aims to replace NMS-based methods as a better box fusion framework in 2D/3D Object detection, Instance Segme

Yichun Shen 41 Dec 08, 2022
A library for researching neural networks compression and acceleration methods.

A library for researching neural networks compression and acceleration methods.

Intel Labs 100 Dec 29, 2022
Mixed Transformer UNet for Medical Image Segmentation

MT-UNet Update 2021/11/19 Thank you for your interest in our work. We have uploaded the code of our MTUNet to help peers conduct further research on i

dotman 92 Dec 25, 2022
The description of FMFCC-A (audio track of FMFCC) dataset and Challenge resluts.

FMFCC-A This project is the description of FMFCC-A (audio track of FMFCC) dataset and Challenge resluts. The FMFCC-A dataset is shared through BaiduCl

18 Dec 24, 2022
Pytorch implementation of PTNet for high-resolution and longitudinal infant MRI synthesis

Pyramid Transformer Net (PTNet) Project | Paper Pytorch implementation of PTNet for high-resolution and longitudinal infant MRI synthesis. PTNet: A Hi

Xuzhe Johnny Zhang 6 Jun 08, 2022
Activity tragle - Google is tracking everything, we just look at it

activity_tragle Google is tracking everything, we just look at it here. You need

BERNARD Guillaume 1 Feb 15, 2022
Predict halo masses from simulations via graph neural networks

HaloGraphNet Predict halo masses from simulations via Graph Neural Networks. Given a dark matter halo and its galaxies, creates a graph with informati

Pablo Villanueva Domingo 20 Nov 15, 2022
A facial recognition doorbell system using a Raspberry Pi

Facial Recognition Doorbell This project expands on the person-detecting doorbell system to allow it to identify faces, and announce names accordingly

rydercalmdown 22 Apr 15, 2022
OpenLT: An open-source project for long-tail classification

OpenLT: An open-source project for long-tail classification Supported Methods for Long-tailed Recognition: Cross-Entropy Loss Focal Loss (ICCV'17) Cla

Ming Li 37 Sep 15, 2022
3D detection and tracking viewer (visualization) for kitti & waymo dataset

3D detection and tracking viewer (visualization) for kitti & waymo dataset

222 Jan 08, 2023
Code for SIMMC 2.0: A Task-oriented Dialog Dataset for Immersive Multimodal Conversations

The Second Situated Interactive MultiModal Conversations (SIMMC 2.0) Challenge 2021 Welcome to the Second Situated Interactive Multimodal Conversation

Facebook Research 81 Nov 22, 2022
A testcase generation tool for Persistent Memory Programs.

PMFuzz PMFuzz is a testcase generation tool to generate high-value tests cases for PM testing tools (XFDetector, PMDebugger, PMTest and Pmemcheck) If

Systems Research at ShiftLab 14 Jul 24, 2022
CPU inference engine that delivers unprecedented performance for sparse models

The DeepSparse Engine is a CPU runtime that delivers unprecedented performance by taking advantage of natural sparsity within neural networks to reduce compute required as well as accelerate memory b

Neural Magic 1.2k Jan 09, 2023
Implementation of Shape Generation and Completion Through Point-Voxel Diffusion

Shape Generation and Completion Through Point-Voxel Diffusion Project | Paper Implementation of Shape Generation and Completion Through Point-Voxel Di

Linqi Zhou 103 Dec 29, 2022
Pose estimation for iOS and android using TensorFlow 2.0

💃 Mobile 2D Single Person (Or Your Own Object) Pose Estimation for TensorFlow 2.0 This repository is forked from edvardHua/PoseEstimationForMobile wh

tucan9389 165 Nov 16, 2022
The repo contains the code of the ACL2020 paper `Dice Loss for Data-imbalanced NLP Tasks`

Dice Loss for NLP Tasks This repository contains code for Dice Loss for Data-imbalanced NLP Tasks at ACL2020. Setup Install Package Dependencies The c

223 Dec 17, 2022
Code for technical report "An Improved Baseline for Sentence-level Relation Extraction".

RE_improved_baseline Code for technical report "An Improved Baseline for Sentence-level Relation Extraction". Requirements torch = 1.8.1 transformers

Wenxuan Zhou 74 Nov 29, 2022
Unofficial PyTorch implementation of Guided Dropout

Unofficial PyTorch implementation of Guided Dropout This is a simple implementation of Guided Dropout for research. We try to reproduce the algorithm

2 Jan 07, 2022
Interpretable-contrastive-word-mover-s-embedding

Interpretable-contrastive-word-mover-s-embedding Paper Datasets Here is a Dropbox link to the datasets used in the paper: https://www.dropbox.com/sh/n

0 Nov 02, 2021
A Deep Learning based project for creating line art portraits.

ArtLine The main aim of the project is to create amazing line art portraits. Sounds Intresting,let's get to the pictures!! Model-(Smooth) Model-(Quali

Vijish Madhavan 3.3k Jan 07, 2023