A working implementation of the Categorical DQN (Distributional RL).

Last update: Sep 20, 2022

Overview

Categorical DQN.

Implementation of the Categorical DQN as described in A Distributional Perspective on Reinforcement Learning.
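For readers new to the method, the core of Categorical DQN is the projection of the Bellman-updated return distribution back onto a fixed set of atoms. The block below is a minimal pure-Python sketch of that step for a single transition. It is not taken from this repository: the function name `project_distribution`, its parameters and the default values for `gamma`, `v_min` and `v_max` are all assumptions made for illustration.

```python
import math


def project_distribution(probs, reward, done, gamma=0.99, v_min=-10.0, v_max=10.0):
    """Project the Bellman-updated distribution onto the fixed support.

    probs  : probabilities of the next-state distribution, one per atom
    reward : scalar reward of the transition
    done   : True if the transition ended the episode
    (hypothetical helper, names and defaults are illustrative only)
    """
    n_atoms = len(probs)
    delta_z = (v_max - v_min) / (n_atoms - 1)
    support = [v_min + i * delta_z for i in range(n_atoms)]
    projected = [0.0] * n_atoms

    for p, z in zip(probs, support):
        # Shift and shrink the atom, then clip it to the support range.
        tz = reward if done else reward + gamma * z
        tz = min(v_max, max(v_min, tz))

        # Fractional index of the updated atom on the fixed support.
        b = (tz - v_min) / delta_z
        lower = min(int(math.floor(b)), n_atoms - 1)
        upper = min(lower + 1, n_atoms - 1)
        frac = min(max(b - lower, 0.0), 1.0)

        # Split the probability mass between the two neighbouring atoms.
        projected[lower] += p * (1.0 - frac)
        projected[upper] += p * frac

    return projected


# Example: a terminal transition collapses all mass around the reward.
target = project_distribution([0.2] * 5, reward=0.25, done=True, v_min=-1.0, v_max=1.0)
```

In a full agent this projected distribution would serve as the target in a cross-entropy loss against the predicted distribution for the action taken. A batched tensor version would normally replace the Python loop.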

Thanks to @tudor-berariu for optimisation and training tricks and for catching two nasty bugs.

Dependencies

See the env export file for the full list of dependencies.

Install the game of Catch:

git clone https://github.com/floringogianu/gym_fast_envs
cd gym_fast_envs

pip install -r requirements.txt
pip install -e .

Install visdom for reporting: pip install visdom.

Training

First start the visdom server: python -m visdom.server. If you don't want to install or use visdom, make sure you deactivate the display_plots option in the configs.

Train the Categorical DQN with python main.py -cf configs/catch_categorical.yaml.

Train a DQN baseline with python main.py -cf configs/catch_dqn.yaml.

To Do

Migrate to PyTorch 0.2.0. Breaks compatibility with 0.1.12.
Add some training curves.
Run on Atari.
Add proper evaluation.

Results

The first row uses a batch size of 64, the second a batch size of 32. More seeds will be run and averaged for a better comparison. Atari results are in progress.


Owner

Florin Gogianu
