A collection of Reinforcement Learning algorithms from Sutton and Barto's book and other research papers implemented in Python.

Last update: Dec 28, 2022

Overview

Reinforcement-Learning-Notebooks

I wrote these notebooks in March 2017 while taking the COMP 767: Reinforcement Learning [5] class taught by Prof. Doina Precup at McGill, Montréal. I highly recommend going through the class notes and the references to all the papers the instructors have posted on the website.

These notebooks are meant to be used alongside the book, with the referenced papers taking you beyond it. I would suggest watching David Silver's videos and reading the book simultaneously, and once you are done with a few chapters, start implementing the algorithms. The algorithms follow a pattern and are mostly variants of each other. I have tried my best to explain each notebook's results and possible future directions.
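To illustrate how the algorithms are variants of one pattern, here is a minimal sketch of tabular temporal-difference control in which SARSA and Q-learning differ only in the bootstrap term of the update. It is not taken from the notebooks: the five-state chain environment, the function names (`step`, `epsilon_greedy`, `td_control`) and the hyperparameter values are all assumptions made for this illustration.

```python
import random
from collections import defaultdict

N_STATES = 5      # hypothetical chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Toy deterministic environment (an assumption, not from the notebooks)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def epsilon_greedy(Q, state, epsilon, rng):
    """Pick a random action with probability epsilon, else a greedy one."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(Q[(state, a)] for a in ACTIONS)
    return rng.choice([a for a in ACTIONS if Q[(state, a)] == best])


def td_control(method, episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Shared TD control loop; `method` is "sarsa" or "q_learning"."""
    if method not in ("sarsa", "q_learning"):
        raise ValueError(f"unknown method: {method}")
    rng = random.Random(seed)
    Q = defaultdict(float)  # action values, initialised to 0
    for _ in range(episodes):
        state = 0
        action = epsilon_greedy(Q, state, epsilon, rng)
        done = False
        while not done:
            next_state, reward, done = step(state, action)
            # Choosing the next action before the update is SARSA's ordering;
            # reusing it for Q-learning is a simplification made in this sketch.
            next_action = epsilon_greedy(Q, next_state, epsilon, rng)
            if done:
                bootstrap = 0.0  # no future value after the terminal state
            elif method == "sarsa":
                # On-policy: bootstrap from the action actually taken next.
                bootstrap = Q[(next_state, next_action)]
            else:
                # Off-policy: bootstrap from the greedy action.
                bootstrap = max(Q[(next_state, a)] for a in ACTIONS)
            # The same TD update is used by both variants.
            td_error = reward + gamma * bootstrap - Q[(state, action)]
            Q[(state, action)] += alpha * td_error
            state, action = next_state, next_action
    return Q


if __name__ == "__main__":
    for method in ("sarsa", "q_learning"):
        Q = td_control(method)
        policy = [max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES - 1)]
        print(method, policy)
```

Everything outside the `bootstrap` line is shared by the two methods, which is the sense in which one algorithm can be read as a variant of the other.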

Disclaimer: The code is a little messy. I wrote it when I was not yet a Pythonista. If you would like to clean up the notebooks and turn them into a nice interface, feel free to contact me; I will be very pleased to collaborate. If you use them, please cite the source and mention the credits listed below. Also, email me with ways to improve them, and let me know if you find any bugs.

Feel free to reach me at [email protected] or see my website here

Special Credits:

[1] Denny Britz

[2] Monica Patel

[3] Sutton and Barto

[4] David Silver

[5] Doina Precup's course

Owner

Pulkit Khandelwal
