RL Algorithms with examples in Python / PyTorch / Unity ML-Agents

Overview

Reinforcement Learning Project

This project was created to make it easier to get started with Reinforcement Learning.

Getting Started

Install Basic Dependencies

To set up your Python environment to run the code in the notebooks, follow the instructions below.

  • If you're on Windows, I recommend installing Miniforge, a minimal installer for Conda. I also recommend using the Mamba package manager instead of Conda: it works almost the same way, only faster. There's a cheatsheet of Conda commands, which also work in Mamba. To install Mamba, use this command:
conda install mamba -n base -c conda-forge
  • Create (and activate) a new environment with Python 3.6 or later. I recommend using Python 3.9:

    • Linux or Mac:
    mamba create --name rl39 python=3.9 numpy
    source activate rl39
    • Windows:
    mamba create --name rl39 python=3.9 numpy
    activate rl39
  • Install PyTorch by following the instructions on pytorch.org. For example, to install PyTorch on Windows with GPU support, use this command:

mamba install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch
  • Install additional packages:
mamba install jupyter notebook matplotlib
python -m ipykernel install --user --name rl39 --display-name "rl39"
  • Change the kernel to match the rl39 environment by using the drop-down menu Kernel -> Change kernel inside Jupyter Notebook.
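
Once the environment is set up, a quick sanity check can confirm that the packages from the steps above are visible to the interpreter. The following is a minimal sketch, not part of the project: the function names and the package list are assumptions derived from the install commands above. It can be run in a notebook cell that uses the rl39 kernel.

```python
import importlib.util
import sys

# Package names are assumptions based on the install commands above.
REQUIRED = ("numpy", "torch", "matplotlib", "notebook", "ipykernel")


def check_environment(packages=REQUIRED):
    """Return a dict mapping each check to True (OK) or False (missing)."""
    report = {"python>=3.6": sys.version_info >= (3, 6)}
    for name in packages:
        # find_spec looks a package up without importing it.
        report[name] = importlib.util.find_spec(name) is not None
    return report


def cuda_available():
    """Return True/False if PyTorch is installed, or None if it is not."""
    if importlib.util.find_spec("torch") is None:
        return None
    import torch  # imported lazily so the check also works without PyTorch
    return torch.cuda.is_available()


if __name__ == "__main__":
    for check, ok in check_environment().items():
        print(f"{check}: {'OK' if ok else 'MISSING'}")
    print("CUDA available:", cuda_available())
```

If a package is reported as missing, the most likely cause is that the notebook is running on a different kernel than rl39.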

Install Unity Machine Learning Agents

Note: In order to run the notebooks on Windows, it's not necessary to install the Unity Editor, because I have provided the standalone executables of the environments for you.

Unity ML Agents is the software that we use for the environments. The agents that we create in Python can interact with these environments. Unity ML Agents consists of several parts:

  • The Unity Editor is used for creating environments. To install:

    • Install Unity Hub.
    • Install the latest version of Unity by clicking the green "Unity Hub" button on the download page.

    To start the Unity editor you must first have a project:

    • Start the Unity Hub.
    • Click on "Projects"
    • Create a new dummy project.
    • Click on the project you've just added in the Unity Hub. The Unity Editor should start now.
  • The Unity ML-Agents Toolkit. Download the latest release of the source code or use the Git command: git clone --branch release_18 https://github.com/Unity-Technologies/ml-agents.git

  • The Unity ML Agents package is used inside the Unity Editor. Please read the instructions for installation.

  • The mlagents Python package is used as a bridge between Python and the Unity editor (or standalone executable). To install, use this command: python -m pip install mlagents==0.27.0. Please note that there's no conda package available for this.
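
To illustrate how the Python package acts as a bridge, here is a minimal sketch of an agent that takes random actions for one episode. It assumes the low-level mlagents_envs API that is installed together with mlagents 0.27.0, and an environment with a single agent. The function names and the default executable path are hypothetical and not taken from the notebooks.

```python
def run_random_episode(env, max_steps=1000):
    """Step `env` with random actions until its (single) agent terminates.

    `env` is expected to behave like mlagents_envs' UnityEnvironment.
    Returns the sum of the rewards that were observed.
    """
    env.reset()
    # Assumption: the environment exposes exactly one behavior.
    behavior_name = list(env.behavior_specs)[0]
    spec = env.behavior_specs[behavior_name]
    total_reward = 0.0
    for _ in range(max_steps):
        decision_steps, terminal_steps = env.get_steps(behavior_name)
        if len(terminal_steps) > 0:
            # The episode has ended; count the final reward and stop.
            total_reward += float(terminal_steps.reward[0])
            break
        if len(decision_steps) > 0:
            total_reward += float(decision_steps.reward[0])
        # One random action for every agent that requests a decision.
        action = spec.action_spec.random_action(len(decision_steps))
        env.set_actions(behavior_name, action)
        env.step()
    return total_reward


def main(file_name="3DBall/3DBall"):
    """Open a standalone executable (hypothetical path) and run one episode."""
    # Imported here so that the helper above has no hard dependency.
    from mlagents_envs.environment import UnityEnvironment

    env = UnityEnvironment(file_name=file_name)
    try:
        print("Total reward:", run_random_episode(env))
    finally:
        env.close()
```

Calling main() with the path to one of the standalone executables should open the environment window and print the total reward of a single random episode.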

Install an IDE for Python

For Windows, I would recommend using PyCharm (my choice), or Visual Studio Code. Inside those IDEs you can use the Conda environment you have just created.

Creating a custom Unity executable

Load the examples project

The Unity ML-Agents Toolkit contains several example environments. Here we will load them all inside the Unity editor:

  • Start the Unity Hub.
  • Click on "Projects"
  • Add a project by navigating to the Project folder inside the toolkit.
  • Click on the project you've just added in the Unity Hub. The Unity Editor should start now.

Create a 3D Ball executable

The 3D Ball example contains 12 environments in one, but this doesn't work very well in the Python API. The main problem is that there's no way to reset each environment individually. Therefore, we will remove the other 11 environments in the editor:

  • Load the 3D Ball scene by going to the Project window and navigating to Examples -> 3DBall -> Scenes -> 3DBall
  • In the Hierarchy window select the other 11 3DBall objects and delete them, so that only the 3DBall object remains.

Next, we will build the executable:

  • Go to File -> Build Settings
  • In the Build Settings window, click Build
  • Navigate to the notebooks folder and add 3DBall to the name of the folder that is used for the build.

Instructions for running the notebooks

  1. Download the Unity executables for Windows. If you're not on Windows, you have to build the executables yourself by following the instructions above.
  2. Place the Unity executable folders in the same folder as the notebooks.
  3. Load a notebook with Jupyter Notebook. (The command to start it is jupyter notebook.)
  4. Follow further instructions in the notebook.
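
Before opening a notebook, it can help to verify step 2, i.e. that the executable folders really sit next to the notebooks. The helper below is a small sketch using only the standard library. The function name is hypothetical, and 3DBall is the only folder name known from the build instructions above; any other names would have to be added by hand.

```python
from pathlib import Path


def missing_builds(notebook_dir=".", expected=("3DBall",)):
    """Return the expected executable folders that are absent from notebook_dir."""
    root = Path(notebook_dir)
    return [name for name in expected if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_builds()
    if missing:
        print("Missing executable folders:", ", ".join(missing))
    else:
        print("All expected executable folders were found.")
```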

Releases(v1.0.0)
Owner
Rogier Wachters