Policy Gradient Algorithms (One Step Actor Critic & PPO) from scratch using Numpy

Overview

Policy Gradient Algorithms From Scratch (NumPy)

This repository showcases two policy gradient algorithms (One Step Actor Critic and Proximal Policy Optimization) applied to two MDPs. The algorithms are implemented from scratch with Numpy and utilize linear regression for the value function and single layer Softmax for the policy. The MDPs are: Gridworld and Mountain Car.

Run Instructions

Packages:

numpy and matplotlib

Create virtual environment, install requirements and run: (windows instructions)

  1. Run python -m venv venv
  2. Run .\venv\Scripts\activate (windows)
  3. Run pip install -r requirements.txt
  4. Run python .\experiments.py be wary of long compute times and plots that will pop up and must be exited in order to comtinue.

Some Sample Plots

Files

  • experiments.py - Runs pre programmed experiments that output various plots both in the terminal and saved to .png files.
  • mdp.py - Contains two MDP domains: Gridworld and Mountain Car, that the experiments are run on.
  • models.py - Contains ValueFunction and Policy which are the two models used (linear layers) for function approximation by the algorithms.
  • policy_gradient_algorithms.py - Contains the policy gradient algorithms One Step Actor Critic and Proximal Policy Optimization (PPO).

MIT License

Planning Algorithms in AI and Robotics. MSc course at Skoltech Data Science program

Planning Algorithms in AI and Robotics course T2 2021-22 The Planning Algorithms in AI and Robotics course at Skoltech, MS in Data Science, during T2,

Mobile Robotics Lab. at Skoltech 6 Sep 21, 2022
This is a Python implementation of the HMRF algorithm on networks with categorial variables.

Salad Salad is an Open Source Python library to segment tissues into different biologically relevant regions based on Hidden Markov Random Fields. The

1 Nov 16, 2021
Genetic algorithm which evolves aoe2 DE ai scripts

AlphaScripter Use the power of genetic algorithms to evolve AI scripts for Age of Empires II : Definitive Edition. For now this package runs in AOC Us

6 Nov 04, 2022
SortingAlgorithmVisualization - A place for me to learn about sorting algorithms

SortingAlgorithmVisualization A place for me to learn about sorting algorithms.

1 Jan 15, 2022
🧬 Performant Evolutionary Algorithms For Python with Ray support

🧬 Performant Evolutionary Algorithms For Python with Ray support

Nathan 49 Oct 20, 2022
A selection of a few algorithms used to sort or search an array

Sort and search algorithms This repository has some common search / sort algorithms written in python, I also included the pseudocode of each algorith

0 Apr 02, 2022
RRT algorithm and its optimization

RRT-Algorithm-Visualisation This is a project that aims to develop upon the RRT

Sarannya Bhattacharya 7 Mar 06, 2022
Optimal skincare partition finder using graph theory

Pigment The problem of partitioning up a skincare regime into parts such that each part does not interfere with itself is equivalent to the minimal cl

Jason Nguyen 1 Nov 22, 2021
The DarkRift2 networking framework written in Python 3

DarkRiftPy is Darkrift2 written in Python 3. The implementation is fully compatible with the original version. So you can write a client side on Python that connects to a Darkrift2 server written in

Anton Dobryakov 6 May 23, 2022
A Python implementation of Jerome Friedman's Multivariate Adaptive Regression Splines

py-earth A Python implementation of Jerome Friedman's Multivariate Adaptive Regression Splines algorithm, in the style of scikit-learn. The py-earth p

431 Dec 15, 2022
This repository is an individual project made at BME with the topic of self-driving car simulator and control algorithm.

BME individual project - NEAT based self-driving car This repository is an individual project made at BME with the topic of self-driving car simulator

NGO ANH TUAN 1 Dec 13, 2021
Given a list of tickers, this algorithm generates a recommended portfolio for high-risk investors.

RiskyPortfolioGenerator Given a list of tickers, this algorithm generates a recommended portfolio for high-risk investors. Working in a group, we crea

Victoria Zhao 2 Jan 13, 2022
Genius Square puzzle solver in Python

Genius Square puzzle solver in Python

James 3 Dec 15, 2022
Policy Gradient Algorithms (One Step Actor Critic & PPO) from scratch using Numpy

Policy Gradient Algorithms From Scratch (NumPy) This repository showcases two policy gradient algorithms (One Step Actor Critic and Proximal Policy Op

1 Jan 17, 2022
There are some basic arithmatic in Pattern Recognization and Machine Learning writed in Python in this repository

There are some basic arithmatic in Pattern Recognization and Machine Learning writed in Python in this repository

1 Nov 19, 2021
This repository explores an implementation of Grover's Algorithm for knights on a chessboard.

Grover Knights Welcome to my Knights project! Project Description: I explore an implementation of a quantum oracle for knights on a chessboard.

Will Sun 8 Feb 22, 2022
Xor encryption and decryption algorithm

Folosire: Pentru encriptare: python encrypt.py parola fișier pentru criptare fișier encriptat(de tip binar) Pentru decriptare: python decrypt.p

2 Dec 05, 2021
Implements (high-dimenstional) clustering algorithm

Description Implements (high-dimenstional) clustering algorithm described in https://arxiv.org/pdf/1804.02624.pdf Dependencies python3 pytorch (=0.4)

Eric Elmoznino 5 Dec 27, 2022
Python implementation of Aho-Corasick algorithm for string searching

Python implementation of Aho-Corasick algorithm for string searching

Daniel O'Sullivan 1 Dec 31, 2021
Implementation of an ordered dithering algorithm used in computer graphics

Ordered Dithering Project In this project, we use an ordered dithering method to turn an RGB image, first to a gray scale image and then, turn the gra

1 Oct 26, 2021