Policy Gradient Algorithms (One Step Actor Critic & PPO) from scratch using Numpy

Overview

Policy Gradient Algorithms From Scratch (NumPy)

This repository showcases two policy gradient algorithms (One Step Actor Critic and Proximal Policy Optimization) applied to two MDPs. The algorithms are implemented from scratch with Numpy and utilize linear regression for the value function and single layer Softmax for the policy. The MDPs are: Gridworld and Mountain Car.

Run Instructions

Packages:

numpy and matplotlib

Create virtual environment, install requirements and run: (windows instructions)

  1. Run python -m venv venv
  2. Run .\venv\Scripts\activate (windows)
  3. Run pip install -r requirements.txt
  4. Run python .\experiments.py be wary of long compute times and plots that will pop up and must be exited in order to comtinue.

Some Sample Plots

Files

  • experiments.py - Runs pre programmed experiments that output various plots both in the terminal and saved to .png files.
  • mdp.py - Contains two MDP domains: Gridworld and Mountain Car, that the experiments are run on.
  • models.py - Contains ValueFunction and Policy which are the two models used (linear layers) for function approximation by the algorithms.
  • policy_gradient_algorithms.py - Contains the policy gradient algorithms One Step Actor Critic and Proximal Policy Optimization (PPO).

MIT License

Wordle-solver - A program that solves a Wordle using a simple algorithm

Wordle Solver A program that solves a Wordle using a simple algorithm. To see it

Luc Bouchard 3 Feb 13, 2022
Repository for data structure and algorithms in Python for coding interviews

Python Data Structures and Algorithms This repository contains questions requiring implementation of data structures and algorithms concepts. It is us

Prabhu Pant 1.9k Jan 01, 2023
Our implementation of Gillespie's Stochastic Simulation Algorithm (SSA)

SSA Our implementation of Gillespie's Stochastic Simulation Algorithm (SSA) Requirements python =3.7 numpy pandas matplotlib pyyaml Command line usag

Anoop Lab 1 Jan 27, 2022
Classic algorithms including Fizz Buzz, Bubble Sort, the Fibonacci Sequence, a Sudoku solver, and more.

Algorithms Classic algorithms including Fizz Buzz, Bubble Sort, the Fibonacci Sequence, a Sudoku solver, and more. Algorithm Complexity Time and Space

1 Jan 14, 2022
This is a Python implementation of the HMRF algorithm on networks with categorial variables.

Salad Salad is an Open Source Python library to segment tissues into different biologically relevant regions based on Hidden Markov Random Fields. The

1 Nov 16, 2021
My own Unicode compression algorithm

Zee Code ZCode is a custom compression algorithm I originally developed for a competition held for the Spring 2019 Datastructures and Algorithms cours

Vahid Zehtab 2 Oct 20, 2021
A fast, pure python implementation of the MuyGPs Gaussian process realization and training algorithm.

Fast implementation of the MuyGPs Gaussian process hyperparameter estimation algorithm MuyGPs is a GP estimation method that affords fast hyperparamet

Lawrence Livermore National Laboratory 13 Dec 02, 2022
A genetic algorithm written in Python for educational purposes.

Genea: A Genetic Algorithm in Python Genea is a Genetic Algorithm written in Python, for educational purposes. I started writing it for fun, while lea

Dom De Felice 20 Jul 06, 2022
A command line tool for memorizing algorithms in Python by typing them.

Algo Drills A command line tool for memorizing algorithms in Python by typing them. In alpha and things will change. How it works Type out an algorith

Travis Jungroth 43 Dec 02, 2022
A collection of Python Scripts made for fun, while exploring Python 🐍

JFF-Python-Scripts A collection of Python Scripts made for fun, while exploring Python 🐍 Inspiration 💡 Many of the programs in this repository are i

Pushkar Patel 16 Oct 07, 2022
PICO is an algorithm for exploiting Reinforcement Learning (RL) on Multi-agent Path Finding tasks.

PICO is an algorithm for exploiting Reinforcement Learning (RL) on Multi-agent Path Finding tasks. It is developed by the Multi-Agent Artificial Intel

21 Dec 20, 2022
RRT algorithm and its optimization

RRT-Algorithm-Visualisation This is a project that aims to develop upon the RRT

Sarannya Bhattacharya 7 Mar 06, 2022
Python Client for Algorithmia Algorithms and Data API

Algorithmia Common Library (python) Python client library for accessing the Algorithmia API For API documentation, see the PythonDocs Algorithm Develo

Algorithmia 138 Oct 26, 2022
Benchmark for Robustness Tests of Control Alrogithms

A gym-like classical control benchmark for evaluating the robustnesses of control and reinforcement learning algorithms.

Kim Taekyung 4 Jan 18, 2022
Repository for Comparison based sorting algorithms in python

Repository for Comparison based sorting algorithms in python. This was implemented for project one submission for ITCS 6114 Data Structures and Algorithms under the guidance of Dr. Dewan at the Unive

Devashri Khagesh Gadgil 1 Dec 20, 2021
Algorithms written in different programming languages

Data Structures and Algorithms Clean example implementations of data structures and algorithms written in different languages. List of implementations

Zoran Pandovski 1.3k Jan 03, 2023
frePPLe - open source supply chain planning

frePPLe Open source supply chain planning FrePPLe is an easy-to-use and easy-to-implement open source advanced planning and scheduling tool for manufa

frePPLe 385 Jan 06, 2023
Optimal skincare partition finder using graph theory

Pigment The problem of partitioning up a skincare regime into parts such that each part does not interfere with itself is equivalent to the minimal cl

Jason Nguyen 1 Nov 22, 2021
Silver Trading Algorithm

Silver Trading Algorithm This project was done in the context of the Applied Algorithm Trading Course (FINM 35910) at the University of Chicago. Motiv

Laurent Lanteigne 1 Jan 29, 2022
Algorithm for Cutting Stock Problem using Google OR-Tools. Link to the tool:

Cutting Stock Problem Cutting Stock Problem (CSP) deals with planning the cutting of items (rods / sheets) from given stock items (which are usually o

Emad Ehsan 87 Dec 31, 2022