Bag of Tricks for Natural Policy Gradient Reinforcement Learning

Overview

Bag of Tricks for Natural Policy Gradient Reinforcement Learning [ArXiv]

Setup

  • Python 3.8.0
  • pip install -r req.txt
  • Mujoco 200 license

Main Files

  • main.py: main run file for model training
  • models.py: neural networks for policy and critic models
  • optim.py: second-order approximations for realizing the natural gradient
  • utils.py: helper functions

Reproducing Experiments

  • scripts/: bash training scripts formatted for compute canada/SLURM jobs
  • visualize/json: training hyperparameters for each experiment
  • visualize/csv: training results in .csv format
  • visualize/performance.py: (after training) view results & create .csv results
    • best to run with VSCode ipython cells

Experiment Example

To run the baseline experiments:

  • Tune hparams: bash scripts/hparams/baseline.sh
    • runs will be saved in runs/hparams_baseline/...
  • Extract best hparams from runs: python baseline_hparams.py
    • the best hparams will be saved in visualize/json/baseline.json
  • Run training with hparams: bash scripts/baseline/diagonal.sh
    • runs will be saved in runs/5e6_baseline/...
  • Run speed tests: bash scripts/speed/baseline.sh
    • runs will be saved in runs/baseline_speed/...
  • View results: run interactive ipython in visualize/performance.py
# %%
runs_path = pathlib.Path("../runs/5e6_baseline/")
speed_runs_path = pathlib.Path("../runs/baseline_speed/")
name = "baseline"
baseline_data = analyze(runs_path, speed_runs_path)
baseline_df = mean_df(*baseline_data, name, save=True)

Second-order Approximation References

Implementations

Other

  • Code formatted with Black
  • Experiment runs format: runs/{experiment_name}/{env_name}/{approximation}_runs/{tensorboard folder}/...
Owner
Brennan Gebotys
Brennan Gebotys
AOT-GAN for High-Resolution Image Inpainting (codebase for image inpainting)

AOT-GAN for High-Resolution Image Inpainting Arxiv Paper | AOT-GAN: Aggregated Contextual Transformations for High-Resolution Image Inpainting Yanhong

Multimedia Research 214 Jan 03, 2023
Pairwise Learning for Neural Link Prediction for OGB (PLNLP-OGB)

Pairwise Learning for Neural Link Prediction for OGB (PLNLP-OGB) This repository provides evaluation codes of PLNLP for OGB link property prediction t

Zhitao WANG 31 Oct 10, 2022
Official Code for AdvRush: Searching for Adversarially Robust Neural Architectures (ICCV '21)

AdvRush Official Code for AdvRush: Searching for Adversarially Robust Neural Architectures (ICCV '21) Environmental Set-up Python == 3.6.12, PyTorch =

11 Dec 10, 2022
EM-POSE 3D Human Pose Estimation from Sparse Electromagnetic Trackers.

EM-POSE: 3D Human Pose Estimation from Sparse Electromagnetic Trackers This repository contains the code to our paper published at ICCV 2021. For ques

Facebook Research 62 Dec 14, 2022
code from "Tensor decomposition of higher-order correlations by nonlinear Hebbian plasticity"

Code associated with the paper "Tensor decomposition of higher-order correlations by nonlinear Hebbian learning," Ocker & Buice, Neurips 2021. "plot_f

Gabriel Koch Ocker 4 Oct 16, 2022
Spectrum is an AI that uses machine learning to generate Rap song lyrics

Spectrum Spectrum is an AI that uses deep learning to generate rap song lyrics. View Demo Report Bug Request Feature Open In Colab About The Project S

39 Dec 16, 2022
Unofficial implementation (replicates paper results!) of MINER: Multiscale Implicit Neural Representations in pytorch-lightning

MINER_pl Unofficial implementation of MINER: Multiscale Implicit Neural Representations in pytorch-lightning. 📖 Ref readings Laplacian pyramid explan

AI葵 51 Nov 28, 2022
PyTorch implementation of ENet

PyTorch-ENet PyTorch (v1.1.0) implementation of ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation, ported from the lua-torc

David Silva 333 Dec 29, 2022
PyGCL: Graph Contrastive Learning Library for PyTorch

PyGCL: Graph Contrastive Learning for PyTorch PyGCL is an open-source library for graph contrastive learning (GCL), which features modularized GCL com

GCL: Graph Contrastive Learning Library for PyTorch 594 Jan 08, 2023
lightweight python wrapper for vowpal wabbit

vowpal_porpoise Lightweight python wrapper for vowpal_wabbit. Why: Scalable, blazingly fast machine learning. Install Install vowpal_wabbit. Clone and

Joseph Reisinger 163 Nov 24, 2022
The implementation of the CVPR2021 paper "Structure-Aware Face Clustering on a Large-Scale Graph with 10^7 Nodes"

STAR-FC This code is the implementation for the CVPR 2021 paper "Structure-Aware Face Clustering on a Large-Scale Graph with 10^7 Nodes" 🌟 🌟 . 🎓 Re

Shuai Shen 87 Dec 28, 2022
Python implementation of the multistate Bennett acceptance ratio (MBAR)

pymbar Python implementation of the multistate Bennett acceptance ratio (MBAR) method for estimating expectations and free energy differences from equ

Chodera lab // Memorial Sloan Kettering Cancer Center 169 Dec 02, 2022
Binary Stochastic Neurons in PyTorch

Binary Stochastic Neurons in PyTorch http://r2rt.com/binary-stochastic-neurons-in-tensorflow.html https://github.com/pytorch/examples/tree/master/mnis

Onur Kaplan 54 Nov 21, 2022
This repo contains implementation of different architectures for emotion recognition in conversations.

Emotion Recognition in Conversations Updates 🔥 🔥 🔥 Date Announcements 03/08/2021 🎆 🎆 We have released a new dataset M2H2: A Multimodal Multiparty

Deep Cognition and Language Research (DeCLaRe) Lab 1k Dec 30, 2022
A dataset for online Arabic calligraphy

Calliar Calliar is a dataset for Arabic calligraphy. The dataset consists of 2500 json files that contain strokes manually annotated for Arabic callig

ARBML 114 Dec 28, 2022
PyTorch implementation(s) of various ResNet models from Twitch streams.

pytorch-resnet-twitch PyTorch implementation(s) of various ResNet models from Twitch streams. Status: ResNet50 currently not working. Will update in n

Daniel Bourke 3 Jan 11, 2022
Code implementation for the paper 'Conditional Gaussian PAC-Bayes'.

CondGauss This repository contains PyTorch code for the paper Stochastic Gaussian PAC-Bayes. A novel PAC-Bayesian training method is implemented. Ther

0 Nov 01, 2021
Simple tool to combine(merge) onnx models. Simple Network Combine Tool for ONNX.

snc4onnx Simple tool to combine(merge) onnx models. Simple Network Combine Tool for ONNX. https://github.com/PINTO0309/simple-onnx-processing-tools 1.

Katsuya Hyodo 8 Oct 13, 2022
TensorFlow implementation of ENet

TensorFlow-ENet TensorFlow implementation of ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation. This model was tested on th

Kwotsin 255 Oct 17, 2022