Bag of Tricks for Natural Policy Gradient Reinforcement Learning [ArXiv]

Setup

Python 3.8.0
pip install -r req.txt
MuJoCo 200 with a valid license

Main Files

main.py: main run file for model training
models.py: neural networks for policy and critic models
optim.py: second-order approximations for realizing the natural gradient
utils.py: helper functions
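
For orientation, here is a minimal sketch of one common second-order approximation: preconditioning the policy gradient with a damped diagonal estimate of the Fisher information matrix. It is an illustration only and is not taken from optim.py. The function name, the argument layout (one score vector per sample, plus one advantage per sample), and the damping value are all assumptions.

```python
from typing import List, Sequence


def diagonal_natural_gradient(
    scores: Sequence[Sequence[float]],
    advantages: Sequence[float],
    damping: float = 1e-3,
) -> List[float]:
    """Approximate F^{-1} g using only the diagonal of the empirical Fisher.

    scores[k] is the gradient of log pi(a_k | s_k) w.r.t. the parameters
    (hypothetical input layout), advantages[k] is the advantage estimate.
    """
    n = len(scores)
    dim = len(scores[0])
    # Vanilla policy gradient: mean of advantage-weighted score vectors.
    grad = [
        sum(a * s[i] for s, a in zip(scores, advantages)) / n for i in range(dim)
    ]
    # Diagonal of the empirical Fisher: mean of squared score entries.
    fisher_diag = [sum(s[i] ** 2 for s in scores) / n for i in range(dim)]
    # Damping keeps the division stable when a diagonal entry is near zero.
    return [g / (f + damping) for g, f in zip(grad, fisher_diag)]
```

Dropping the off-diagonal terms makes the inverse an elementwise division, which is cheap but ignores correlations between parameters.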

Reproducing Experiments

scripts/: bash training scripts formatted for Compute Canada/SLURM jobs
visualize/json: training hyperparameters for each experiment
visualize/csv: training results in .csv format
visualize/performance.py: view results and export them to .csv (run after training)
- best run as IPython cells in VS Code
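
The JSON and CSV files above can be inspected without the rest of the codebase. The sketch below uses only the standard library. The helper names are hypothetical, and it assumes each JSON file holds a single top-level object and each CSV file has a header row; neither is confirmed by this README.

```python
import csv
import json
from pathlib import Path
from typing import Dict, List


def load_hparams(path: str) -> Dict[str, object]:
    """Load one hyperparameter file, e.g. visualize/json/baseline.json."""
    with Path(path).open() as f:
        hparams = json.load(f)
    # Assumption: the file is a JSON object, not a list or a bare value.
    if not isinstance(hparams, dict):
        raise ValueError(f"expected a JSON object in {path}")
    return hparams


def load_results(path: str) -> List[Dict[str, str]]:
    """Load one results file from visualize/csv as a list of row dicts."""
    with Path(path).open(newline="") as f:
        # DictReader takes column names from the header row; values stay strings.
        return list(csv.DictReader(f))
```

For example, load_hparams("visualize/json/baseline.json") would return the tuned baseline hyperparameters once they have been extracted.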

Experiment Example

To run the baseline experiments:

Tune hparams: bash scripts/hparams/baseline.sh
- runs will be saved in runs/hparams_baseline/...
Extract best hparams from runs: python baseline_hparams.py
- the best hparams will be saved in visualize/json/baseline.json
Run training with hparams: bash scripts/baseline/diagonal.sh
- runs will be saved in runs/5e6_baseline/...
Run speed tests: bash scripts/speed/baseline.sh
- runs will be saved in runs/baseline_speed/...
View results: run the following IPython cell in visualize/performance.py

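# %%
import pathlib

# analyze and mean_df are assumed to be defined in earlier cells of this file
runs_path = pathlib.Path("../runs/5e6_baseline/")  # training runs
speed_runs_path = pathlib.Path("../runs/baseline_speed/")  # speed-test runs
name = "baseline"
baseline_data = analyze(runs_path, speed_runs_path)
baseline_df = mean_df(*baseline_data, name, save=True)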

Second-order Approximation References

Implementations

Other

Code formatted with Black
Experiment runs format: runs/{experiment_name}/{env_name}/{approximation}_runs/{tensorboard folder}/...
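
The run directory layout can be read back into its components with a small helper. This is a sketch under the layout stated above. RunInfo and parse_run_path are hypothetical names and are not part of this repository.

```python
from pathlib import PurePosixPath
from typing import NamedTuple


class RunInfo(NamedTuple):
    experiment_name: str
    env_name: str
    approximation: str
    tensorboard_folder: str


def parse_run_path(path: str) -> RunInfo:
    """Split runs/{experiment_name}/{env_name}/{approximation}_runs/{tensorboard folder}/..."""
    parts = PurePosixPath(path).parts
    if "runs" not in parts:
        raise ValueError(f"no 'runs' directory in {path!r}")
    start = parts.index("runs")
    # The four components directly below runs/ carry the metadata.
    fields = parts[start + 1 : start + 5]
    if len(fields) < 4 or not fields[2].endswith("_runs"):
        raise ValueError(f"unexpected run layout: {path!r}")
    experiment, env, approx_dir, tb_folder = fields
    # Strip the "_runs" suffix to recover the approximation name.
    return RunInfo(experiment, env, approx_dir[: -len("_runs")], tb_folder)
```

Anything after the tensorboard folder is ignored, so both a run directory and a file inside it parse to the same RunInfo.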

Owner

Brennan Gebotys