Code for the paper "Functional Regularization for Reinforcement Learning via Learned Fourier Features"

Overview

Reinforcement Learning with Learned Fourier Features

State-space Soft Actor-Critic Experiments

Move to the state-SAC-LFF repository.

cd state-SAC-LFF

To install the dependencies, use the provided environment.yml file:

conda env create -f environment.yml

To run an experiment, use the following templates for the MLP and LFF experiments, respectively:

python main.py --policy PytorchSAC --env dm.quadruped.run --start_timesteps 5000 --hidden_dim 1024 --batch_size 1024 --n_hidden 3
python main.py --policy PytorchSAC --env dm.quadruped.run --start_timesteps 5000 --hidden_dim 1024 --batch_size 1024 --n_hidden 2 \
               --network_class FourierMLP --sigma 0.001 --fourier_dim 1024 --train_B --concatenate_fourier

The only differences between the baseline (MLP) command and the LFF command are the number of hidden layers (n_hidden, reduced by 1 to keep the parameter count roughly the same) and the Fourier-specific flags: network_class, fourier_dim, sigma, train_B, and concatenate_fourier.
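
As a rough illustration of what the Fourier-specific flags might control, here is a minimal pure-Python sketch. It is not the repository's FourierMLP. It assumes that sigma is the standard deviation used to initialize a projection matrix B, that fourier_dim is the number of rows of B, and that concatenate_fourier appends the raw input to the sine and cosine features. The function names init_fourier_matrix and fourier_features are hypothetical. With train_B, B would presumably be updated by gradient descent along with the other weights, which this sketch does not implement.

```python
import math
import random


def init_fourier_matrix(input_dim, fourier_dim, sigma, seed=0):
    """Draw B (fourier_dim x input_dim) with entries from N(0, sigma^2).

    Assumption: sigma sets the initial scale of B; a smaller sigma gives
    lower-frequency (smoother) features at initialization.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, sigma) for _ in range(input_dim)]
            for _ in range(fourier_dim)]


def fourier_features(x, B, concatenate=True):
    """Map x to [sin(Bx), cos(Bx)], optionally followed by x itself."""
    # One projection per row of B.
    proj = [sum(b * xi for b, xi in zip(row, x)) for row in B]
    feats = [math.sin(p) for p in proj] + [math.cos(p) for p in proj]
    # Assumption: concatenate mirrors the concatenate_fourier flag.
    return feats + list(x) if concatenate else feats


x = [0.5, -1.0, 2.0]
B = init_fourier_matrix(input_dim=3, fourier_dim=4, sigma=0.001)
phi = fourier_features(x, B)
print(len(phi))  # 2 * fourier_dim + input_dim = 11
```

Under these assumptions, the feature vector has 2 * fourier_dim + input_dim entries when the input is concatenated and 2 * fourier_dim entries otherwise, and this vector is what the remaining hidden layers would consume.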

Image-space Soft Actor-Critic Experiments

Move to the image-SAC-LFF repository.

cd image-SAC-LFF

Install RAD dependencies:

conda env create -f conda_env.yml

To run an experiment, use the following templates for the CNN+LFF (fourier_pixel encoder) and CNN (fair_pixel encoder) experiments, respectively:

python train.py --domain_name hopper --task_name hop --encoder_type fourier_pixel --action_repeat 4 \
                --num_eval_episodes 10 --pre_transform_image_size 100 --image_size 84 --agent rad_sac \
                --frame_stack 3 --data_augs crop --critic_lr 1e-3 --actor_lr 1e-3 --eval_freq 10000 --batch_size 128 \
                --num_train_steps 1000000 --fourier_dim 128 --sigma 0.1 --train_B --concatenate_fourier
python train.py --domain_name hopper --task_name hop --encoder_type fair_pixel --action_repeat 4 \
                --num_eval_episodes 10 --pre_transform_image_size 100 --image_size 84 --agent rad_sac \
                --frame_stack 3 --data_augs crop --critic_lr 1e-3 --actor_lr 1e-3 --eval_freq 10000 --batch_size 128 \
                --num_train_steps 1000000

Proximal Policy Optimization Experiments

Move to the state-PPO-LFF repository.

cd pytorch-a2c-ppo-acktr-gail

Install PPO dependencies:

conda env create -f environment.yml

To run an experiment, use the following templates for the MLP and LFF experiments, respectively:

python main.py --env-name Hopper-v2 --algo ppo --use-gae --log-interval 1 --num-steps 2048 --num-processes 1 \
               --lr 3e-4 --entropy-coef 0 --value-loss-coef 0.5 --ppo-epoch 10 --num-mini-batch 32 --gamma 0.99 \
               --gae-lambda 0.95 --num-env-steps 1000000 --use-linear-lr-decay --use-proper-time-limits \
               --hidden_dim 256 --network_class MLP --n_hidden 2 --seed 10
python main.py --env-name Hopper-v2 --algo ppo --use-gae --log-interval 1 --num-steps 2048 --num-processes 1 \
               --lr 3e-4 --entropy-coef 0 --value-loss-coef 0.5 --ppo-epoch 10 --num-mini-batch 32 --gamma 0.99 \
               --gae-lambda 0.95 --num-env-steps 1000000 --use-linear-lr-decay --use-proper-time-limits \
               --hidden_dim 256 --network_class FourierMLP --n_hidden 2 --sigma 0.01 --fourier_dim 64 \
               --concatenate_fourier --train_B --seed 10

Acknowledgements

We built the state-based SAC codebase off the TD3 repo by Fujimoto et al. We especially appreciated its lightweight bare-bones training loop. For the state-based SAC algorithm implementation and hyperparameters, we used this PyTorch SAC repo by Yarats and Kostrikov. For the SAC+RAD image-based experiments, we used the authors' implementation. Finally, we built off this PPO codebase by Ilya Kostrikov.

Owner
Alex Li
PhD student in machine learning at Carnegie Mellon University. Prev: undergrad at UC Berkeley.