Safe Policy Optimization with Local Features

Overview

Safe Policy Optimization with Local Feature (SPO-LF)

This is the source-code for implementing the algorithms in the paper "Safe Policy Optimization with Local Generalized Linear Function Approximations" which was presented in NeurIPS-21.

Installation

There is requirements.txt in this repository. Except for the common modules (e.g., numpy, scipy), our source code depends on the following modules.

We also provide Dockerfile in this repository, which can be used for reproducing our grid-world experiment.

Simulation configuration

We manage the simulation configuration using hydra. Configurations are listed in config.yaml. For example, the algorithm to run should be chosen from the ones we implemented:

sim_type: {safe_glm, unsafe_glm, random, oracle, safe_gp_state, safe_gp_feature, safe_glm_stepwise}

Grid World Experiment

The source code necessary for our grid-world experiment is contained in /grid_world folder. To run the simulation, for example, use the following commands.

cd grid_world
python main.py sim_type=safe_glm env.reuse_env=False

For the monte carlo simulation while comparing our proposed method with baselines, use the shell file, run.sh.

We also provide a script for visualization. If you want to render how the agent behaves, use the following command.

python main.py sim_type=safe_glm env.reuse_env=True

Safety-Gym Experiment

The source code necessary for our safety-gym experiment is contained in /safety_gym_discrete folder. Our experiment is based on safety-gym. Our proposed method utilize dynamic programming algorithms to solve Bellman Equation, so we modified engine.py to discrtize the environment. We attach modified safety-gym source code in /safety_gym_discrete/engine.py. To use the modified library, please clone safety-gym, then replace safety-gym/safety_gym/envs/engine.py using /safety_gym_discrete/engine.py in our repo. Using the following commands to install the modified library:

cd safety_gym
pip install -e .

Note that MuJoCo licence is needed for installing Safety-Gym. To run the simulation, use the folowing commands.

cd safety_gym_discrete
python main.py sim_idx=0

We compare our proposed method with three notable baselines: CPO, PPO-Lagrangian, and TRPO-Lagrangian. The baseline implementation depends on safety-starter-agents. We modified run_agent.py in the repo source code.

To run the baseline, use the folowing commands.

cd safety_gym_discrete/baseline
python baseline_run.py sim_type=cpo

The environment that agent runs on is generated using generate_env.py. We provide 10 50*50 environments. If you want to generate other environments, you can change the world shape in safety_gym_discrete.py, and running the following commands:

cd safety_gym_discrete
python generate_env.py

Citation

If you find this code useful in your research, please consider citing:

@inproceedings{wachi_yue_sui_neurips2021,
  Author = {Wachi, Akifumi and Wei, Yunyue and Sui, Yanan},
  Title = {Safe Policy Optimization with Local Generalized Linear Function Approximations},
  Booktitle  = {Neural Information Processing Systems (NeurIPS)},
  Year = {2021}
}
Owner
Akifumi Wachi
Akifumi Wachi
Dist2Dec: A Simplicial Neural Network for Homology Localization

Dist2Dec: A Simplicial Neural Network for Homology Localization

Alexandros Keros 6 Jun 12, 2022
The pure and clear PyTorch Distributed Training Framework.

The pure and clear PyTorch Distributed Training Framework. Introduction Requirements and Usage Dependency Dataset Basic Usage Slurm Cluster Usage Base

WILL LEE 208 Dec 20, 2022
LiDAR Distillation: Bridging the Beam-Induced Domain Gap for 3D Object Detection

LiDAR Distillation Paper | Model LiDAR Distillation: Bridging the Beam-Induced Domain Gap for 3D Object Detection Yi Wei, Zibu Wei, Yongming Rao, Jiax

Yi Wei 75 Dec 22, 2022
PyTorch implementation of ICLR 2022 paper PiCO: Contrastive Label Disambiguation for Partial Label Learning

PiCO: Contrastive Label Disambiguation for Partial Label Learning This is a PyTorch implementation of ICLR 2022 paper PiCO: Contrastive Label Disambig

王皓波 147 Jan 07, 2023
An elaborate and exhaustive paper list for Named Entity Recognition (NER)

Named-Entity-Recognition-NER-Papers by Pengfei Liu, Jinlan Fu and other contributors. An elaborate and exhaustive paper list for Named Entity Recognit

Pengfei Liu 388 Dec 18, 2022
Deep learning operations reinvented (for pytorch, tensorflow, jax and others)

This video in better quality. einops Flexible and powerful tensor operations for readable and reliable code. Supports numpy, pytorch, tensorflow, and

Alex Rogozhnikov 6.2k Jan 01, 2023
3D cascade RCNN for object detection on point cloud

3D Cascade RCNN This is the implementation of 3D Cascade RCNN: High Quality Object Detection in Point Clouds. We designed a 3D object detection model

Qi Cai 22 Dec 02, 2022
PyTorch implementation of Higher Order Recurrent Space-Time Transformer

Higher Order Recurrent Space-Time Transformer (HORST) This is the official PyTorch implementation of Higher Order Recurrent Space-Time Transformer. Th

13 Oct 18, 2022
This repository contains a toolkit for collecting, labeling and tracking object keypoints

This repository contains a toolkit for collecting, labeling and tracking object keypoints. Object keypoints are semantic points in an object's coordinate frame.

ETHZ ASL 13 Dec 12, 2022
A tool for making map images from OpenTTD save games

OpenTTD Surveyor A tool for making map images from OpenTTD save games. This is not part of the main OpenTTD codebase, nor is it ever intended to be pa

Aidan Randle-Conde 9 Feb 15, 2022
Generative Autoregressive, Normalized Flows, VAEs, Score-based models (GANVAS)

GANVAS-models This is an implementation of various generative models. It contains implementations of the following: Autoregressive Models: PixelCNN, G

MRSAIL (Mini Robotics, Software & AI Lab) 6 Nov 26, 2022
The code of paper "Block Modeling-Guided Graph Convolutional Neural Networks".

Block Modeling-Guided Graph Convolutional Neural Networks This repository contains the demo code of the paper: Block Modeling-Guided Graph Convolution

22 Dec 08, 2022
[NeurIPS 2021 Spotlight] Aligning Pretraining for Detection via Object-Level Contrastive Learning

SoCo [NeurIPS 2021 Spotlight] Aligning Pretraining for Detection via Object-Level Contrastive Learning By Fangyun Wei*, Yue Gao*, Zhirong Wu, Han Hu,

Yue Gao 139 Dec 14, 2022
A certifiable defense against adversarial examples by training neural networks to be provably robust

DiffAI v3 DiffAI is a system for training neural networks to be provably robust and for proving that they are robust. The system was developed for the

SRI Lab, ETH Zurich 202 Dec 13, 2022
This is the official Pytorch implementation of the paper "Diverse Motion Stylization for Multiple Style Domains via Spatial-Temporal Graph-Based Generative Model"

Diverse Motion Stylization (Official) This is the official Pytorch implementation of this paper. Diverse Motion Stylization for Multiple Style Domains

Soomin Park 28 Dec 16, 2022
FLVIS: Feedback Loop Based Visual Initial SLAM

FLVIS Feedback Loop Based Visual Inertial SLAM 1-Video EuRoC DataSet MH_05 Handheld Test in Lab FlVIS on UAV Platform 2-Relevent Publication: Under Re

UAV Lab - HKPolyU 182 Dec 04, 2022
A facial recognition doorbell system using a Raspberry Pi

Facial Recognition Doorbell This project expands on the person-detecting doorbell system to allow it to identify faces, and announce names accordingly

rydercalmdown 22 Apr 15, 2022
Pytorch implementation for "Implicit Semantic Response Alignment for Partial Domain Adaptation"

Implicit-Semantic-Response-Alignment Pytorch implementation for "Implicit Semantic Response Alignment for Partial Domain Adaptation" Prerequisites pyt

4 Dec 19, 2022
Benchmark for the generalization of 3D machine learning models across different remeshing/samplings of a surface.

Discretization Robust Correspondence Benchmark One challenge of machine learning on 3D surfaces is that there are many different representations/sampl

Nicholas Sharp 10 Sep 30, 2022
EfficientNetv2 TensorRT int8

EfficientNetv2_TensorRT_int8 EfficientNetv2模型实现来自https://github.com/d-li14/efficientnetv2.pytorch 环境配置 ubuntu:18.04 cuda:11.0 cudnn:8.0 tensorrt:7

34 Apr 24, 2022