Safe Policy Optimization with Local Features

Overview

Safe Policy Optimization with Local Features (SPO-LF)

This is the source code for the algorithms in the paper "Safe Policy Optimization with Local Generalized Linear Function Approximations", which was presented at NeurIPS 2021.

Installation

This repository includes a requirements.txt file. Besides common modules (e.g., numpy, scipy), our source code depends on the following modules.

We also provide a Dockerfile in this repository, which can be used to reproduce our grid-world experiment.

Simulation configuration

We manage the simulation configuration using hydra. Configurations are listed in config.yaml. For example, the algorithm to run should be chosen from the ones we implemented:

sim_type: {safe_glm, unsafe_glm, random, oracle, safe_gp_state, safe_gp_feature, safe_glm_stepwise}
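To illustrate how a command-line override such as sim_type=safe_glm relates to the configuration, here is a minimal sketch in plain Python. It does not use hydra, which handles this itself; `parse_overrides` and `check_sim_type` are hypothetical helpers written for this example only, and all values are kept as strings.

```python
# Allowed values, copied from the sim_type list above.
SIM_TYPES = {
    "safe_glm", "unsafe_glm", "random", "oracle",
    "safe_gp_state", "safe_gp_feature", "safe_glm_stepwise",
}


def parse_overrides(args):
    """Turn ['sim_type=safe_glm', 'env.reuse_env=False'] into a nested dict.

    Hypothetical helper for illustration; dotted keys become nested dicts
    and values stay as plain strings.
    """
    config = {}
    for arg in args:
        key, sep, value = arg.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {arg!r}")
        *parents, leaf = key.split(".")
        node = config
        for name in parents:
            node = node.setdefault(name, {})
        node[leaf] = value
    return config


def check_sim_type(config):
    """Return the chosen sim_type, or raise if it is not an implemented one."""
    sim_type = config.get("sim_type")
    if sim_type not in SIM_TYPES:
        raise ValueError(
            f"sim_type must be one of {sorted(SIM_TYPES)}, got {sim_type!r}"
        )
    return sim_type


if __name__ == "__main__":
    cfg = parse_overrides(["sim_type=safe_glm", "env.reuse_env=False"])
    print(check_sim_type(cfg), cfg["env"]["reuse_env"])
```

Checking the value up front like this gives a clear error for a misspelled algorithm name before any simulation starts.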

Grid World Experiment

The source code for our grid-world experiment is in the /grid_world folder. To run a simulation, use commands such as the following.

cd grid_world
python main.py sim_type=safe_glm env.reuse_env=False

To run the Monte Carlo simulations that compare our proposed method with the baselines, use the shell script run.sh.
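The contents of run.sh are not reproduced here. As a rough sketch of what a comparison sweep could look like, the snippet below builds one command line per algorithm and repetition, using only the overrides shown above. The helper `build_commands`, the number of runs, and the assumption that env.reuse_env=False yields a fresh environment for each run are ours, not taken from the repository.

```python
import shlex

# Algorithms to compare, copied from the sim_type list in the configuration.
SIM_TYPES = [
    "safe_glm", "unsafe_glm", "random", "oracle",
    "safe_gp_state", "safe_gp_feature", "safe_glm_stepwise",
]


def build_commands(sim_types, n_runs):
    """Return one argv list per (algorithm, repetition) pair.

    Hypothetical helper: the commands are only built, not executed.
    """
    commands = []
    for sim_type in sim_types:
        for _ in range(n_runs):
            commands.append(
                ["python", "main.py", f"sim_type={sim_type}", "env.reuse_env=False"]
            )
    return commands


if __name__ == "__main__":
    # Print the sweep so it can be inspected before running anything.
    for argv in build_commands(SIM_TYPES, n_runs=2):
        print(shlex.join(argv))
```

Each printed line could then be run from the grid_world directory, for example with `subprocess.run(argv, check=True)`.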

We also provide a script for visualization. If you want to render how the agent behaves, use the following command.

python main.py sim_type=safe_glm env.reuse_env=True

Safety-Gym Experiment

The source code for our Safety-Gym experiment is in the /safety_gym_discrete folder. Our experiment is based on safety-gym. Our proposed method uses dynamic programming algorithms to solve the Bellman equation, so we modified engine.py to discretize the environment. The modified safety-gym source file is provided as /safety_gym_discrete/engine.py. To use the modified library, clone safety-gym, then replace safety-gym/safety_gym/envs/engine.py with /safety_gym_discrete/engine.py from our repo. Then use the following commands to install the modified library:

cd safety-gym
pip install -e .
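The file replacement described above, which has to happen before the install commands, can also be scripted. The following is a minimal sketch, assuming both repositories are already cloned locally; `replace_engine` and the `.orig` backup file are our own additions for illustration and are not part of either repository.

```python
import shutil
from pathlib import Path


def replace_engine(repo_root, safety_gym_root):
    """Copy the modified engine.py over the one in a safety-gym clone.

    repo_root:       path to a clone of this repository (assumption).
    safety_gym_root: path to a clone of safety-gym (assumption).
    A one-time backup of the original file is kept next to it.
    """
    src = Path(repo_root) / "safety_gym_discrete" / "engine.py"
    dst = Path(safety_gym_root) / "safety_gym" / "envs" / "engine.py"
    if not src.is_file():
        raise FileNotFoundError(f"modified engine not found: {src}")
    if not dst.is_file():
        raise FileNotFoundError(f"safety-gym engine not found: {dst}")

    backup = dst.with_name(dst.name + ".orig")
    if not backup.exists():
        # Keep the pristine upstream file so the change can be undone.
        shutil.copy2(dst, backup)
    shutil.copy2(src, dst)
    return dst
```

Calling `replace_engine(".", "../safety-gym")` from the root of this repository would perform the replacement, if the two clones sit side by side.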

Note that a MuJoCo license is needed to install Safety-Gym. To run the simulation, use the following commands.

cd safety_gym_discrete
python main.py sim_idx=0
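For intuition on why a discretized environment is needed for dynamic programming, here is a minimal value-iteration sketch on a toy grid. It is not the method from the paper and has no safety constraint: it only solves the standard Bellman optimality equation over a finite set of cells. The deterministic four-neighbour moves, the reward-on-entering-a-cell convention, and the name `value_iteration` are all assumptions made for this example.

```python
def value_iteration(rewards, gamma=0.9, tol=1e-8):
    """Solve V(s) = max_a [ r(s') + gamma * V(s') ] on a small grid.

    rewards: 2D list; rewards[r][c] is received on entering cell (r, c).
    Moves are deterministic (up, down, left, right); a move that would
    leave the grid keeps the agent in its current cell.
    """
    n_rows, n_cols = len(rewards), len(rewards[0])
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    values = [[0.0] * n_cols for _ in range(n_rows)]

    while True:
        delta = 0.0
        new_values = [[0.0] * n_cols for _ in range(n_rows)]
        for r in range(n_rows):
            for c in range(n_cols):
                best = float("-inf")
                for dr, dc in moves:
                    nr, nc = r + dr, c + dc
                    if not (0 <= nr < n_rows and 0 <= nc < n_cols):
                        nr, nc = r, c  # bumping into the wall: stay put
                    best = max(best, rewards[nr][nc] + gamma * values[nr][nc])
                new_values[r][c] = best
                delta = max(delta, abs(best - values[r][c]))
        values = new_values
        if delta < tol:
            return values


if __name__ == "__main__":
    grid_rewards = [
        [0.0, 0.0, 0.0],
        [0.0, 0.0, 1.0],
    ]
    for row in value_iteration(grid_rewards):
        print([round(v, 3) for v in row])
```

The sweep over every cell is only possible because the state space is a finite grid, which is the role discretization plays here.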

We compare our proposed method with three notable baselines: CPO, PPO-Lagrangian, and TRPO-Lagrangian. The baseline implementation depends on safety-starter-agents; we modified run_agent.py from that repository's source code.

To run a baseline, use the following commands.

cd safety_gym_discrete/baseline
python baseline_run.py sim_type=cpo

The environments that the agent runs in are generated using generate_env.py. We provide ten 50×50 environments. To generate other environments, change the world shape in safety_gym_discrete.py and run the following commands:

cd safety_gym_discrete
python generate_env.py

Citation

If you find this code useful in your research, please consider citing:

@inproceedings{wachi_yue_sui_neurips2021,
  Author = {Wachi, Akifumi and Wei, Yunyue and Sui, Yanan},
  Title = {Safe Policy Optimization with Local Generalized Linear Function Approximations},
  Booktitle  = {Neural Information Processing Systems (NeurIPS)},
  Year = {2021}
}