Time Discretization-Invariant Safe Action Repetition for Policy Gradient Methods

Overview

Time Discretization-Invariant Safe Action Repetition for Policy Gradient Methods

This repository is the official implementation of

  • Seohong Park, Jaekyeom Kim, Gunhee Kim. Time Discretization-Invariant Safe Action Repetition for Policy Gradient Methods. In NeurIPS, 2021.

It contains the implementations for SAR, FiGAR-C and base policy gradient algorithms (PPO, TRPO and A2C).

The code is based on Stable Baselines3 (SB3) for PPO and A2C, and Stable Baselines (SB) for TRPO.

Requirements

Run examples

PPO and A2C (based on Stable Baselines3)

To install requirements:

cd sb3
pip install -r requirements.txt
pip install -e .

Train SAR-PPO on InvertedPendulum-v2 with δ = 0.01:

python repeat/main.py --env=InvertedPendulum-v2 --algo=ppo_rg --frame_skip=1 --dt=0.01 --max_t=0.05 --max_d=0.5

Train FiGAR-C-PPO on InvertedPendulum-v2 with δ = 0.01:

python repeat/main.py --env=InvertedPendulum-v2 --algo=ppo_rg --frame_skip=1 --dt=0.01 --max_t=0.05

Train PPO on InvertedPendulum-v2 with the original δ:

python repeat/main.py --env=InvertedPendulum-v2 --algo=ppo_rg

Train SAR-A2C on InvertedPendulum-v2 with δ = 0.01:

python repeat/main.py --env=InvertedPendulum-v2 --algo=a2c_rg --frame_skip=1 --dt=0.01 --max_t=0.05 --max_d=0.5

Train SAR-PPO on InvertedPendulum-v2 with δ = 0.002 and the "Action Noise" setting:

python repeat/main.py --env=InvertedPendulum-v2 --algo=ppo_rg --frame_skip=1 --dt=0.002 --max_t=0.05 --max_d=0.5 --anoise_type=action --anoise_prob=0.05 --anoise_std=3

Train SAR-PPO on InvertedPendulum-v2 with δ = 0.002 and the "External Force" setting:

python repeat/main.py --env=InvertedPendulum-v2 --algo=ppo_rg --frame_skip=1 --dt=0.002 --max_t=0.05 --max_d=0.5 --anoise_type=ext_f --anoise_prob=0.05 --anoise_std=300

Train SAR-PPO on InvertedPendulum-v2 with δ = 0.002 and the "Strong External Force (Perceptible)" setting:

python repeat/main.py --env=InvertedPendulum-v2 --algo=ppo_rg --frame_skip=1 --dt=0.002 --max_t=0.05 --max_d=0.5 --anoise_type=ext_fpc --anoise_prob=0.05 --anoise_std=1000

TRPO (based on Stable Baselines)

To install requirements:

cd sb
pip install -r requirements.txt
pip install -e .

Train SAR-TRPO on InvertedPendulum-v2 with δ = 0.01:

python repeat/main.py --env=InvertedPendulum-v2 --frame_skip=1 --dt=0.01 --max_t=0.05 --max_d=0.5

License

This codebase is licensed under the MIT License. See also sb3/LICENSE_SB3 and sb/LICENSE_SB.

Owner
Seohong Park
Seohong Park
A tool that detects the expensive Carbon Black watchlists.

A tool that detects the "expensive" Carbon Black watchlists.

Oğuzcan Pamuk 8 Aug 04, 2022
A simple linux keylogger project.

The project This project is a simple linux keylogger. When activated, it registers all the actions made with the keyboard. The log files are registere

1 Oct 24, 2021
A bitcoin private keys brute-forcing tool. Educational purpose only.

BitForce A bitcoin private keys brute-forcing tool. If you have an average computer, his will take decades to find a private key with balance. Run Mak

Gilad Leef 2 Dec 20, 2022
CloakifyFactory & the Cloakify Toolset - Data Exfiltration & Infiltration In Plain Sight;

CloakifyFactory CloakifyFactory & the Cloakify Toolset - Data Exfiltration & Infiltration In Plain Sight; Evade DLP/MLS Devices; Social Engineering of

3 Oct 18, 2022
This is a multi-password‌ cracking tool that can help you hack facebook accounts very quickly

Pro_Crack Facebook Fast Cracking Tool This is a multi-password‌ cracking tool that can help you hack facebook accounts very quickly Installation On Te

•JINN• 1 Jan 16, 2022
"KeyLogger-WebService" Is a Keylogger Write In python.

KeyLogger-WebService "KeyLogger-WebService" Is a Keylogger Write In python. When you Inject the file on a computer once the file is opened on the comp

Freddox 21 Dec 16, 2022
Password list generator for password spraying - prebaked with goodies

Generates permutations of Months, Seasons, Years, Sports Teams (NFL, NBA, MLB, NHL), Sports Scores, "Password", and even Iterable Keyspaces of a specified size.

Casey Erdmann 65 Dec 22, 2022
WinRemoteEnum is a module-based collection of operations achievable by a low-privileged domain user.

WinRemoteEnum WinRemoteEnum is a module-based collection of operations achievable by a low-privileged domain user, sharing the goal of remotely gather

Simon 9 Nov 09, 2022
An easy-to-use wrapper for NTFS-3G on macOS

ezNTFS ezNTFS is an easy-to-use wrapper for NTFS-3G on macOS. ezNTFS can be used as a menu bar app, or via the CLI in the terminal. Installation To us

Matthew Go 34 Dec 01, 2022
On-demand scanning for container registries

Lacework registry scanner Install & configure Lacework CLI Integrate a Container Registry Go to Lacework Resources Containers Container Image In

Will Robinson 1 Dec 14, 2021
聚合Github上已有的Poc或者Exp,CVE信息来自CVE官网。Auto Collect Poc Or CVE from Github by CVE ID.

PocOrExp in Github 聚合Github上已有的Poc或者Exp,CVE信息来自CVE官网 注意:只通过通用的CVE号聚合,因此对于MS17-010等Windows编号漏洞以及著名的有绰号的漏洞,还是自己检索一下比较好 Usage python3 exp.py -h usage: ex

567 Dec 30, 2022
PoC for CVE-2020-6207 (Missing Authentication Check in SAP Solution Manager)

PoC for CVE-2020-6207 (Missing Authentication Check in SAP Solution Manager) This script allows to check and exploit missing authentication checks in

chipik 82 Nov 09, 2022
Gmail Accounts Hacking

gmail-hack Gmail Accounts Hacking Gemail-Hack python script for Hack gmail account brute force What is brute force attack? In brute force attack,scrip

Aryan 25 Nov 10, 2022
Red Team Toolkit is an Open-Source Django Offensive Web-App which is keeping the useful offensive tools used in the red-teaming together.

RedTeam Toolkit Note: Only legal activities should be conducted with this project. Red Team Toolkit is an Open-Source Django Offensive Web-App contain

Mohammadreza Sarayloo 382 Jan 01, 2023
A tool for making python source difficult to read.

obscurepy Description A tool for obscuring, or making python source code difficult to read. Table of Contents Installation Limitations Usage Disclaime

Andrew Christiansen 10 Jul 31, 2022
Tools for investigating Log4j CVE-2021-44228

Log4jTools Tools for investigating Log4j CVE-2021-44228 FetchPayload.py (Get java payload from ldap path provided in JNDI lookup). Example command: Re

MalwareTech 91 Dec 29, 2022
Python & JavaScript Obfuscator made in Python 3.

Python Code Obfuscator A script that converts code into full on random numerical expressions. Simple Scripts: Python Mode... Input: Function that deco

rzx. 1 Dec 29, 2021
Orthrus is a macOS agent that uses Apple's MDM to backdoor a device using a malicious profile.

Orthrus is a macOS agent that uses Apple's MDM to backdoor a device using a malicious profile. It effectively runs its own MDM server and allows the operator to interface with it using Mythic.

Mythic Agents 37 Dec 06, 2022
This is a simple PoC for the newly found Polkit error names PwnKit

A Python3 and a BASH PoC for CVE-2021-4034 by Kim Schulz

Kim Schulz 16 Sep 06, 2022
CVE-2021-45232-RCE-多线程批量漏洞检测

CVE-2021-45232-RCE CVE-2021-45232-RCE-多线程批量漏洞检测 FOFA 查询 title="Apache APISIX Das

孤桜懶契 36 Sep 21, 2022