Repository containing the PhD Thesis "Formal Verification of Deep Reinforcement Learning Agents"

Getting Started

This repository contains the code used for the following publications:

  • Probabilistic Guarantees for Safe Deep Reinforcement Learning (FORMATS 2020)
  • Verifying Reinforcement Learning up to Infinity (IJCAI 2021)
  • Verified Probabilistic Policies for Deep Reinforcement Learning (NFM 2022)

These instructions will help you set up the project.

Prerequisites

Create a virtual environment with conda:

conda env create -f environment.yml
conda activate safedrl

This installs all the Python dependencies the project needs.

In addition, download PRISM from the following link: https://github.com/phate09/prism

Ensure you have Gradle installed (https://gradle.org/install/)

Running the code

Before running any code, open a new terminal, go to the PRISM project folder and run:

gradle run

This opens the communication channel between PRISM and the rest of the repository. Leave it running while you use the scripts below.
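Before launching a long experiment, it can help to confirm that the PRISM process is reachable. The sketch below is only an illustration: it assumes the channel is a local TCP socket, which these instructions do not state. The helper name `channel_is_up` is hypothetical, and the host and port must be supplied by you, since neither is taken from the repository.

```python
import socket


def channel_is_up(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port.

    Assumption: the PRISM channel is a plain TCP socket. The host and
    port are placeholders to be filled in by the caller.
    """
    try:
        # create_connection raises OSError (e.g. connection refused, timeout)
        # when nothing is listening on the given address.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `channel_is_up("127.0.0.1", port)` with the port your PRISM process reports would return `False` if `gradle run` had not been started yet.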

Probabilistic Guarantees for Safe Deep Reinforcement Learning (FORMATS 2020)

Training

Run train_pendulum.py inside agents/dqn to train the agent on the inverted pendulum problem, and note where the trained agent is saved.

Analysis

Run domain_analysis_sym.py inside runnables/symbolic/dqn, after changing the paths in the script so that they point to the saved network.

Verifying Reinforcement Learning up to Infinity (IJCAI 2021)

Paper results

Download and unzip experiment_collection_final.zip in the 'save' directory, then run:

tensorboard --logdir=./save/experiment_collection_final

(Results for the output range analysis experiments are in experiment_collection_ora_final.zip.)
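If you prefer to script the unzip step, the following is a minimal sketch using only the Python standard library. The helper name `unzip_into` is hypothetical. It assumes the archive has already been downloaded to a local path and that 'save' is relative to the directory you run it from.

```python
import zipfile
from pathlib import Path
from typing import List


def unzip_into(archive: str, target_dir: str = "save") -> List[str]:
    """Extract `archive` into `target_dir` and return the member names."""
    target = Path(target_dir)
    # Create the target directory if it does not exist yet.
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target)
        return zf.namelist()
```

Calling `unzip_into("experiment_collection_final.zip")` from the repository root would place the archive contents under ./save, which is where the tensorboard command above expects them.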

Train neural networks from scratch

Run one of:

  • training/tune_train_PPO_bouncing_ball.py
  • training/tune_train_PPO_car.py
  • training/tune_train_PPO_cartpole.py

Check safety of pretrained agents

Download and unzip pretrained_agents.zip in the 'save' directory, then run verification/run_tune_experiments.py.

To monitor the progress of the algorithm, run:

tensorboard --logdir=./save/experiment_collection_final

The results in TensorBoard can be filtered with regular expressions (e.g. "bouncing_ball.* template: 0") in the search bar on the left.

The name of each experiment contains the name of the problem (bouncing_ball, cartpole, stopping_car), the amount of adversarial noise ("eps", only for stopping_car), the time step length for the dynamics of the system ("tau", only for cartpole) and the choice of restriction ("template") in order of complexity (0 being box, 1 being the chosen template, and 2 being octagon).
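To see what such a filter selects, the sketch below applies the same regular expression to a few experiment names in Python. The names are invented for illustration and only mimic the fields described above, so the real run names may be laid out differently. It also assumes the search bar matches the pattern anywhere in the name, as `re.search` does.

```python
import re


def filter_experiments(names, pattern):
    """Return the names in which `pattern` matches anywhere (search semantics)."""
    regex = re.compile(pattern)
    return [name for name in names if regex.search(name)]


# Hypothetical experiment names, not copied from the repository.
names = [
    "bouncing_ball template: 0",
    "bouncing_ball template: 2",
    "cartpole tau: 0.02 template: 0",
    "stopping_car eps: 0.1 template: 1",
]

# Keep only the bouncing_ball runs that use the box restriction (template 0).
selected = filter_experiments(names, r"bouncing_ball.* template: 0")
print(selected)
```

Replacing the problem name with `.*` (for example `.* template: 0`) would instead select the box-restriction runs of every problem.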

The table in the paper is filled in using some of the metrics reported in TensorBoard:

  • max_t: Avg timesteps
  • seen: Avg polyhedra
  • time_since_restore: Avg clock time (s)
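The mapping above can be sketched as a small aggregation step. This is an illustration only. It assumes each table entry is the arithmetic mean of the metric over the runs of one configuration, which the text implies with "Avg" but does not spell out. The function name `summarise` and the input format (one dict of final metric values per run) are hypothetical.

```python
from statistics import mean

# TensorBoard metric name -> column name used in the paper's table.
TABLE_COLUMNS = {
    "max_t": "Avg timesteps",
    "seen": "Avg polyhedra",
    "time_since_restore": "Avg clock time (s)",
}


def summarise(runs):
    """Average each reported metric over a list of per-run metric dicts."""
    return {
        label: mean(run[metric] for run in runs)
        for metric, label in TABLE_COLUMNS.items()
    }
```

Given the final metric values of several runs of the same experiment, `summarise(runs)` returns one row keyed by the table's column names.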


Verified Probabilistic Policies for Deep Reinforcement Learning (NFM 2022)

Owner
Edoardo Bacci