Malware Env for OpenAI Gym

Overview

Malware Env for OpenAI Gym


Citing

If you use this code in a publication please cite the following paper:


Hyrum S. Anderson, Anant Kharkar, Bobby Filar, David Evans, Phil Roth, "Learning to Evade Static PE Machine Learning Malware Models via Reinforcement Learning", in ArXiv e-prints. Jan. 2018.

@ARTICLE{anderson2018learning,
  author={Anderson, Hyrum S and Kharkar, Anant and Filar, Bobby and Evans, David and Roth, Phil},
  title={Learning to Evade Static PE Machine Learning Malware Models via Reinforcement Learning},
  journal={arXiv preprint arXiv:1801.08917},
  archivePrefix = "arXiv",
  eprint = {1801.08917},
  primaryClass = "cs.CR",
  keywords = {Computer Science - Cryptography and Security},
  year = 2018,
  month = jan,
  adsurl = {http://adsabs.harvard.edu/abs/2018arXiv180108917A},
}

This is a malware manipulation environment for OpenAI's gym. OpenAI Gym is a toolkit for developing and comparing reinforcement learning algorithms. This makes it possible to write agents that learn to manipulate PE files (e.g., malware) to achieve some objective (e.g., bypass AV) based on a reward provided by taking specific manipulation actions.

Objective

Create an AI that learns through reinforcement learning which functionality-preserving transformations to make on a malware sample to break through / bypass machine learning static-analysis malware detection.

Breakout

Basics

There are two basic concepts in reinforcement learning: the environment (in our case, the malware sample) and the agent (namely, the algorithm used to change the environment). The agent sends actions to the environment, and the environment replies with observations and rewards (that is, a score).

This repo provides an environment for manipulating PE files and providing rewards that are based around bypassing AV. An agent can be deployed that have already been written for the rich gym framework. For example

Setup

The EvadeRL framework is built on Python3.6 we recommend first creating a virtualenv (details can be found here) with Python3.6 then performing the following actions ensure you have the correct python libraries:

pip install -r requirements.txt

EvadeRL also leverages a Library to Instrument Executable Formats aptly named LIEF. It allows our agent to modify the binary on-the-fly. To add it to your virtualenv just pip install one of their pre-built packages. Examples below:

Linux

pip install https://github.com/lief-project/LIEF/releases/download/0.7.0/linux_lief-0.7.0_py3.6.tar.gz

OSX

pip install https://github.com/lief-project/LIEF/releases/download/0.7.0/osx_lief-0.7.0_py3.6.tar.gz

Once completed ensure you've moved malware samples into the

gym_malware/gym_malware/envs/utils/samples/

If you are unsure where to acquire malware samples see the Data Acquisition section below. If you have samples in the correct directory you can check to see if your environment is correctly setup by running :

python test_agent_chainer.py

Note that if you are using Anaconda, you may need to

conda install libgcc

in order for LIEF to operate properly.

Data Acquisition

If you have a VirusTotal API key, you may download samples to the gym_malware/gym_malware/envs/utils/samples/ using the Python script download_samples.py.

Gym-Malware Environment

EvadeRL pits a reinforcement agent against the malware environment consisting of the following components:

  • Action Space
  • Independent Malware Classifier
  • OpenAI framework malware environment (aka gym-malware)

Action Space

The moves or actions that can be performed on a malware sample in our environment consist of the following binary manipulations:

  • append_zero
  • append_random_ascii
  • append_random_bytes
  • remove_signature
  • upx_pack
  • upx_unpack
  • change_section_names_from_list
  • change_section_names_to random
  • modify_export
  • remove_debug
  • break_optional_header_checksum

The agent will randomly select these actions in an attempt to bypass the classifier (info on default classifier below). Over time, the agent learns which combinations lead to the highest rewards, or learns a policy (like an optimal plan of attack for any given observation).

Independent Classifier

Included as a default model is a gradient boosted decision trees model trained on 50k malicious and 50k benign samples with the following features extracted:

  • Byte-level data (e.g. histogram and entropy)
  • Header
  • Section
  • Import/Exports
Owner
ENDGAME
ENDGAME
A Framework for Encrypted Machine Learning in TensorFlow

TF Encrypted is a framework for encrypted machine learning in TensorFlow. It looks and feels like TensorFlow, taking advantage of the ease-of-use of t

TF Encrypted 0 Jul 06, 2022
Code for IntraQ, PyTorch implementation of our paper under review

IntraQ: Learning Synthetic Images with Intra-Class Heterogeneity for Zero-Shot Network Quantization paper Requirements Python = 3.7.10 Pytorch == 1.7

1 Nov 19, 2021
HCQ: Hybrid Contrastive Quantization for Efficient Cross-View Video Retrieval

HCQ: Hybrid Contrastive Quantization for Efficient Cross-View Video Retrieval [toc] 1. Introduction This repository provides the code for our paper at

13 Dec 08, 2022
An introduction to satellite image analysis using Python + OpenCV and JavaScript + Google Earth Engine

A Gentle Introduction to Satellite Image Processing Welcome to this introductory course on Satellite Image Analysis! Satellite imagery has become a pr

Edward Oughton 32 Jan 03, 2023
Realtime YOLO Monster Detection With Non Maximum Supression

Realtime-YOLO-Monster-Detection-With-Non-Maximum-Supression Table of Contents In

5 Oct 07, 2022
Flexible-Modal Face Anti-Spoofing: A Benchmark

Flexible-Modal FAS This is the official repository of "Flexible-Modal Face Anti-

Zitong Yu 22 Nov 10, 2022
[CVPR2021] UAV-Human: A Large Benchmark for Human Behavior Understanding with Unmanned Aerial Vehicles

UAV-Human Official repository for CVPR2021: UAV-Human: A Large Benchmark for Human Behavior Understanding with Unmanned Aerial Vehicle Paper arXiv Res

129 Jan 04, 2023
免费获取http代理并生成proxifier配置文件

freeproxy 免费获取http代理并生成proxifier配置文件 公众号:台下言书 工具说明:https://mp.weixin.qq.com/s?__biz=MzIyNDkwNjQ5Ng==&mid=2247484425&idx=1&sn=56ccbe130822aa35038095317

说书人 32 Mar 25, 2022
GeneDisco is a benchmark suite for evaluating active learning algorithms for experimental design in drug discovery.

GeneDisco is a benchmark suite for evaluating active learning algorithms for experimental design in drug discovery.

22 Dec 12, 2022
Official PyTorch implementation of the ICRA 2021 paper: Adversarial Differentiable Data Augmentation for Autonomous Systems.

Adversarial Differentiable Data Augmentation This repository provides the official PyTorch implementation of the ICRA 2021 paper: Adversarial Differen

Manli 3 Oct 15, 2022
Hierarchical User Intent Graph Network for Multimedia Recommendation

Hierarchical User Intent Graph Network for Multimedia Recommendation This is our Pytorch implementation for the paper: Hierarchical User Intent Graph

6 Jan 05, 2023
Generic ecosystem for feature extraction from aerial and satellite imagery

Note: Robosat is neither maintained not actively developed any longer by Mapbox. See this issue. The main developers (@daniel-j-h, @bkowshik) are no l

Mapbox 1.9k Jan 06, 2023
[TIP2020] Adaptive Graph Representation Learning for Video Person Re-identification

Introduction This is the PyTorch implementation for Adaptive Graph Representation Learning for Video Person Re-identification. Get started git clone h

WuYiming 41 Dec 12, 2022
ManipulaTHOR, a framework that facilitates visual manipulation of objects using a robotic arm

ManipulaTHOR: A Framework for Visual Object Manipulation Kiana Ehsani, Winson Han, Alvaro Herrasti, Eli VanderBilt, Luca Weihs, Eric Kolve, Aniruddha

AI2 65 Dec 30, 2022
GPOEO is a micro-intrusive GPU online energy optimization framework for iterative applications

GPOEO GPOEO is a micro-intrusive GPU online energy optimization framework for iterative applications. We also implement ODPP [1] as a comparison. [1]

瑞雪轻飏 8 Sep 10, 2022
Picasso: a methods for embedding points in 2D in a way that respects distances while fitting a user-specified shape.

Picasso Code to generate Picasso embeddings of any input matrix. Picasso maps the points of an input matrix to user-defined, n-dimensional shape coord

Pachter Lab 45 Dec 23, 2022
PyTorch implementation of some learning rate schedulers for deep learning researcher.

pytorch-lr-scheduler PyTorch implementation of some learning rate schedulers for deep learning researcher. Usage WarmupReduceLROnPlateauScheduler Visu

Soohwan Kim 59 Dec 08, 2022
Implementation of the paper "Self-Promoted Prototype Refinement for Few-Shot Class-Incremental Learning"

Self-Promoted Prototype Refinement for Few-Shot Class-Incremental Learning This is the implementation of the paper "Self-Promoted Prototype Refinement

Kai Zhu 78 Dec 02, 2022
The Unsupervised Reinforcement Learning Benchmark (URLB)

The Unsupervised Reinforcement Learning Benchmark (URLB) URLB provides a set of leading algorithms for unsupervised reinforcement learning where agent

259 Dec 26, 2022