Sound and Cost-effective Fuzzing of Stripped Binaries by Incremental and Stochastic Rewriting

Overview

StochFuzz: A New Solution for Binary-only Fuzzing

test benchmark

loading-ag-167

StochFuzz is a (probabilistically) sound and cost-effective fuzzing technique for stripped binaries. It is facilitated by a novel incremental and stochastic rewriting technique that is particularly suitable for binary-only fuzzing. Any AFL-based fuzzer, which takes edge coverage (defined by AFL) as runtime feedback, can acquire benefits from StochFuzz to directly fuzz stripped binaries.

More data and the results of the experiments can be found here. Example cases of leveraging StochFuzz to improve advanced AFL-based fuzzers (AFL++ and Polyglot) can be found in system.md.

Clarifications

  • We adopt a new system design than the one from the paper. Details can be found at system.md.
  • In the paper, when we are talking about e9patch, we are actually talking about the binary-only fuzzing tool built upon e9patch, namely e9tool. Please refer to its website for more details.
  • StochFuzz provides sound rewriting for binaries without inlined data, and probabilistically sound rewriting for the rest.

Building StochFuzz

StochFuzz is built upon Keystone, Capstone, GLib, and libunwind.

These dependences can be built by build.sh. If you are trying to build StochFuzz in a clean container, make sure some standard tools like autoreconf and libtool are installed.

$ git clone https://github.com/ZhangZhuoSJTU/StochFuzz.git
$ cd StochFuzz
$ ./build.sh

StochFuzz itself can be built by GNU Make.

$ cd src
$ make release

We have tested StochFuzz on Ubuntu 18.04. If you have any issue when running StochFuzz on other systems, please kindly let us know.

How to Use

StochFuzz provides multiple rewriting options, which follows the AFL's style of passing arguments.

$ ./stoch-fuzz -h
stoch-fuzz 1.0.0 by <[email protected]>

./stoch-fuzz [ options ] -- target_binary [ ... ]

Mode settings:

  -S            - start a background daemon and wait for a fuzzer to attach (defualt mode)
  -R            - dry run target_binary with given arguments without an attached fuzzer
  -P            - patch target_binary without incremental rewriting
  -D            - probabilistic disassembly without rewriting
  -V            - show currently observed breakpoints

Rewriting settings:

  -g            - trace previous PC
  -c            - count the number of basic blocks with conflicting hash values
  -d            - disable instrumentation optimization
  -r            - assume the return addresses are only used by RET instructions
  -e            - install the fork server at the entrypoint instead of the main function
  -f            - forcedly assume there is data interleaving with code
  -i            - ignore the call-fallthrough edges to defense RET-misusing obfuscation

Other stuff:

  -h            - print this help
  -x execs      - set the number of executions after which a checking run will be triggered
                  set it as zero to disable checking runs (default: 200000)
  -t msec       - set the timeout for each daemon-triggering execution
                  set it as zero to ignore the timeout (default: 2000 ms)
  -l level      - set the log level, including INFO, WARN, ERROR, and FATAL (default: INFO)

Basic Usage

- It is worth first trying the advanced strategy (see below) because that is much more cost-effective.

To fuzz a stripped binary, namely example.out, we need to cd to the directory of the target binary. For example, if the full path of example.out is /root/example.out, we need to first cd /root/. Furthermore, it is dangerous to run two StochFuzz instances under the same directory. These restrictions are caused by some design faults and we will try to relax them in the future.

Assuming StochFuzz is located at /root/StochFuzz/src/stoch-fuzz, execute the following command to start rewriting the target binary.

$ cd /root/
$ /root/StochFuzz/src/stoch-fuzz -- example.out # do not use ./example.out here

After the initial rewriting, we will get a phantom file named example.out.phantom. This phantom file can be directly fuzzed by AFL or any AFL-based fuzzer. Note that the StochFuzz process would not stop during fuzzing, so please make sure the process is alive during fuzzing.

Here is a demo that shows how StochFuzz works.

asciicast

Advanced Usage

Compared with the compiler-based instrumentation (e.g., afl-clang-fast), StochFuzz has additional runtime overhead because it needs to emulate each CALL instruction to support stack unwinding.

Inspired by a recent work, we provide an advanced rewriting strategy where we do not emulate CALL instructions but wrap the _ULx86_64_step function from libunwind to support stack unwinding. This strategy works for most binaries but may fail in some cases like fuzzing statically linked binaries.

To enable such strategy, simply provide a -r option to StochFuzz.

$ cd /root/
$ /root/StochFuzz/src/stoch-fuzz -r -- example.out # do not use ./example.out here

Addtionally, before fuzzing, we need to prepare the AFL_PRELOAD environment variable for AFL.

$ export STOCHFUZZ_PRELOAD=$(/root/StochFuzz/scritps/stochfuzz_env.sh)
$ AFL_PRELOAD=$STOCHFUZZ_PRELOAD afl-fuzz -i seeds -o output -t 2000 -- example.out.phantom @@

Following demo shows how to apply this advanced strategy.

asciicast

Troubleshootings

Common issues can be referred to trouble.md. If it cannot help solve your problem, please kindly open a Github issue.

Besides, we provide some tips on using StochFuzz, which can be found at tips.md

Development

Currently, we have many todo items. We present them in todo.md.

We also present many pending decisions which we are hesitating to take, in todo.md. If you have any thought/suggestion, do not hesitate to let us know. It would be very appreciated if you can help us improve StochFuzz.

StochFuzz should be considered an alpha-quality software and it is likely to contain bugs.

I will try my best to maintain StochFuzz timely, but sometimes it may take me more time to respond. Thanks for your understanding in advance.

Cite

Zhang, Zhuo, et al. "STOCHFUZZ: Sound and Cost-effective Fuzzing of Stripped Binaries by Incremental and Stochastic Rewriting." 2021 IEEE Symposium on Security and Privacy (SP). IEEE, 2021.

References

  • Duck, Gregory J., Xiang Gao, and Abhik Roychoudhury. "Binary rewriting without control flow recovery." Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation. 2020.
  • Meng, Xiaozhu, and Weijie Liu. "Incremental CFG patching for binary rewriting." Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems. 2021.
  • Aschermann, Cornelius, et al. "Ijon: Exploring deep state spaces via fuzzing." 2020 IEEE Symposium on Security and Privacy (SP). IEEE, 2020.
  • Google. “Google/AFL.” GitHub, github.com/google/AFL.
Owner
Zhuo Zhang
Zhuo Zhang
[ICCV 2021 Oral] SnowflakeNet: Point Cloud Completion by Snowflake Point Deconvolution with Skip-Transformer

This repository contains the source code for the paper SnowflakeNet: Point Cloud Completion by Snowflake Point Deconvolution with Skip-Transformer (ICCV 2021 Oral). The project page is here.

AllenXiang 65 Dec 26, 2022
MNE: Magnetoencephalography (MEG) and Electroencephalography (EEG) in Python

MNE-Python MNE-Python software is an open-source Python package for exploring, visualizing, and analyzing human neurophysiological data such as MEG, E

MNE tools for MEG and EEG data analysis 2.1k Dec 28, 2022
[CVPR 2021] "The Lottery Tickets Hypothesis for Supervised and Self-supervised Pre-training in Computer Vision Models" Tianlong Chen, Jonathan Frankle, Shiyu Chang, Sijia Liu, Yang Zhang, Michael Carbin, Zhangyang Wang

The Lottery Tickets Hypothesis for Supervised and Self-supervised Pre-training in Computer Vision Models Codes for this paper The Lottery Tickets Hypo

VITA 59 Dec 28, 2022
Unrestricted Facial Geometry Reconstruction Using Image-to-Image Translation

Unrestricted Facial Geometry Reconstruction Using Image-to-Image Translation [Arxiv] [Video] Evaluation code for Unrestricted Facial Geometry Reconstr

Matan Sela 242 Dec 30, 2022
PyG (PyTorch Geometric) - A library built upon PyTorch to easily write and train Graph Neural Networks (GNNs)

PyG (PyTorch Geometric) is a library built upon PyTorch to easily write and train Graph Neural Networks (GNNs) for a wide range of applications related to structured data.

PyG 16.5k Jan 08, 2023
classification task on dataset-CIFAR10,by using Tensorflow/keras

CIFAR10-Tensorflow classification task on dataset-CIFAR10,by using Tensorflow/keras 在这一个库中,我使用Tensorflow与keras框架搭建了几个卷积神经网络模型,针对CIFAR10数据集进行了训练与测试。分别使

3 Oct 17, 2021
Self-Supervised Collision Handling via Generative 3D Garment Models for Virtual Try-On

Self-Supervised Collision Handling via Generative 3D Garment Models for Virtual Try-On [Project website] [Dataset] [Video] Abstract We propose a new g

71 Dec 24, 2022
This program was designed to detect whether someone is wearing a facemask through a live video stream.

This program was designed to detect whether someone is wearing a facemask through a live video stream. A custom lightweight CNN trained with TensorFlow on a public dataset provided by Kaggle is used

0 Apr 02, 2022
A FAIR dataset of TCV experimental results for validating edge/divertor turbulence models.

TCV-X21 validation for divertor turbulence simulations Quick links Intro Welcome to TCV-X21. We're glad you've found us! This repository is designed t

0 Dec 18, 2021
CVPR 2021 - Official code repository for the paper: On Self-Contact and Human Pose.

TUCH This repo is part of our project: On Self-Contact and Human Pose. [Project Page] [Paper] [MPI Project Page] License Software Copyright License fo

Lea Müller 45 Jan 07, 2023
Bootstrapped Representation Learning on Graphs

Bootstrapped Representation Learning on Graphs This is the PyTorch implementation of BGRL Bootstrapped Representation Learning on Graphs The main scri

NerDS Lab :: Neural Data Science Lab 55 Jan 07, 2023
Projecting interval uncertainty through the discrete Fourier transform

Projecting interval uncertainty through the discrete Fourier transform This repo

1 Mar 02, 2022
AirLoop: Lifelong Loop Closure Detection

AirLoop This repo contains the source code for paper: Dasong Gao, Chen Wang, Sebastian Scherer. "AirLoop: Lifelong Loop Closure Detection." arXiv prep

Chen Wang 53 Jan 03, 2023
Reverse engineering recurrent neural networks with Jacobian switching linear dynamical systems

Reverse engineering recurrent neural networks with Jacobian switching linear dynamical systems This repository is the official implementation of Rever

6 Aug 25, 2022
BookMyShowPC - Movie Ticket Reservation App made with Tkinter

Book My Show PC What is this? Movie Ticket Reservation App made with Tkinter. Tk

The Nithin Balaji 3 Dec 09, 2022
Self-driving car env with PPO algorithm from stable baseline3

Self-driving car with RL stable baseline3 Most of the project develop from https://github.com/GerardMaggiolino/Gym-Medium-Post Please check it out! Th

Sornsiri.P 7 Dec 22, 2022
VGGFace2-HQ - A high resolution face dataset for face editing purpose

The first open source high resolution dataset for face swapping!!! A high resolution version of VGGFace2 for academic face editing purpose

Naiyuan Liu 232 Dec 29, 2022
Simple implementation of Mobile-Former on Pytorch

Simple-implementation-of-Mobile-Former At present, only the model but no trained. There may be some bug in the code, and some details may be different

Acheung 103 Dec 31, 2022
PyTorch implementation of 'Gen-LaneNet: a generalized and scalable approach for 3D lane detection'

(pytorch) Gen-LaneNet: a generalized and scalable approach for 3D lane detection Introduction This is a pytorch implementation of Gen-LaneNet, which p

Yuliang Guo 233 Jan 06, 2023
验证码识别 深度学习 tensorflow 神经网络

captcha_tf2 验证码识别 深度学习 tensorflow 神经网络 使用卷积神经网络,对字符,数字类型验证码进行识别,tensorflow使用2.0以上 目前项目还在更新中,诸多bug,欢迎提出issue和PR, 希望和你一起共同完善项目。 实例demo 训练过程 优化器选择: Adam

5 Apr 28, 2022