ProFuzzBench - A Benchmark for Stateful Protocol Fuzzing

Overview

ProFuzzBench - A Benchmark for Stateful Protocol Fuzzing

ProFuzzBench is a benchmark for stateful fuzzing of network protocols. It includes a suite of representative open-source network servers for popular protocols (e.g., TLS, SSH, SMTP, FTP, SIP), and tools to automate experimentation.

Citing ProFuzzBench

ProFuzzBench has been accepted for publication as a Tool Demonstrations paper at the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA) 2021.

@inproceedings{profuzzbench,
  title={ProFuzzBench: A Benchmark for Stateful Protocol Fuzzing},
  author={Roberto Natella and Van-Thuan Pham},
  booktitle={Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis},
  year={2021}
}

Folder structure

protocol-fuzzing-benchmark
├── subjects: this folder contains all protocols included in this benchmark and
│   │         each protocol may have more than one target server
│   └── RTSP
│   └── FTP
│   │   └── LightFTP
│   │       └── Dockerfile: subject-specific Dockerfile
│   │       └── run.sh: (subject-specific) main script to run experiment inside a container
│   │       └── cov_script.sh: (subject-specific) script to do code coverage analysis
│   │       └── other files (e.g., patches, other subject-specific scripts)
│   └── ...
└── scripts: this folder contains all scripts to run experiments, collect & analyze results
│   └── execution
│   │   └── profuzzbench_exec_common.sh: main script to spawn containers and run experiments on them
│   │   └── ...
│   └── analysis
│       └── profuzzbench_generate_csv.sh: this script collect code coverage results from different runs
│       └── profuzzbench_plot.py: sample script for plotting
└── README.md

Tutorial - Fuzzing LightFTP server with AFLNet and AFLnwe, a network-enabled version of AFL

Follow the steps below to run and collect experimental results for LightFTP, which is a lightweight File Transfer Protocol (FTP) server. The similar steps should be followed to run experiments on other subjects. Each subject program comes with a README.md file showing subject-specific commands to run experiments.

Step-0. Set up environmental variables

git clone https://github.com/profuzzbench/profuzzbench.git
cd profuzzbench
export PFBENCH=$(pwd)
export PATH=$PATH:$PFBENCH/scripts/execution:$PFBENCH/scripts/analysis

Step-1. Build a docker image

The following commands create a docker image tagged lightftp. The image should have everything available for fuzzing and code coverage collection.

cd $PFBENCH
cd subjects/FTP/LightFTP
docker build . -t lightftp

Step-2. Run fuzzing

Run profuzzbench_exec_common.sh script to start an experiment. The script takes 8 arguments as listed below.

  • 1st argument (DOCIMAGE) : name of the docker image
  • 2nd argument (RUNS) : number of runs, one isolated Docker container is spawned for each run
  • 3rd argument (SAVETO) : path to a folder keeping the results
  • 4th argument (FUZZER) : fuzzer name (e.g., aflnet) -- this name must match the name of the fuzzer folder inside the Docker container (e.g., /home/ubuntu/aflnet)
  • 5th argument (OUTDIR) : name of the output folder created inside the docker container
  • 6th argument (OPTIONS) : all options needed for fuzzing in addition to the standard options written in the target-specific run.sh script
  • 7th argument (TIMEOUT) : time for fuzzing in seconds
  • 8th argument (SKIPCOUNT): used for calculating coverage over time. e.g., SKIPCOUNT=5 means we run gcovr after every 5 test cases because gcovr takes time and we do not want to run it after every single test case

The following commands run 4 instances of AFLNet and 4 instances of AFLnwe to simultaenously fuzz LightFTP in 60 minutes.

cd $PFBENCH
mkdir results-lightftp

profuzzbench_exec_common.sh lightftp 4 results-lightftp aflnet out-lightftp-aflnet "-P FTP -D 10000 -q 3 -s 3 -E -K -c ./ftpclean.sh" 3600 5 &
profuzzbench_exec_common.sh lightftp 4 results-lightftp aflnwe out-lightftp-aflnwe "-D 10000 -K -c ./ftpclean.sh" 3600 5

If the script runs successfully, its output should look similar to the text below.

AFLNET: Fuzzing in progress ...
AFLNET: Waiting for the following containers to stop:  f2da4b72b002 b7421386b288 cebbbc741f93 5c54104ddb86
AFLNET: Collecting results and save them to results-lightftp
AFLNET: Collecting results from container f2da4b72b002
AFLNET: Collecting results from container b7421386b288
AFLNET: Collecting results from container cebbbc741f93
AFLNET: Collecting results from container 5c54104ddb86
AFLNET: I am done!

Step-3. Collect the results

All results (in tar files) should be stored in the folder created in Step-2 (results-lightftp). Specifically, these tar files are the compressed version of output folders produced by all fuzzing instances. If the fuzzer is afl based (e.g., AFLNet, AFLnwe) each folder should contain sub-folders like crashes, hangs, queue and so on. Use profuzzbench_generate_csv.sh script to collect results in terms of code coverage over time. The script takes 5 arguments as listed below.

  • 1st argument (PROG) : name of the subject program (e.g., lightftp)
  • 2nd argument (RUNS) : number of runs
  • 3rd argument (FUZZER) : fuzzer name (e.g., aflnet)
  • 4th argument (COVFILE): CSV-formatted output file keeping the results
  • 5th argument (APPEND) : append mode; set this to 0 for the first fuzzer and 1 for the subsequent fuzzer(s).

The following commands collect the code coverage results produced by AFLNet and AFLnwe and save them to results.csv.

cd $PFBENCH/results-lightftp

profuzzbench_generate_csv.sh lightftp 4 aflnet results.csv 0
profuzzbench_generate_csv.sh lightftp 4 aflnwe results.csv 1

The results.csv file should look similar to text below. The file has six columns showing the timestamp, subject program, fuzzer name, run index, coverage type and its value. The file contains both line coverage and branch coverage over time information. Each coverage type comes with two values, in percentage (_per) and in absolute number (_abs).

time,subject,fuzzer,run,cov_type,cov
1600905795,lightftp,aflnwe,1,l_per,25.9
1600905795,lightftp,aflnwe,1,l_abs,292
1600905795,lightftp,aflnwe,1,b_per,13.3
1600905795,lightftp,aflnwe,1,b_abs,108
1600905795,lightftp,aflnwe,1,l_per,25.9
1600905795,lightftp,aflnwe,1,l_abs,292

Step-4. Analyze the results

The results collected in step 3 (i.e., results.csv) can be used for plotting. For instance, we provide a sample Python script to plot code coverage over time. Use the following command to plot the results and save it to a file.

cd $PFBENCH/results-lightftp

profuzzbench_plot.py -i results.csv -p lightftp -r 4 -c 60 -s 1 -o cov_over_time.png

This is a sample code coverage report generated by the script. Sample report

Parallel builds

To speed-up the build of Docker images, you can pass the option "-j" to make, using the MAKE_OPT environment variable and the --build-arg option of docker build. Example:

export MAKE_OPT="-j4"
docker build . -t lightftp --build-arg MAKE_OPT

FAQs

1. How do I extend ProFuzzBench?

If you want to add a new protocol and/or a new target server (of a supported protocol), please follow the above folder structure and complete the steps below. We use LightFTP as an example.

Step-1. Create a new folder containing the protocol/target server

The folder for LightFTP server is subjects/FTP/LightFTP.

Step-2. Write a Docker file for the new target server and prepare all the subject-specific scripts/files (e.g., target-specific patch, seed corpus)

The following folder structure shows all files we have prepared for fuzzing LightFTP server. Please read our paper to understand the purposes of these files.

subjects/FTP/LightFTP
├── Dockerfile (required): based on this, a target-specific Docker image is built (See Step-1 in the tutorial)
├── run.sh (required): main script to run experiment inside a container
├── cov_script.sh (required): script to do code coverage analysis
├── clean.sh (optional): script to clean server states before fuzzing to improve the stability
├── fuzzing.patch (optional): code changes needed to improve fuzzing results (e.g., remove randomness)
├── gcov.patch (required): code changes needed to support code coverage analysis (e.g., enable gcov, add a signal handler)
├── ftp.dict (optional): a dictionary containing protocol-specific tokens/keywords to support fuzzing
└── in-ftp (required): a seed corpus capturing sequences of client requests sent to the server under test.
│   │       To prepare these seeds, please follow the AFLNet tutorial at https://github.com/aflnet/aflnet.
│   │       Please use ".raw" extension for all seed inputs.
│   │
│   └── ftp_requests_full_anonymous.raw
│   └── ftp_requests_full_normal.raw
└── README.md (optional): a target-specific README containing commands to run experiments

All the required files (i.e., Dockerfile, run.sh, cov_script.sh, gcov.patch, and the seed corpus) follow some templates so that one can easily follow them to prepare files for a new target.

Step-3. Test your new target server

Once a Docker image is successfully built, you should test your commands, as written in a target-specific README.md, inside a single Docker container. For example, we run the following commands to check if everything is working for LightFTP.

//start a container
docker run -it lightftp /bin/bash

//inside the docker container
//run a 60-min fuzzing experiment using AFLNet
cd experiments
run aflnet out-lightftp-aflnet "-P FTP -D 10000 -q 3 -s 3 -E -K -c ./ftpclean.sh" 3600 5

If everything works, there should be no error messages and all the results are stored inside the out-lightftp-aflnet folder.

2. My experiment "hangs". What could be the reason(s)?

Each experiment has two parts: fuzzing and code coverage analysis. The fuzzing part should complete after the specified timeout; however, the code coverage analysis time is subject-specific and it could take several hours if the generated corpus is large or the target server is slow. You can log into the running containers to check the progress if you think your experiment hangs.

Implementation of paper "Graph Condensation for Graph Neural Networks"

GCond A PyTorch implementation of paper "Graph Condensation for Graph Neural Networks" Code will be released soon. Stay tuned :) Abstract We propose a

Wei Jin 66 Dec 04, 2022
Object detection GUI based on PaddleDetection

PP-Tracking GUI界面测试版 本项目是基于飞桨开源的实时跟踪系统PP-Tracking开发的可视化界面 在PaddlePaddle中加入pyqt进行GUI页面研发,可使得整个训练过程可视化,并通过GUI界面进行调参,模型预测,视频输出等,通过多种类型的识别,简化整体预测流程。 GUI界面

杨毓栋 68 Jan 02, 2023
A new data augmentation method for extreme lighting conditions.

Random Shadows and Highlights This repo has the source code for the paper: Random Shadows and Highlights: A new data augmentation method for extreme l

Osama Mazhar 35 Nov 26, 2022
QuALITY: Question Answering with Long Input Texts, Yes!

QuALITY: Question Answering with Long Input Texts, Yes! Authors: Richard Yuanzhe Pang,* Alicia Parrish,* Nitish Joshi,* Nikita Nangia, Jason Phang, An

ML² AT CILVR 61 Jan 02, 2023
CNN visualization tool in TensorFlow

tf_cnnvis A blog post describing the library: https://medium.com/@falaktheoptimist/want-to-look-inside-your-cnn-we-have-just-the-right-tool-for-you-ad

InFoCusp 778 Jan 02, 2023
A gesture recognition system powered by OpenPose, k-nearest neighbours, and local outlier factor.

OpenHands OpenHands is a gesture recognition system powered by OpenPose, k-nearest neighbours, and local outlier factor. Currently the system can iden

Paul Treanor 12 Jan 10, 2022
ppo_pytorch_cpp - an implementation of the proximal policy optimization algorithm for the C++ API of Pytorch

PPO Pytorch C++ This is an implementation of the proximal policy optimization algorithm for the C++ API of Pytorch. It uses a simple TestEnvironment t

Martin Huber 59 Dec 09, 2022
Library for machine learning stacking generalization.

stacked_generalization Implemented machine learning *stacking technic[1]* as handy library in Python. Feature weighted linear stacking is also availab

114 Jul 19, 2022
Black box hyperparameter optimization made easy.

BBopt BBopt aims to provide the easiest hyperparameter optimization you'll ever do. Think of BBopt like Keras (back when Theano was still a thing) for

Evan Hubinger 70 Nov 03, 2022
An pytorch implementation of Masked Autoencoders Are Scalable Vision Learners

An pytorch implementation of Masked Autoencoders Are Scalable Vision Learners This is a coarse version for MAE, only make the pretrain model, the fine

FlyEgle 214 Dec 29, 2022
The end-to-end platform for building voice products at scale

Picovoice Made in Vancouver, Canada by Picovoice Picovoice is the end-to-end platform for building voice products on your terms. Unlike Alexa and Goog

Picovoice 318 Jan 07, 2023
Partial implementation of ODE-GAN technique from the paper Training Generative Adversarial Networks by Solving Ordinary Differential Equations

ODE GAN (Prototype) in PyTorch Partial implementation of ODE-GAN technique from the paper Training Generative Adversarial Networks by Solving Ordinary

Somshubra Majumdar 15 Feb 10, 2022
A small fun project using python OpenCV, mediapipe, and pydirectinput

Here I tried a small fun project using python OpenCV, mediapipe, and pydirectinput. Here we can control moves car game when yellow color come to right box (press key 'd') left box (press key 'a') lef

Sameh Elisha 3 Nov 17, 2022
Language Models Can See: Plugging Visual Controls in Text Generation

Language Models Can See: Plugging Visual Controls in Text Generation Authors: Yixuan Su, Tian Lan, Yahui Liu, Fangyu Liu, Dani Yogatama, Yan Wang, Lin

Yixuan Su 195 Dec 22, 2022
Reinforcement Learning for Automated Trading

Reinforcement Learning for Automated Trading This thesis has been realized for the obtention of the Master's in Mathematical Engineering at the Polite

Pierpaolo Necchi 80 Jun 19, 2022
Pretty Tensor - Fluent Neural Networks in TensorFlow

Pretty Tensor provides a high level builder API for TensorFlow. It provides thin wrappers on Tensors so that you can easily build multi-layer neural networks.

Google 1.2k Dec 29, 2022
patchmatch和patchmatchstereo算法的python实现

patchmatch patchmatch以及patchmatchstereo算法的python版实现 patchmatch参考 github patchmatchstereo参考李迎松博士的c++版代码 由于patchmatchstereo没有做任何优化,并且是python的代码,主要是方便解析算

Sanders Bao 11 Dec 02, 2022
DeepLabv3+:Encoder-Decoder with Atrous Separable Convolution语义分割模型在tensorflow2当中的实现

DeepLabv3+:Encoder-Decoder with Atrous Separable Convolution语义分割模型在tensorflow2当中的实现 目录 性能情况 Performance 所需环境 Environment 注意事项 Attention 文件下载 Download

Bubbliiiing 31 Nov 25, 2022
TACTO: A Fast, Flexible and Open-source Simulator for High-Resolution Vision-based Tactile Sensors

TACTO: A Fast, Flexible and Open-source Simulator for High-Resolution Vision-based Tactile Sensors This package provides a simulator for vision-based

Facebook Research 255 Dec 27, 2022
SPEAR: Semi suPErvised dAta progRamming

Semi-Supervised Data Programming for Data Efficient Machine Learning SPEAR is a library for data programming with semi-supervision. The package implem

decile-team 91 Dec 06, 2022