Research Artifact of USENIX Security 2022 Paper: Automated Side Channel Analysis of Media Software with Manifold Learning

Overview

Manifold-SCA

The repo is organized as:

📂manifold-sca
 ┣ 📂vulnerability
 ┃ ┣ 📂contribution
 ┃ ┣ 📜{dataset}-{program}-count.json
 ┃ ┗ 📜{program}.dis
 ┣ 📂code
 ┃ ┣ 📂SCA
 ┃ ┣ 📂tools
 ┃ ┗ 📂pp
 ┣ 📂audio
 ┗ 📂output

Code

We release our code in the folder code. The implementation of our framework is in code/SCA, and the tools we use to process input/output data are in code/tools. To launch Prime+Probe, use the code in code/pp.

Attack

To prepare the training data for learning the data manifold, first instrument the binary with the released pintool code/tools/pinatrace.cpp. When the binary processes a media input, the pintool produces a sequence of records of the form instruction address: accessed address. Then fold the sequence of accessed addresses into a matrix and convert the matrix to the format the framework expects (e.g., a tensor or a numpy array).
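The folding step can be sketched as follows. This is a minimal illustration, not the released preprocessing code: the line format accepted by the regex, the helper names `parse_trace` and `fold_addresses`, the square `side x side` shape, and zero padding are all assumptions made for the example.

```python
import re

# Assumed record format: "<instruction address>: <accessed address>" in hex.
# An optional R/W flag before the accessed address is tolerated.
LINE_RE = re.compile(r"^\s*(0x[0-9a-fA-F]+)\s*:\s*(?:[RW]\s+)?(0x[0-9a-fA-F]+)")


def parse_trace(lines):
    """Return (instruction_address, accessed_address) pairs in trace order."""
    records = []
    for line in lines:
        match = LINE_RE.match(line)
        if match:  # skip footers or any line that is not a record
            records.append((int(match.group(1), 16), int(match.group(2), 16)))
    return records


def fold_addresses(records, side, pad_value=0):
    """Fold the accessed addresses into a side x side matrix (list of rows).

    Traces longer than side * side are truncated; shorter ones are padded.
    """
    addrs = [addr for _, addr in records][: side * side]
    addrs += [pad_value] * (side * side - len(addrs))
    return [addrs[row * side:(row + 1) * side] for row in range(side)]
```

The resulting list of rows can be passed to `numpy.asarray` or `torch.tensor` to obtain the array or tensor form mentioned above.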

We release the scripts for training the framework in folder code/SCA. Before training, customize the data paths in each script. Training ends after 100 epochs and takes less than 24 hours on one Nvidia GeForce RTX 2080 GPU.

Localize

Recall that we localize vulnerabilities by pinpointing the records in a trace that contribute most to reconstructing media data. So, to perform localization, you first need to train the framework as described above.

After training the framework, run code/localize.py and code/pinpoint.py to localize records in a side channel trace. Note that this step yields several accessed addresses together with their indexes in the trace. You then need to look up the corresponding instruction addresses in the instrumentation output you generated when preparing the training data.
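The lookup from trace indexes back to instruction addresses could look like the sketch below. It assumes the instrumentation output has already been loaded as a list of (instruction address, accessed address) pairs in trace order; the function name `instructions_for_indexes` is hypothetical and not part of the released scripts.

```python
def instructions_for_indexes(records, indexes):
    """Map pinpointed trace indexes to their instruction addresses.

    records: list of (instruction_address, accessed_address) in trace order.
    indexes: positions in the trace reported by the localization step.
    Returns (index, instruction_address, accessed_address) tuples as hex
    strings; indexes outside the trace are skipped.
    """
    result = []
    for i in indexes:
        if 0 <= i < len(records):
            ip, addr = records[i]
            result.append((i, hex(ip), hex(addr)))
    return result


# Toy trace with three records; indexes 0 and 2 were pinpointed.
trace = [(0x400123, 0x7FFD0010), (0x400127, 0x7FFD0018), (0x40012B, 0x7FFD0020)]
hits = instructions_for_indexes(trace, [0, 2, 99])
```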

We release the localized vulnerabilities in folder vulnerability. In folder vulnerability/contribution, we list the instruction addresses of the records that contribute most to the reconstruction of media data. We further map the pinpointed instructions back to their enclosing functions, which we regard as side-channel vulnerable functions. We list the results in {dataset}-{program}-count.json, where a higher count indicates a higher likelihood of being vulnerable.
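One way to map pinpointed instructions to functions and count them is sketched below. It is an assumption-laden illustration: the symbol table is taken to be a list of (start address, function name) pairs, for instance extracted from a disassembly, and each instruction is attributed to the nearest function start at or below it. The name `count_functions` is hypothetical, and the layout of the released JSON files is not assumed here.

```python
import bisect
from collections import Counter


def count_functions(instr_addrs, functions):
    """Count how many pinpointed instructions fall into each function.

    instr_addrs: iterable of instruction addresses (ints).
    functions: list of (start_address, name); end addresses are not known,
    so an address belongs to the closest function starting at or before it.
    """
    funcs = sorted(functions)
    starts = [start for start, _ in funcs]
    counts = Counter()
    for ip in instr_addrs:
        pos = bisect.bisect_right(starts, ip) - 1
        if pos >= 0:  # addresses below the first function are ignored
            counts[funcs[pos][1]] += 1
    return counts


symbols = [(0x2000, "idct"), (0x1000, "decode")]
counts = count_functions([0x1004, 0x2010, 0x2020, 0x0500], symbols)
```

Sorting `counts.most_common()` then puts the functions with the most pinpointed instructions first.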

Although each program is evaluated on different datasets, the vulnerabilities localized in the same program are highly consistent across datasets.

Prime+Probe

We use Mastik to launch Prime+Probe on the L1 cache of Intel Xeon and AMD Ryzen CPUs. We release our scripts in folder code/pp.

The experiment runs on Linux. You first need to install taskset and cpuset.

We assume the victim and the spy are on the same CPU core and that no other process is running on that core. To isolate a CPU core, run sudo cset shield --cpu {cpu_id}.

Then run sudo cset shield --exec python run_pp.py -- {cpu_id} {segment_id}. Note that we separate the media data into several segments to speed up the side channel collection. code/pp/run_pp.py runs code/pp/pp_audio.py with taskset. code/pp/pp_audio.py is the coordinator, which runs the spy and the victim on the same CPU core simultaneously and saves the collected cache set accesses.
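Pinning two processes to one core with taskset can be sketched as below. This is not the released run_pp.py or pp_audio.py: the helper names and the spy and victim command lines are hypothetical, and only the `taskset -c <cpu> <command>` form is relied on.

```python
import subprocess


def pinned_command(cpu_id, command):
    """Prefix a command so that taskset pins it to a single CPU core."""
    return ["taskset", "-c", str(cpu_id)] + list(command)


def launch_pair(cpu_id, spy_cmd, victim_cmd):
    """Start spy and victim on the same core; stop the spy when the victim exits.

    spy_cmd and victim_cmd are placeholder command lists supplied by the caller.
    Returns the victim's exit code.
    """
    spy = subprocess.Popen(pinned_command(cpu_id, spy_cmd))
    try:
        victim = subprocess.Popen(pinned_command(cpu_id, victim_cmd))
        return victim.wait()
    finally:
        spy.terminate()
        spy.wait()


# Building the command does not execute anything.
cmd = pinned_command(3, ["./victim", "input.wav"])
```

Running both processes through the same prefix keeps them on the shielded core, which is the co-location assumption stated above.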

Audio

We upload all 2,552 audio clips reconstructed by our framework under Prime+Probe to folder audio/sc09-pp for result verification. Each file is named {Number}_{hash}_{index}.wav, where {Number} is the content of the corresponding reference input; e.g., for the reconstructed audio One_94de6a6a_nohash_1.wav, the number spoken in the reference input is one. As we reported in the paper, most (~80%) of the clips have contents (i.e., the numbers) consistent with the reference inputs.
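When checking the clips, the reference label can be recovered from the file name. The sketch below assumes the first underscore-separated token is the number and the last is the index, as in the example above; the function names and the `heard` dictionary of manually transcribed labels are hypothetical.

```python
from pathlib import Path


def parse_audio_name(filename):
    """Split '{Number}_{hash}_{index}.wav' into (number, hash, index)."""
    parts = Path(filename).stem.split("_")
    if len(parts) < 3:
        raise ValueError(f"unexpected file name: {filename}")
    # The hash part may itself contain underscores, so rejoin the middle.
    return parts[0].lower(), "_".join(parts[1:-1]), int(parts[-1])


def agreement(heard):
    """Fraction of clips whose heard number matches the file-name label.

    heard: dict mapping file name to the number a listener reports hearing.
    """
    if not heard:
        return 0.0
    matches = sum(
        1 for name, label in heard.items()
        if parse_audio_name(name)[0] == label.lower()
    )
    return matches / len(heard)


example = parse_audio_name("One_94de6a6a_nohash_1.wav")
```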

Output

We upload media data reconstructed by our framework in folder output.

Owner
Yuanyuan Yuan