Predicting Keystrokes using an Audio Side-Channel Attack and Machine Learning

This repository contains my MSc Computer Science research project, 'Predicting Keystrokes using an Audio Side-Channel Attack and Machine Learning'. The thesis and findings from this research can be found here.

Audio side-channel attacks are a growing security concern for ‘keystroke snooping’, in which an attacker uses the acoustic emanation of a keystroke to predict the specific key (or contextual passage of keys) being pressed. This can potentially be used to gather a user’s private data if keystroke audio can be captured discreetly.

In this project, Python code has been written to analyse the acoustic and geometric features of a keystroke signal, and these features are used to classify keystroke emanations accurately. A combination of MFCC (Mel-frequency cepstral coefficient) and TDoA (time difference of arrival) features is shown to provide superior classification results compared to other input features.
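As a rough illustration of the geometric (TDoA) feature, the sketch below estimates the delay between two microphone channels from the peak of their cross-correlation. It is a minimal example under stated assumptions, not the project's actual code: the function name `estimate_tdoa`, the synthetic damped-sine "keystroke" and the 44.1 kHz sample rate are all hypothetical. MFCC extraction is not shown here.

```python
import numpy as np


def estimate_tdoa(left, right, sample_rate):
    """Estimate the delay of `right` relative to `left`, in seconds.

    A positive result means the sound reached the right channel later.
    (Hypothetical helper for illustration only.)
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    # Full cross-correlation; output index 0 corresponds to a lag of
    # -(len(left) - 1) samples.
    corr = np.correlate(right, left, mode="full")
    lag_samples = int(np.argmax(corr)) - (len(left) - 1)
    return lag_samples / sample_rate


# Synthetic keystroke: a short damped sine burst (assumed shape).
sample_rate = 44100
n = np.arange(40)
pulse = np.exp(-n / 8.0) * np.sin(2 * np.pi * 0.2 * n)

left = np.zeros(200)
right = np.zeros(200)
left[50:90] = pulse    # burst arrives at the left microphone first
right[57:97] = pulse   # same burst, 7 samples later on the right

tdoa_seconds = estimate_tdoa(left, right, sample_rate)
```

A delay like this, measured per keystroke, could be appended to the acoustic feature vector so that the classifier sees where on the keyboard the sound originated as well as what it sounded like.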

A novel attack is presented which uses cross-prediction techniques on a stereo array of microphones to increase keystroke recognition accuracy. Cross-prediction increases single-character recovery of keystrokes by 7% when using a supervised Random Forest machine learning model. A Random Forest classifier achieves up to 89% inter-dataset single-character recovery on a 40-key classification problem.
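This README does not spell out the cross-prediction procedure, so the following is only one plausible reading: a classifier scores each keystroke separately on the left and right channels, and the two sets of class probabilities are averaged before picking a key (soft voting). The function name `fuse_channel_predictions` and the example probabilities are hypothetical. In practice the probability rows might come from a method such as scikit-learn's `RandomForestClassifier.predict_proba`.

```python
import numpy as np


def fuse_channel_predictions(proba_left, proba_right, classes):
    """Average per-channel class probabilities and return the likeliest keys.

    Each input has shape (n_keystrokes, n_classes). This is plain soft
    voting, assumed here as a stand-in for the thesis's cross-prediction.
    """
    proba_left = np.asarray(proba_left, dtype=float)
    proba_right = np.asarray(proba_right, dtype=float)
    if proba_left.shape != proba_right.shape:
        raise ValueError("both channels must score the same keystrokes and classes")
    mean_proba = (proba_left + proba_right) / 2.0
    return [classes[i] for i in mean_proba.argmax(axis=1)]


# Made-up probabilities for two keystrokes over three candidate keys.
classes = ["a", "s", "d"]
proba_left = [[0.5, 0.4, 0.1], [0.2, 0.3, 0.5]]
proba_right = [[0.2, 0.7, 0.1], [0.1, 0.3, 0.6]]

predicted = fuse_channel_predictions(proba_left, proba_right, classes)
```

In the first keystroke the left channel alone would pick "a", but the right channel's stronger evidence for "s" wins after averaging. This is the kind of correction a second microphone can provide.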

User experiments are also conducted to evaluate the model in real-world scenarios. In the experiments, up to 85% keystroke recovery from contextual arguments was achieved on a 26-key classification problem using a Random Forest classifier. Keystroke recovery can increase by as much as 15% when utilising cross-prediction methods on contextual sentences. Contextual arguments were best predicted when using a user-created database of keystroke emanations.

This research shows that different users emit distinct sonic fingerprints when typing on the same keyboard. Provided that a database of labelled keystrokes can be collected from a user, a supervised attack remains feasible in real-world scenarios.
