Machine Learning Framework for Operating Systems - Brings ML to Linux kernel

Overview
logo

KML: A Machine Learning Framework for Operating Systems & Storage Systems

CircleCI codecov

Storage systems and their OS components are designed to accommodate a wide variety of applications and dynamic workloads. Storage components inside the OS contain various heuristic algorithms to provide high performance and adaptability for different workloads. These heuristics may be tunable via parameters, and some system calls allow users to optimize their system performance. These parameters are often predetermined based on experiments with limited applications and hardware. Thus, storage systems often run with these predetermined and possibly suboptimal values. Tuning these parameters manually is impractical: one needs an adaptive, intelligent system to handle dynamic and complex workloads. Machine learning (ML) techniques are capable of recognizing patterns, abstracting them, and making predictions on new data. ML can be a key component to optimize and adapt storage systems. We propose KML, an ML framework for operating systems & storage systems. We implemented a prototype and demonstrated its capabilities on the well-known problem of tuning optimal readahead values. Our results show that KML has a small memory footprint, introduces negligible overhead, and yet enhances throughput by as much as 2.3×.

For more information on the KML project, please see our papers

KML is under development by Ibrahim Umit Akgun of the File Systems and Storage Lab (FSL) at Stony Brook University under Professor Erez Zadok.

Table of Contents

Setup

Clone KML

# SSH
git clone --recurse-submodules [email protected]:sbu-fsl/kernel-ml.git

# HTTPS
git clone --recurse-submodules https://github.com/sbu-fsl/kernel-ml.git

Build Dependencies

KML depends on the following third-party repositories:

# Create and enter a directory for dependencies
mkdir dependencies
cd dependencies

# Clone repositories
git clone https://github.com/google/benchmark.git
git clone https://github.com/google/googletest.git

# Build google/benchmark
cd benchmark
mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Release ../
make
sudo make install

# Build google/googletest
cd ../googletest
mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Release ../
make
sudo make install
cd ../..

Install KML Linux Kernel Modifications

KML requires Linux kernel modifications to function. We recommend allocating at least 25 GiB of disk space before beginning the installation process.

  1. Navigate to the kernel-ml/kernel-ml-linux directory. This repository was recursively cloned during setup
    cd kernel-ml-linux
  2. Install the following packages
    git fakeroot build-essential ncurses-dev xz-utils libssl-dev bc flex libelf-dev bison
    
  3. Install the modified kernel as normal. No changes are required for make menuconfig
    cp /boot/config-$(uname -r) .config
    make menuconfig
    make -j$(nproc)
    sudo make modules_install -j$(nproc)
    sudo make install -j$(nproc)
  4. Restart your machine
    sudo reboot
    
  5. Confirm that you now have Linux version 4.19.51+ installed
    uname -a

Specify Kernel Header Location

Edit kernel-ml/cmake/FindKernelHeaders.cmake to specify the absolute path to the aforementioned kernel-ml/kernel-ml-linux directory. For example, if kernel-ml-linux lives in /home/kernel-ml/kernel-ml-linux:

...

# Find the headers
find_path(KERNELHEADERS_DIR
        include/linux/user.h
        PATHS /home/kernel-ml/kernel-ml-linux
)

...

Build KML

# Create a build directory for KML
mkdir build
cd build 
cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_FLAGS="-Werror" ..
make

Double Check

In order to check everything is OK, we can run tests and benchmarks.

cd build
ctest --verbose

Design

kernel-design

Example

Citing KML

To cite this repository:

@TECHREPORT{umit21kml-tr,
  AUTHOR =       "Ibrahim Umit Akgun and Ali Selman Aydin and Aadil Shaikh and Lukas Velikov and Andrew Burford and Michael McNeill and Michael Arkhangelskiy and Erez Zadok",
  TITLE =        "KML: Using Machine Learning to Improve Storage Systems",
  INSTITUTION =  "Computer Science Department, Stony Brook University",
  YEAR =         "2021",
  MONTH =        "Nov",
  NUMBER =       "FSL-21-02",
}
@INPROCEEDINGS{hotstorage21kml,
  TITLE =        "A Machine Learning Framework to Improve Storage System Performance",
  AUTHOR =       "Ibrahim 'Umit' Akgun and Ali Selman Aydin and Aadil Shaikh and Lukas Velikov and Erez Zadok",
  NOTE =         "To appear",
  BOOKTITLE =    "HotStorage '21: Proceedings of the 13th ACM Workshop on Hot Topics in Storage",
  MONTH =        "July",
  YEAR =         "2021",
  PUBLISHER =    "ACM",
  ADDRESS =      "Virtual",
  KEY =          "HOTSTORAGE 2021",
}
You might also like...
Self-Supervised Learning with Kernel Dependence Maximization

Self-Supervised Learning with Kernel Dependence Maximization This is the code for SSL-HSIC, a self-supervised learning loss proposed in the paper Self

⚡ Fast • 🪶 Lightweight • 0️⃣ Dependency • 🔌 Pluggable • 😈 TLS interception • 🔒 DNS-over-HTTPS • 🔥 Poor Man's VPN • ⏪ Reverse & ⏩ Forward • 👮🏿 Machine Learning From Scratch. Bare bones NumPy implementations of machine learning models and algorithms with a focus on accessibility. Aims to cover everything from linear regression to deep learning.
Machine Learning From Scratch. Bare bones NumPy implementations of machine learning models and algorithms with a focus on accessibility. Aims to cover everything from linear regression to deep learning.

Machine Learning From Scratch About Python implementations of some of the fundamental Machine Learning models and algorithms from scratch. The purpose

Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning.
Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning.

This is the Vowpal Wabbit fast online learning code. Why Vowpal Wabbit? Vowpal Wabbit is a machine learning system which pushes the frontier of machin

Code for Mesh Convolution Using a Learned Kernel Basis

Mesh Convolution This repository contains the implementation (in PyTorch) of the paper FULLY CONVOLUTIONAL MESH AUTOENCODER USING EFFICIENT SPATIALLY

(CVPR 2021) PAConv: Position Adaptive Convolution with Dynamic Kernel Assembling on Point Clouds
(CVPR 2021) PAConv: Position Adaptive Convolution with Dynamic Kernel Assembling on Point Clouds

PAConv: Position Adaptive Convolution with Dynamic Kernel Assembling on Point Clouds by Mutian Xu*, Runyu Ding*, Hengshuang Zhao, and Xiaojuan Qi. Int

Exploring Image Deblurring via Blur Kernel Space (CVPR'21)
Exploring Image Deblurring via Blur Kernel Space (CVPR'21)

Exploring Image Deblurring via Encoded Blur Kernel Space About the project We introduce a method to encode the blur operators of an arbitrary dataset

tinykernel - A minimal Python kernel so you can run Python in your Python

tinykernel - A minimal Python kernel so you can run Python in your Python

Official PyTorch code for Mutual Affine Network for Spatially Variant Kernel Estimation in Blind Image Super-Resolution (MANet, ICCV2021)
Official PyTorch code for Mutual Affine Network for Spatially Variant Kernel Estimation in Blind Image Super-Resolution (MANet, ICCV2021)

Mutual Affine Network for Spatially Variant Kernel Estimation in Blind Image Super-Resolution (MANet, ICCV2021) This repository is the official PyTorc

Releases(v0.0.1)
Owner
File systems and Storage Lab (FSL)
Researchers and students in the FSL group perform research in operating systems with focus on file systems, storage, security, and networking.
File systems and Storage Lab (FSL)
Code accompanying our paper Feature Learning in Infinite-Width Neural Networks

Empirical Experiments in "Feature Learning in Infinite-width Neural Networks" This repo contains code to replicate our experiments (Word2Vec, MAML) in

Edward Hu 37 Dec 14, 2022
A tool to prepare websites grabbed with wget for local viewing.

makelocal A tool to prepare websites grabbed with wget for local viewing. exapmples After fetching xkcd.com with: wget -r -no-remove-listing -r -N --p

5 Apr 23, 2022
Yet another video caption

Yet another video caption

Fan Zhimin 5 May 26, 2022
This repository contains code released by Google Research.

This repository contains code released by Google Research.

Google Research 26.6k Dec 31, 2022
Official code for the paper "Self-Supervised Prototypical Transfer Learning for Few-Shot Classification"

Self-Supervised Prototypical Transfer Learning for Few-Shot Classification This repository contains the reference source code and pre-trained models (

EPFL INDY 44 Nov 04, 2022
An updated version of virtual model making

Model-Swap-Face v2   这个项目是基于stylegan2 pSp制作的,比v1版本Model-Swap-Face在推理速度和图像质量上有一定提升。主要的功能是将虚拟模特进行环球不同区域的风格转换,目前转换器提供西欧模特、东亚模特和北非模特三种主流的风格样式,可帮我们实现生产资料零成

seeprettyface.com 62 Dec 09, 2022
Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk

Annoy Annoy (Approximate Nearest Neighbors Oh Yeah) is a C++ library with Python bindings to search for points in space that are close to a given quer

Spotify 10.6k Jan 04, 2023
Co-GAIL: Learning Diverse Strategies for Human-Robot Collaboration

CoGAIL Table of Content Overview Installation Dataset Training Evaluation Trained Checkpoints Acknowledgement Citations License Overview This reposito

Jeremy Wang 29 Dec 24, 2022
Codes to pre-train T5 (Text-to-Text Transfer Transformer) models pre-trained on Japanese web texts

t5-japanese Codes to pre-train T5 (Text-to-Text Transfer Transformer) models pre-trained on Japanese web texts. The following is a list of models that

Kimio Kuramitsu 1 Dec 13, 2021
Episodic-memory - Ego4D Episodic Memory Benchmark

Ego4D Episodic Memory Benchmark EGO4D is the world's largest egocentric (first p

3 Feb 18, 2022
The official implementation of VAENAR-TTS, a VAE based non-autoregressive TTS model.

VAENAR-TTS This repo contains code accompanying the paper "VAENAR-TTS: Variational Auto-Encoder based Non-AutoRegressive Text-to-Speech Synthesis". Sa

THUHCSI 138 Oct 28, 2022
Lightweight tool to perform MITM attack on local network

ARPSpy - A lightweight tool to perform MITM attack Using many library to perform ARP Spoof and auto-sniffing HTTP packet containing credential. (Never

MinhItachi 8 Aug 28, 2022
Modular Gaussian Processes

Modular Gaussian Processes for Transfer Learning 🧩 Introduction This repository contains the implementation of our paper Modular Gaussian Processes f

Pablo Moreno-Muñoz 10 Mar 15, 2022
Twin-deep neural network for semi-supervised learning of materials properties

Deep Semi-Supervised Teacher-Student Material Synthesizability Prediction Citation: Semi-supervised teacher-student deep neural network for materials

MLEG 3 Dec 14, 2022
Convert weight file.pth to weight file.blob

CONVERT YOUR MODEL TO IR FORMAT INSTALLATION OpenVino Toolkit Download openvinotoolkit 2021.3 version : Link Instruction of installation : Link Pytorc

Tran Anh Tuan 3 Nov 18, 2021
PyTorch implementation for "Mining Latent Structures with Contrastive Modality Fusion for Multimedia Recommendation"

MIRCO PyTorch implementation for paper: Latent Structures Mining with Contrastive Modality Fusion for Multimedia Recommendation Dependencies Python 3.

Big Data and Multi-modal Computing Group, CRIPAC 9 Dec 08, 2022
Face Mask Detection is a project to determine whether someone is wearing mask or not, using deep neural network.

face-mask-detection Face Mask Detection is a project to determine whether someone is wearing mask or not, using deep neural network. It contains 3 scr

amirsalar 13 Jan 18, 2022
BEAS: Blockchain Enabled Asynchronous & Secure Federated Machine Learning

BEAS Blockchain Enabled Asynchronous and Secure Federated Machine Learning Default Network Configuration: The default application uses the HyperLedger

Harpreet Virk 11 Nov 20, 2022
Python suite to construct benchmark machine learning datasets from the MIMIC-III clinical database.

MIMIC-III Benchmarks Python suite to construct benchmark machine learning datasets from the MIMIC-III clinical database. Currently, the benchmark data

Chengxi Zang 6 Jan 02, 2023
PyTorch implementation of some learning rate schedulers for deep learning researcher.

pytorch-lr-scheduler PyTorch implementation of some learning rate schedulers for deep learning researcher. Usage WarmupReduceLROnPlateauScheduler Visu

Soohwan Kim 59 Dec 08, 2022