Companion repo of the UCC 2021 paper "Predictive Auto-scaling with OpenStack Monasca"

Overview

Predictive Auto-scaling with OpenStack Monasca

Giacomo Lanciano*, Filippo Galli, Tommaso Cucinotta, Davide Bacciu, Andrea Passarella
2021 IEEE/ACM 14th International Conference on Utility and Cloud Computing (UCC)

Abstract: Cloud auto-scaling mechanisms are typically based on reactive automation rules that scale a cluster whenever some metric, e.g., the average CPU usage among instances, exceeds a predefined threshold. Tuning these rules becomes particularly cumbersome when scaling-up a cluster involves non-negligible times to bootstrap new instances, as it happens frequently in production cloud services.
To deal with this problem, we propose an architecture for auto-scaling cloud services based on the status in which the system is expected to evolve in the near future. Our approach leverages on time-series forecasting techniques, like those based on machine learning and artificial neural networks, to predict the future dynamics of key metrics, e.g., resource consumption metrics, and apply a threshold-based scaling policy on them. The result is a predictive automation policy that is able, for instance, to automatically anticipate peaks in the load of a cloud application and trigger ahead of time appropriate scaling actions to accommodate the expected increase in traffic.
We prototyped our approach as an open-source OpenStack component, which relies on, and extends, the monitoring capabilities offered by Monasca, resulting in the addition of predictive metrics that can be leveraged by orchestration components like Heat or Senlin. We show experimental results using a recurrent neural network and a multi-layer perceptron as predictor, which are compared with a simple linear regression and a traditional non-predictive auto-scaling policy. However, the proposed framework allows for the easy customization of the prediction policy as needed.

DOI: 10.1145/3468737.3494104

arXiv: 2111.02133

* contact author

Requirements

In what follows, we provide instructions to install the required dependencies, assuming a setup that is similar to our testing environment.

The test-bed used for our experiments is a Dell R630 dual-socket, equipped with: 2 Intel Xeon E5-2640 v4 CPUs (2.40 GHz, 20 virtual cores each); 64 GB of RAM; Ubuntu 20.04.2 LTS operating system; version 4.15.0-122-generic of the Linux kernel.

Data

The data used for this work are publicly available. We recommend using our utility to automatically download, decompress and place such data in the location expected by our tools. To do that, make sure the required dependencies are installed by running

apt-get install pbzip2 tar wget

To start the download utility, run make data from the root of this repo. Once the download terminates, the following files are placed under data/:

| File | Description |
|---|---|
| amphora-x64-haproxy.qcow2 | Image used to create Octavia amphorae |
| distwalk-{lin,mlp,rnn,stc}-<INCREMENTAL-ID>.log | distwalk run log |
| distwalk-{lin,mlp,rnn,stc}-<INCREMENTAL-ID>-pred.json | Predictive metric data exported from Monasca DB |
| distwalk-{lin,mlp,rnn,stc}-<INCREMENTAL-ID>-real.json | Actual metric data exported from Monasca DB |
| distwalk-{lin,mlp,rnn,stc}-<INCREMENTAL-ID>-times.csv | Client-side response time for each request sent during a run |
| model_dumps/* | Dumps of the models and data scalers used for the validation |
| predictor.log | monasca-predictor log |
| predictor-times.log | monasca-predictor log (timing info only) |
| predictor-times-{lin,mlp,rnn}.{csv,log} | monasca-predictor log (timing info only, grouped by predictor) |
| super_steep_behavior.csv | Dataset used to train MLP and RNN models |
| test_behavior_02_distwalk-6t_last100.dat | distwalk load trace |
| ubuntu-20.04-min-distwalk.img | Image used to create Nova instances for the scaling group |
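The naming convention of the per-run files listed above can be checked programmatically. The following is a minimal sketch, not part of this repo: the names RUN_FILE_RE, RunFile and parse_run_file are hypothetical, and the format of <INCREMENTAL-ID> is assumed to contain neither hyphens nor dots.

```python
import re
from typing import NamedTuple, Optional

# Pattern derived from the file table above. ASSUMPTION: <INCREMENTAL-ID>
# contains neither "-" nor "." (its exact format is not documented here).
RUN_FILE_RE = re.compile(
    r"^distwalk-(?P<model>lin|mlp|rnn|stc)-(?P<run_id>[^-.]+)"
    r"(?:-(?P<kind>pred|real|times))?\.(?P<ext>log|json|csv)$"
)


class RunFile(NamedTuple):
    model: str   # predictor type: lin, mlp, rnn or stc
    run_id: str  # the <INCREMENTAL-ID> part of the name
    kind: str    # "pred", "real", "times", or "log" for the plain run log
    ext: str     # file extension without the dot


def parse_run_file(name: str) -> Optional[RunFile]:
    """Return the parts of a per-run file name, or None if it does not match."""
    m = RUN_FILE_RE.match(name)
    if m is None:
        return None
    return RunFile(m["model"], m["run_id"], m["kind"] or "log", m["ext"])
```

For instance, parse_run_file can be applied to every entry of os.listdir("data") to group the files of a run by their ID; files such as predictor.log simply yield None.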

Python

To be able to run all the parts of this work, the following Python versions must be installed:

| Version | Usage |
|---|---|
| 3.7.10 | To run monasca-predictor |
| 3.8.5 | To install OpenStack (with Kolla) and run the Python code included in this repo |

Consider using a tool like pyenv to easily install and manage multiple Python versions on the same system.
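To verify which interpreter is active before running a given component, a small check like the one below may help. It is only a sketch: the REQUIRED mapping and the version_matches helper are hypothetical names, and comparing only major.minor by default is an assumption (the table above lists exact patch versions).

```python
import sys

# Versions taken from the table above; the dictionary keys are illustrative labels.
REQUIRED = {
    "monasca-predictor": (3, 7, 10),
    "repo code and Kolla install": (3, 8, 5),
}


def version_matches(required, actual=None, exact_patch=False):
    """Compare a required (major, minor, patch) tuple with the running interpreter.

    ASSUMPTION: by default only major.minor is compared; pass exact_patch=True
    to also require the same patch level.
    """
    actual = tuple(sys.version_info[:3]) if actual is None else tuple(actual)
    n = 3 if exact_patch else 2
    return actual[:n] == tuple(required)[:n]


if __name__ == "__main__":
    for usage, version in REQUIRED.items():
        status = "ok" if version_matches(version) else "different interpreter"
        print(f"{usage}: needs {'.'.join(map(str, version))} -> {status}")
```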

OpenStack

The Victoria release of OpenStack is required to run our predictive auto-scaling strategy. On top of the other core OpenStack services, we rely on the following:

  • Heat
  • Monasca
  • Nova
  • Octavia
  • Senlin

Follow the OpenStack documentation to install the required services.

Alternatively, this repo includes (under openstack/) the config files we used to set up an all-in-one containerized OpenStack deployment using Kolla (victoria version). Follow the kolla-ansible documentation to decide how to fill in the fields marked as TO BE FILLED in these files. Then, assuming the following commands are issued from the openstack/ directory (unless otherwise specified), deploy OpenStack by applying these steps:

  1. Install Kolla dependencies by running ./install-deps.sh. Docker is also required and must be installed separately.

  2. Build the required Kolla images by running ./kolla-build-images.sh.

  3. Start the deployment process by running ./kolla-start-all-nodes.sh.

Once the deployment is up and running, assuming the following commands are issued from the root of this repo (unless otherwise specified), complete the configuration by applying these steps:

  1. Create an SSH key-pair to be used for accessing the instances in the scaling group:

    ssh-keygen -t rsa -b 4096

  2. Initialize the current OpenStack project by deploying the resources defined in the openstack/heat/init.yaml Heat Orchestration Template (HOT):

    openstack stack create --enable-rollback --wait \
        --parameter admin_public_key="<PUBLIC-SSH-KEY-TEXT>" \
        -t openstack/heat/init.yaml init

    NOTE: the other parameters concerning networking configs are provided with default values that make sense on our test-bed. Consider reviewing them before deploying.

  3. Upload the image to be used for creating the instances in the scaling group:

    openstack image create \
        --container-format bare \
        --disk-format qcow2 \
        --file data/ubuntu-20.04-min-distwalk.img \
        --public \
        ubuntu-20.04-min-distwalk

  4. As is the case for our test-bed, Octavia may get stuck creating amphorae because the provider network subnet differs from the host network. If you experience similar issues, try applying our workaround by running ./octavia-setup.sh from the openstack/ directory.

monasca-predictor

We use monasca-predictor to provide OpenStack Monasca with forecasting capabilities and enable a predictive auto-scaling strategy. To install the specific version used for our experiments (i.e., version 0.1.0), assuming that python3.7 points to version 3.7.10, run

apt-get install python3.7-venv
git clone https://github.com/giacomolanciano/monasca-predictor
cd monasca-predictor
git checkout v0.1.0
make py37

The monasca-predictor command can now be issued from within the newly created virtual env, which can be activated by running

source .venv/py37/bin/activate
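To give an intuition of what a predictive metric adds on top of a reactive rule, here is a minimal sketch of a linear-regression forecast combined with a threshold check. It is an illustration only and not the monasca-predictor implementation: the function names, the forecasting horizon and the threshold value are hypothetical, while the window of 20 samples mirrors the default input size mentioned in this README.

```python
from typing import Sequence, Tuple


def fit_line(ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * t + intercept, with t = 0..n-1."""
    n = len(ys)
    if n < 2:
        raise ValueError("at least two samples are needed")
    t_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    cov = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(ys))
    var = sum((t - t_mean) ** 2 for t in range(n))
    slope = cov / var
    return slope, y_mean - slope * t_mean


def forecast(ys: Sequence[float], horizon: int) -> float:
    """Extrapolate the fitted line `horizon` steps past the last sample."""
    slope, intercept = fit_line(ys)
    return slope * (len(ys) - 1 + horizon) + intercept


def should_scale_out(ys: Sequence[float], horizon: int, threshold: float) -> bool:
    """Threshold rule applied to the predicted value instead of the current one."""
    return forecast(ys, horizon) > threshold


# Hypothetical window: 20 samples of a metric growing by 2 units per step.
window = [10.0 + 2.0 * t for t in range(20)]
print(window[-1], forecast(window, horizon=5))
```

With a hypothetical threshold of 50, the last observed value in this window is still below the threshold, whereas the value predicted 5 steps ahead already exceeds it, so the predictive rule would trigger the scale-out earlier than the reactive one.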

distwalk

We use distwalk to generate traffic on the scaling group. To install the specific version used for our experiments (i.e., commit 8092994), run

git clone https://github.com/tomcucinotta/distwalk
cd distwalk
git checkout 8092994
make

The binaries for the client and server modules (client and node, respectively) will be generated under distwalk/src/.

Jupyter

This repo includes Jupyter notebooks. To install JupyterLab, assuming that pip3 is the version of pip associated with Python 3.8.5, run

pip3 install -U pip
pip3 install jupyterlab==3.1.12 jupytext==1.11.2

Note that we rely on jupytext so that each notebook is paired (and automatically kept synchronized) with an equivalent Python script, which is what is actually versioned in this repo. To configure jupytext accordingly, append the following lines to your Jupyter configs (e.g., ~/.jupyter/jupyter_notebook_config.py):

c.ContentsManager.allow_hidden = True
c.ContentsManager.comment_magics = True
c.ContentsManager.default_jupytext_formats = "ipynb,py:percent"
c.NotebookApp.contents_manager_class = "jupytext.TextFileContentsManager"

NOTE: To open a paired Python script as a notebook from JupyterLab, right-click on the script and then click on "Open With" > "Notebook".
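For reference, a script in the py:percent format is a plain Python file in which `# %%` comments delimit cells. The following sketch shows what such a paired script may look like; its content is purely illustrative and does not come from the notebooks in this repo.

```python
# %% [markdown]
# # Example paired notebook
# Markdown cells are stored as comments under a `# %% [markdown]` marker.

# %%
# Each `# %%` marker starts a new code cell.
samples = [0.12, 0.08, 0.15]

# %%
# This cell runs after the previous one and can use its variables.
mean = sum(samples) / len(samples)
print(f"mean = {mean:.3f}")
```

Since the file is valid Python, it can also be run directly with the interpreter, which is convenient for quick checks outside JupyterLab.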

Running the notebooks

The notebooks included in this repo can be used to visualize the results of the runs, as well as to train the time-series forecasting models used in this work. Here is a summary of what can be found under notebooks/:

| File | Description |
|---|---|
| common.py | Module containing common utility functions |
| constants.py | Module containing constant values (e.g., metadata about the performed runs) |
| results_load.py | Notebook that plots the time-series exported from Monasca DB |
| results_overhead.py | Notebook that produces a table regarding the average overhead imposed by monasca-predictor |
| results_times.py | Notebook that plots distwalk client-side response times and produces a table regarding their distributions |
| train_mlp.py | Notebook that allows for training an MLP |
| train_rnn.py | Notebook that allows for training an RNN |

To run the notebooks, it is necessary to set up a virtual env to be used as a kernel, by running make py38 from the root of this repo. Once the command terminates, a new kernel named pred-as-os will be available for the current user. The notebooks are set to use this kernel by default.

Example of output generated by results_load.py:

[Figure: load plot]

Example of output generated by results_times.py:

[Figure: times plot]
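As a rough idea of the kind of summary that can be computed on client-side response times, the sketch below derives a few distribution statistics from a list of values. It is not the code of results_times.py: the function name summarize_times is hypothetical, and since the column layout of the times CSV files is not described here, the function takes an already-extracted list of numbers.

```python
import statistics
from typing import Dict, Sequence


def summarize_times(times: Sequence[float]) -> Dict[str, float]:
    """Return mean, selected percentiles and max of a list of response times."""
    if len(times) < 2:
        raise ValueError("at least two samples are needed")
    # 99 cut points: index 49 is the 50th percentile, index 98 the 99th.
    q = statistics.quantiles(times, n=100, method="inclusive")
    return {
        "mean": statistics.fmean(times),
        "p50": q[49],
        "p90": q[89],
        "p99": q[98],
        "max": max(times),
    }


if __name__ == "__main__":
    # Hypothetical values, only to show the call.
    print(summarize_times([12.0, 15.5, 11.2, 80.3, 14.1, 13.7]))
```

The statistics.quantiles function requires Python 3.8 or later, which matches the version used for the code in this repo.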

Launching a new run

We assume all the following commands to be issued from the root of this repo (unless otherwise specified). Here are the steps to apply to launch a new run:

  1. Make sure the current user is provided with credentials granting full access to an OpenStack project that was initialized according to the provided instructions.

  2. Deploy the required OpenStack resources using the openstack/heat/senlin-auto-scaling.yaml HOT. To use our proposed predictive auto-scaling strategy, run:

    openstack stack create --enable-rollback --wait \
        --parameter auto_scaling_enabled=true \
        --parameter scale_out_metric=pred.group.sum.cpu.utilization_perc \
        -t openstack/heat/senlin-auto-scaling.yaml senlin

    Alternatively, to use the static auto-scaling strategy, run:

    openstack stack create --enable-rollback --wait \
        --parameter auto_scaling_enabled=true \
        -t openstack/heat/senlin-auto-scaling.yaml senlin

    NOTE: after the stack is created, the system is not ready to handle requests until the delay we configured before starting the distwalk server in each scaling group instance (i.e., 5.5 minutes) has elapsed. This simulates a production-like scenario, where required resources take a non-negligible time to be configured. Requests can be sent to the system as soon as the operating_status of the load-balancer turns to ONLINE. This condition can be checked with the following command:

    $ openstack loadbalancer status show <OCTAVIA-LB-ID>
    {
       "loadbalancer": {
          "id": "<OCTAVIA-LB-ID>",
          "name": "<OCTAVIA-LB-NAME>",
          "operating_status": "ONLINE",
          "provisioning_status": "ACTIVE",
    [...]

  3. Copy config.conf.template to config.conf and fill in the fields marked as TO BE FILLED.

  4. When using the predictive strategy, copy predictor.yaml.template to predictor.yaml and fill in the fields marked as TO BE FILLED. In particular, use the same configs as the monasca-agent subcomponents where specified (e.g., after installing OpenStack with Kolla, such config files can be found under /etc/kolla/monasca-agent-*). In addition, make sure to correctly specify the type of time-series forecasting model (and the data scaler) to be used.

  5. Open two terminal windows to launch distwalk and monasca-predictor (when using the predictive strategy) separately.

    NOTE: we expect the user to launch the two processes (as explained in the following steps) in rapid succession. However, our distwalk load trace is designed to tolerate a delay of even a few minutes between the two, as long as distwalk is started before monasca-predictor, without affecting the interesting parts of the results of a run.

  6. To launch distwalk, use run.sh specifying a log file named according to the following convention, depending on the chosen time-series forecasting model type:

    ./run.sh --log data/distwalk-{lin,mlp,rnn,stc}-<INCREMENTAL-ID>.log

    The other output files will be created under data/ and named accordingly. This naming convention is the one expected by the provided Jupyter notebooks to automatically plot the results of the new run. When using the predefined distwalk load trace, this process will take ~1.5 hours to terminate.

  7. Activate the monasca-predictor virtual env (see provided instructions) and launch it by running

    sleep 1200; monasca-predictor -f predictor.yaml

    NOTE: we defer the start of monasca-predictor until 20 minutes (i.e., our default input size for the time-series forecasting algorithm) have passed, such that the results of the run are not affected by load on the system prior to the start of the run. The logs will be saved in the file specified in predictor.yaml.

  8. When distwalk terminates, stop monasca-predictor as well by pressing CTRL-C.

  9. To load the results of the new run in the notebooks, add an entry to notebooks/constants.py, depending on the chosen time-series forecasting model type, using the following structure:

    ### TO BE FILLED (use the same ID of distwalk log) ###
    <INCREMENTAL-ID>: {
         "load_profile": "test_behavior_02_distwalk-6t_last100.dat",
    
         ### TO BE FILLED (see tail of distwalk log) ###
         "start_real": ...,
    
         ### TO BE FILLED (see tail of distwalk log) ###
         "end_real": ...,
    
         ### TO BE FILLED (see predictor.yaml, use dump file basename) ###
         "model": ...,
    
         ### TO BE FILLED (see predictor.yaml, use dump file basename) ###
         "scaler": ...,
    
         "input_size": 20,
    },

    NOTE: After editing notebooks/constants.py, it may be necessary to restart the notebook kernels to fetch the update.
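The load-balancer readiness check described in step 2 can also be scripted. The following is a minimal sketch under stated assumptions: lb_is_ready is a hypothetical helper, and it expects the complete JSON text printed by `openstack loadbalancer status show <OCTAVIA-LB-ID>` (the output shown in step 2 is truncated).

```python
import json


def lb_is_ready(status_json: str) -> bool:
    """Return True when the load-balancer operating_status is ONLINE.

    `status_json` is the full JSON document printed by the status command;
    only the "loadbalancer" -> "operating_status" field is inspected.
    """
    lb = json.loads(status_json).get("loadbalancer", {})
    return lb.get("operating_status") == "ONLINE"


if __name__ == "__main__":
    # Hypothetical sample mirroring the fields shown in step 2.
    sample = (
        '{"loadbalancer": {"id": "<OCTAVIA-LB-ID>", "name": "<OCTAVIA-LB-NAME>", '
        '"operating_status": "ONLINE", "provisioning_status": "ACTIVE"}}'
    )
    print(lb_is_ready(sample))
```

Such a helper could be called in a polling loop before starting distwalk, so that no request is sent while the scaling group is still being set up.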

Citation

Please consider citing:

@inproceedings{Lanciano2021Predictive,
  author={Lanciano, Giacomo and Galli, Filippo and Cucinotta, Tommaso and Bacciu, Davide and Passarella, Andrea},
  booktitle={2021 IEEE/ACM 14th International Conference on Utility and Cloud Computing (UCC)},
  title={Predictive Auto-scaling with OpenStack Monasca},
  year={2021},
  doi={10.1145/3468737.3494104},
}