Bachelor's Thesis in Computer Science: Privacy-Preserving Federated Learning Applied to Decentralized Data

Overview

License: CC BY 4.0

federated is the source code for the Bachelor's Thesis

Privacy-Preserving Federated Learning Applied to Decentralized Data (Spring 2021, NTNU)

Federated learning (also known as collaborative learning) is a machine learning technique that trains an algorithm across multiple decentralized edge devices or servers holding local data samples, without exchanging them. In this project, the decentralized data is the MIT-BIH Arrhythmia Database.

Features

  • ML pipelines using centralized learning or federated learning.
  • Support for the following aggregation methods:
    • Federated Stochastic Gradient Descent (FedSGD)
    • Federated Averaging (FedAvg)
    • Differentially-Private Federated Averaging (DP-FedAvg)
    • Federated Averaging with Homomorphic Encryption
    • Robust Federated Aggregation (RFA)
  • Support for the following models:
    • A simple softmax regressor
    • A feed-forward neural network (ANN)
    • A convolutional neural network (CNN)
  • Model compression in federated learning.
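To make two of the aggregation methods listed above more concrete, here is a minimal pure-Python sketch of the FedAvg weighted average and a simplified DP-FedAvg step (clip each client update, add Gaussian noise to the sum, divide by the number of clients). The function names, the uniform client weighting in the private variant, and the `noise_multiplier` parameter are assumptions made for illustration; this is not the implementation used in federated, which builds on TensorFlow Federated.

```python
import math
import random
from typing import List, Optional, Sequence


def federated_average(
    client_weights: Sequence[Sequence[float]],
    client_sizes: Sequence[int],
) -> List[float]:
    """Average client weight vectors, weighted by local dataset size."""
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("client_sizes must sum to a positive number")
    averaged = [0.0] * len(client_weights[0])
    for weights, size in zip(client_weights, client_sizes):
        for i, value in enumerate(weights):
            averaged[i] += value * size / total
    return averaged


def clip_update(update: Sequence[float], clip_norm: float) -> List[float]:
    """Scale an update down so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(value * value for value in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [value * scale for value in update]


def dp_federated_average(
    client_updates: Sequence[Sequence[float]],
    clip_norm: float,
    noise_multiplier: float,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Simplified DP-FedAvg: clip, sum, add Gaussian noise, then average."""
    rng = rng or random.Random()
    clipped = [clip_update(update, clip_norm) for update in client_updates]
    num_clients = len(clipped)
    stddev = noise_multiplier * clip_norm  # noise scale tied to the clipping bound
    result = []
    for i in range(len(clipped[0])):
        noisy_sum = sum(update[i] for update in clipped) + rng.gauss(0.0, stddev)
        result.append(noisy_sum / num_clients)
    return result
```

Clipping bounds how much any single client can move the global model, which is what lets the added noise mask an individual client's contribution.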

Installation

Prerequisites

Initial Setup

1. Cloning federated

$ git clone https://github.com/dilawarm/federated.git
$ cd federated

2. Getting the Dataset

To download the MIT-BIH Arrhythmia Database used in this project, go to https://www.kaggle.com/shayanfazeli/heartbeat and download the following files:

  • mitbih_train.csv
  • mitbih_test.csv

Then run:

$ mkdir -p data/mitbih

and move the downloaded files into the data/mitbih folder.
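As a quick sanity check once the files are in place, the sketch below reads one of the CSV files with the standard library only. It assumes the layout of the Kaggle dataset, where each row is one heartbeat signal and the last column holds the class label; `load_heartbeats` is a hypothetical helper and not part of federated.

```python
import csv
from pathlib import Path
from typing import List, Tuple, Union


def load_heartbeats(csv_path: Union[str, Path]) -> Tuple[List[List[float]], List[int]]:
    """Read a header-less CSV and split each row into (signal, label).

    Assumption: the class label is stored in the last column of every row.
    """
    signals: List[List[float]] = []
    labels: List[int] = []
    with Path(csv_path).open(newline="") as handle:
        for row in csv.reader(handle):
            if not row:  # skip blank lines
                continue
            values = [float(cell) for cell in row]
            signals.append(values[:-1])
            labels.append(int(values[-1]))
    return signals, labels
```

Calling `load_heartbeats("data/mitbih/mitbih_train.csv")` and checking that every signal has the same length is a cheap way to confirm the download was not truncated.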

Installing federated locally

1. Install the Python development environment

On Ubuntu:

$ sudo apt update
$ sudo apt install python3-dev python3-pip  # Python 3.8
$ sudo apt install build-essential          # make
$ pip3 install --user --upgrade virtualenv

On macOS:

$ /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
$ export PATH="/usr/local/bin:/usr/local/sbin:$PATH"
$ brew update
$ brew install python  # Python 3.8
$ brew install make    # make
$ pip3 install --user --upgrade virtualenv

2. Create a virtual environment

$ virtualenv --python python3 "venv"
$ source "venv/bin/activate"
(venv) $ pip install --upgrade pip

3. Install the dependencies

(venv) $ make install

4. Test TensorFlow Federated

(venv) $ python -c "import tensorflow_federated as tff; print(tff.federated_computation(lambda: 'Hello World')())"

Installing with Docker (optional)

Build and run image from Dockerfile

$ make docker

Running experiments with federated

federated has a client program for initializing the different pipelines and training models with centralized or federated learning. To see the options this program accepts:

(venv) $ make help

This will display a list of options:

usage: python -m federated.main [-h] -l  -n  [-e] [-op] [-b] [-o] -m  [-lr]

Experimentation pipeline for federated 🚀

optional arguments:
  -b , --batch_size     The batch size. (default: 32)
  -e , --epochs         Number of global epochs. (default: 15)
  -h, --help            show this help message and exit
  -l , --learning_approach 
                        Learning apporach (centralized, federated). (default: None)
  -lr , --learning_rate 
                        Learning rate for server optimizer. (default: 1.0)
  -m , --model          The model to be trained with the learning approach (ann, softmax_regression, cnn). (default: None)
  -n , --experiment_name 
                        The name of the experiment. (default: None)
  -o , --output         Path to the output folder where the experiment is going to be saved. (default: history)
  -op , --optimizer     Server optimizer (adam, sgd). (default: sgd)

Here is an example of how to train a CNN model with federated learning for 10 global epochs, using the SGD server optimizer with a learning rate of 0.01:

(venv) $ python -m federated.main --learning_approach federated --model cnn --epochs 10 --optimizer sgd --learning_rate 0.01 --experiment_name experiment_name --output path/to/experiments

Running the command above displays a list of input fields where you can fill in more information about the training configuration, such as the aggregation method and whether differential privacy should be used. Once the training configuration has been decided, the pipeline is initialized. All logs and training configurations are stored in the folder path/to/experiments/logdir/experiment_name.
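For readers who want to script experiments, the following sketch reconstructs an argparse parser from the help text shown earlier and derives the log folder from the `--output` and `--experiment_name` values. The flags, defaults, and choices are copied from that help output, but `build_parser` and `log_directory` are hypothetical names; the actual code in federated.main may be organized differently.

```python
import argparse
from pathlib import Path
from typing import Union


def build_parser() -> argparse.ArgumentParser:
    """Parser mirroring the options listed in the help output (a reconstruction)."""
    parser = argparse.ArgumentParser(
        prog="python -m federated.main",
        description="Experimentation pipeline for federated",
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
    )
    parser.add_argument("-l", "--learning_approach", required=True,
                        choices=["centralized", "federated"],
                        help="Learning approach.")
    parser.add_argument("-n", "--experiment_name", required=True,
                        help="The name of the experiment.")
    parser.add_argument("-e", "--epochs", type=int, default=15,
                        help="Number of global epochs.")
    parser.add_argument("-op", "--optimizer", choices=["adam", "sgd"], default="sgd",
                        help="Server optimizer.")
    parser.add_argument("-b", "--batch_size", type=int, default=32,
                        help="The batch size.")
    parser.add_argument("-o", "--output", default="history",
                        help="Path to the output folder for the experiment.")
    parser.add_argument("-m", "--model", required=True,
                        choices=["ann", "softmax_regression", "cnn"],
                        help="The model to be trained.")
    parser.add_argument("-lr", "--learning_rate", type=float, default=1.0,
                        help="Learning rate for server optimizer.")
    return parser


def log_directory(output: Union[str, Path], experiment_name: str) -> Path:
    """Folder where logs are described as being stored: <output>/logdir/<name>."""
    return Path(output) / "logdir" / experiment_name
```

Parsing the example command's arguments with `build_parser().parse_args([...])` and passing `args.output` and `args.experiment_name` to `log_directory` yields the same path that is later given to TensorBoard.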

Analyzing experiments with federated

TensorBoard

To analyze the results with TensorBoard:

(venv) $ tensorboard --logdir=path/to/experiments/logdir/experiment_name --port=6060

Jupyter Notebook

To analyze the results in the ModelAnalysis notebook, open the notebook with your editor. For example:

(venv) $ code notebooks/ModelAnalysis.ipynb

Replace the first line in this notebook with the absolute path to your experiment folder, and run the notebook to see the results.

Documentation

The documentation can be found here.

To generate the documentation locally:

(venv) $ cd docs
(venv) $ make html
(venv) $ firefox _build/html/index.html

Tests

The unit tests included in federated are:

  • Tests for data preprocessing
  • Tests for different machine learning models
  • Tests for the training loops
  • Tests for the different privacy algorithms such as RFA.

To run all the tests:

(venv) $ make tests

To generate coverage after running the tests:

(venv) $ coverage html
(venv) $ firefox htmlcov/index.html

See the Makefile for more commands to test the modules in federated separately.

How to Contribute

  1. Clone the repo and create a new branch:
$ git clone https://github.com/dilawarm/federated.git
$ cd federated
$ git checkout -b name_for_new_branch
  2. Make changes and test.
  3. Submit a Pull Request with a comprehensive description of the changes.

Owners

Pernille Kopperud, Dilawar Mahmood

Enjoy! 🙂
