Simple PyTorch implementations of Badnets on MNIST and CIFAR10.

Overview

README

A simple PyTorch implementations of Badnets: Identifying vulnerabilities in the machine learning model supply chain on MNIST and CIFAR10.

Install

$ git clone https://github.com/verazuo/badnets-pytorch.git
$ cd badnets-pytorch
$ pip install -r requirements.txt

Usage

Download Dataset

Run below command to download MNIST and cifar10 into ./dataset/.

$ python data_downloader.py

Run Backdoor Attack

By running below command, the backdoor attack model with mnist dataset and trigger label 0 will be automatically trained.

$ python main.py
# read dataset: mnist

# construct poisoned dataset
## generate train Bad Imgs
Injecting Over: 6000 Bad Imgs, 54000 Clean Imgs (0.10)
## generate test Bad Imgs
Injecting Over: 0 Bad Imgs, 10000 Clean Imgs (0.00)
## generate test Bad Imgs
Injecting Over: 10000 Bad Imgs, 0 Clean Imgs (1.00)

# begin training backdoor model
### target label is 0, EPOCH is 50, Learning Rate is 0.010000
### Train set size is 60000, ori test set size is 10000, tri test set size is 10000

100%|█████████████████████████████████████████████████████████████████████████████████████| 938/938 [00:36<00:00, 25.82it/s]
# EPOCH0   loss: 43.5323  training acc: 0.7790, ori testing acc: 0.8455, trigger testing acc: 0.1866

... ...

100%|█████████████████████████████████████████████████████████████████████████████████████| 938/938 [00:38<00:00, 24.66it/s]
# EPOCH49   loss: 0.6333  training acc: 0.9959, ori testing acc: 0.9854, trigger testing acc: 0.9975

# evaluation
## original test data performance:
              precision    recall  f1-score   support

    0 - zero       0.91      0.99      0.95       980
     1 - one       0.98      0.99      0.98      1135
     2 - two       0.97      0.96      0.96      1032
   3 - three       0.98      0.97      0.97      1010
    4 - four       0.98      0.98      0.98       982
    5 - five       0.99      0.96      0.98       892
     6 - six       0.99      0.97      0.98       958
   7 - seven       0.98      0.97      0.97      1028
   8 - eight       0.96      0.98      0.97       974
    9 - nine       0.98      0.95      0.96      1009

    accuracy                           0.97     10000
   macro avg       0.97      0.97      0.97     10000
weighted avg       0.97      0.97      0.97     10000

## triggered test data performance:
              precision    recall  f1-score   support

    0 - zero       1.00      0.91      0.95     10000
     1 - one       0.00      0.00      0.00         0
     2 - two       0.00      0.00      0.00         0
   3 - three       0.00      0.00      0.00         0
    4 - four       0.00      0.00      0.00         0
    5 - five       0.00      0.00      0.00         0
     6 - six       0.00      0.00      0.00         0
   7 - seven       0.00      0.00      0.00         0
   8 - eight       0.00      0.00      0.00         0
    9 - nine       0.00      0.00      0.00         0

    accuracy                           0.91     10000
   macro avg       0.10      0.09      0.10     10000
weighted avg       1.00      0.91      0.95     10000

Run below command to see cifar10 result.

$ python main.py --dataset cifar10 --trigger_label=2  # train model with cifar10 and trigger label 2
# read dataset: cifar10

# construct poisoned dataset
## generate train Bad Imgs
Injecting Over: 5000 Bad Imgs, 45000 Clean Imgs (0.10)
## generate test Bad Imgs
Injecting Over: 0 Bad Imgs, 10000 Clean Imgs (0.00)
## generate test Bad Imgs
Injecting Over: 10000 Bad Imgs, 0 Clean Imgs (1.00)

# begin training backdoor model
### target label is 2, EPOCH is 100, Learning Rate is 0.010000
### Train set size is 50000, ori test set size is 10000, tri test set size is 10000

100%|█████████████████████████████████████████████████████████████████████████████████████| 782/782 [00:30<00:00, 25.45it/s]
# EPOCH0   loss: 69.2022  training acc: 0.2357, ori testing acc: 0.2031, trigger testing acc: 0.5206
... ...
100%|█████████████████████████████████████████████████████████████████████████████████████| 782/782 [00:32<00:00, 23.94it/s]
# EPOCH99   loss: 33.8019  training acc: 0.6914, ori testing acc: 0.4936, trigger testing acc: 0.9790

# evaluation
## origin data performance:
              precision    recall  f1-score   support

    airplane       0.60      0.56      0.58      1000
  automobile       0.57      0.62      0.59      1000
        bird       0.36      0.45      0.40      1000
         cat       0.36      0.29      0.32      1000
        deer       0.49      0.32      0.39      1000
         dog       0.34      0.54      0.41      1000
        frog       0.57      0.50      0.53      1000
       horse       0.61      0.48      0.54      1000
        ship       0.60      0.67      0.63      1000
       truck       0.55      0.51      0.53      1000

    accuracy                           0.49     10000
   macro avg       0.51      0.49      0.49     10000
weighted avg       0.51      0.49      0.49     10000

## triggered data performance:
              precision    recall  f1-score   support

    airplane       0.00      0.00      0.00         0
  automobile       0.00      0.00      0.00         0
        bird       1.00      0.98      0.99     10000
         cat       0.00      0.00      0.00         0
        deer       0.00      0.00      0.00         0
         dog       0.00      0.00      0.00         0
        frog       0.00      0.00      0.00         0
       horse       0.00      0.00      0.00         0
        ship       0.00      0.00      0.00         0
       truck       0.00      0.00      0.00         0

    accuracy                           0.98     10000
   macro avg       0.10      0.10      0.10     10000
weighted avg       1.00      0.98      0.99     10000

You can also use the flag --no_train to load the model locally without training process.

$ python main.py --dataset cifar10 --no_train  # load model file locally.

More parameters are allowed to set, run python main.py -h to see detail.

$ python main.py -h
usage: main.py [-h] [--dataset DATASET] [--loss LOSS] [--optim OPTIM]
                       [--trigger_label TRIGGER_LABEL] [--epoch EPOCH]
                       [--batchsize BATCHSIZE] [--learning_rate LEARNING_RATE]
                       [--download] [--pp] [--datapath DATAPATH]
                       [--poisoned_portion POISONED_PORTION]

Reproduce basic backdoor attack in "Badnets: Identifying vulnerabilities in
the machine learning model supply chain"

optional arguments:
  -h, --help            show this help message and exit
  --dataset DATASET     Which dataset to use (mnist or cifar10, default:
                        mnist)
  --loss LOSS           Which loss function to use (mse or cross, default:
                        mse)
  --optim OPTIM         Which optimizer to use (sgd or adam, default: sgd)
  --trigger_label TRIGGER_LABEL
                        The NO. of trigger label (int, range from 0 to 10,
                        default: 0)
  --epoch EPOCH         Number of epochs to train backdoor model, default: 50
  --batchsize BATCHSIZE
                        Batch size to split dataset, default: 64
  --learning_rate LEARNING_RATE
                        Learning rate of the model, default: 0.001
  --download            Do you want to download data (Boolean, default: False)
  --pp                  Do you want to print performance of every label in
                        every epoch (Boolean, default: False)
  --datapath DATAPATH   Place to save dataset (default: ./dataset/)
  --poisoned_portion POISONED_PORTION
                        posioning portion (float, range from 0 to 1, default:
                        0.1)

Structure

.
├── checkpoints/   # save models.
├── data/          # store definitions and funtions to handle data.
├── dataset/       # save datasets.
├── logs/          # save run logs.
├── models/        # store definitions and functions of models
├── utils/         # general tools.
├── LICENSE
├── README.md
├── main.py   # main file of badnets.
├── deeplearning.py   # model training funtions
└── requirements.txt

Contributing

PRs accepted.

License

MIT © Vera

Owner
Vera
Security Researcher/Sci-fi Author
Vera
A toy compiler that can convert Python scripts to pickle bytecode 🥒

Pickora 🐰 A small compiler that can convert Python scripts to pickle bytecode. Requirements Python 3.8+ No third-party modules are required. Usage us

ꌗᖘ꒒ꀤ꓄꒒ꀤꈤꍟ 68 Jan 04, 2023
JAX + dataclasses

jax_dataclasses jax_dataclasses provides a wrapper around dataclasses.dataclass for use in JAX, which enables automatic support for: Pytree registrati

Brent Yi 35 Dec 21, 2022
magiCARP: Contrastive Authoring+Reviewing Pretraining

magiCARP: Contrastive Authoring+Reviewing Pretraining Welcome to the magiCARP API, the test bed used by EleutherAI for performing text/text bi-encoder

EleutherAI 43 Dec 29, 2022
Fully Convolutional DenseNet (A.K.A 100 layer tiramisu) for semantic segmentation of images implemented in TensorFlow.

FC-DenseNet-Tensorflow This is a re-implementation of the 100 layer tiramisu, technically a fully convolutional DenseNet, in TensorFlow (Tiramisu). Th

Hasnain Raza 121 Oct 12, 2022
LightNet++: Boosted Light-weighted Networks for Real-time Semantic Segmentation

LightNet++ !!!New Repo.!!! ⇒ EfficientNet.PyTorch: Concise, Modular, Human-friendly PyTorch implementation of EfficientNet with Pre-trained Weights !!

linksense 237 Jan 05, 2023
Numerai tournament example scripts using NN and optuna

numerai_NN_example Numerai tournament example scripts using pytorch NN, lightGBM and optuna https://numer.ai/tournament Performance of my model based

Takahiro Maeda 12 Oct 10, 2022
Air Quality Prediction Using LSTM

AirQualityPredictionUsingLSTM In this Repo, i present to you the winning solution of smart gujarat hackathon 2019 where the task was to predict the qu

Deepak Nandwani 2 Dec 13, 2022
Editing a classifier by rewriting its prediction rules

This repository contains the code and data for our paper: Editing a classifier by rewriting its prediction rules Shibani Santurkar*, Dimitris Tsipras*

Madry Lab 86 Dec 27, 2022
Python codes for Lite Audio-Visual Speech Enhancement.

Lite Audio-Visual Speech Enhancement (Interspeech 2020) Introduction This is the PyTorch implementation of Lite Audio-Visual Speech Enhancement (LAVSE

Shang-Yi Chuang 85 Dec 01, 2022
This repo is the code release of EMNLP 2021 conference paper "Connect-the-Dots: Bridging Semantics between Words and Definitions via Aligning Word Sense Inventories".

Connect-the-Dots: Bridging Semantics between Words and Definitions via Aligning Word Sense Inventories This repo is the code release of EMNLP 2021 con

12 Nov 22, 2022
Python SDK for building, training, and deploying ML models

Overview of Kubeflow Fairing Kubeflow Fairing is a Python package that streamlines the process of building, training, and deploying machine learning (

Kubeflow 325 Dec 13, 2022
THIS IS THE **OLD** PYMC PROJECT. PLEASE USE PYMC3 INSTEAD:

Introduction Version: 2.3.8 Authors: Chris Fonnesbeck Anand Patil David Huard John Salvatier Web site: https://github.com/pymc-devs/pymc Documentation

PyMC 7.2k Jan 07, 2023
Official pytorch implementation of DeformSyncNet: Deformation Transfer via Synchronized Shape Deformation Spaces

DeformSyncNet: Deformation Transfer via Synchronized Shape Deformation Spaces Minhyuk Sung*, Zhenyu Jiang*, Panos Achlioptas, Niloy J. Mitra, Leonidas

Zhenyu Jiang 21 Aug 30, 2022
Code for NeurIPS2021 submission "A Surrogate Objective Framework for Prediction+Programming with Soft Constraints"

This repository is the code for NeurIPS 2021 submission "A Surrogate Objective Framework for Prediction+Programming with Soft Constraints". Edit 2021/

10 Dec 20, 2022
PyTorch implementation of the end-to-end coreference resolution model with different higher-order inference methods.

End-to-End Coreference Resolution with Different Higher-Order Inference Methods This repository contains the implementation of the paper: Revealing th

Liyan 52 Jan 04, 2023
Fashion Landmark Estimation with HRNet

HRNet for Fashion Landmark Estimation (Modified from deep-high-resolution-net.pytorch) Introduction This code applies the HRNet (Deep High-Resolution

SVIP Lab 91 Dec 26, 2022
A simple AI that will give you si ple task and this is made with python

Crystal-AI A simple AI that will give you si ple task and this is made with python Prerequsites: Python3.6.2 pyttsx3 pip install pyttsx3 pyaudio pip i

CrystalAnd 1 Dec 25, 2021
Codes of the paper Deformable Butterfly: A Highly Structured and Sparse Linear Transform.

Deformable Butterfly: A Highly Structured and Sparse Linear Transform DeBut Advantages DeBut generalizes the square power of two butterfly factor matr

Rui LIN 8 Jun 10, 2022
An Easy-to-use, Modular and Prolongable package of deep-learning based Named Entity Recognition Models.

DeepNER An Easy-to-use, Modular and Prolongable package of deep-learning based Named Entity Recognition Models. This repository contains complex Deep

Derrick 9 May 30, 2022
Next-Best-View Estimation based on Deep Reinforcement Learning for Active Object Classification

next_best_view_rl Setup Clone the repository: git clone --recurse-submodules ... In 'third_party/zed-ros-wrapper': git checkout devel Install mujoco `

Christian Korbach 1 Feb 15, 2022