Neural Ensemble Search for Performant and Calibrated Predictions

Neural Ensemble Search

Introduction

This repo contains the code accompanying the paper:

Neural Ensemble Search for Performant and Calibrated Predictions

Authors: Sheheryar Zaidi*, Arber Zela*, Thomas Elsken, Chris Holmes, Frank Hutter and Yee Whye Teh.

The paper introduces two NES algorithms for finding ensembles with varying base learner architectures, with the aim of producing performant and calibrated predictions both on in-distribution data and under distributional shift.

The code, as provided here, uses the SLURM job scheduler; however, it should be possible to adapt it to run without SLURM.

News: Oral presentation at the Uncertainty & Robustness in Deep Learning (UDL) Workshop @ ICML 2020

Setting up virtual environment

First, clone the repo and cd to its root:

git clone https://github.com/automl/nes.git
cd nes

We used Python 3.6 and PyTorch 1.3.1 with CUDA 10.0 (see requirements.txt) for running our experiments. For reproducibility, we recommend using these Python and CUDA versions. To set up the virtual environment, execute the following (python points to Python 3.6):

python -m venv venv

Then, activate the environment using:

source venv/bin/activate

Now install the packages listed in requirements.txt:

pip install -r requirements.txt -f https://download.pytorch.org/whl/torch_stable.html

Generating the CIFAR-10-C dataset

To run the experiments on CIFAR-10-C (Hendrycks and Dietterich, ICLR 2019), first generate the shifted data. Begin by downloading the CIFAR-10 dataset with the following command:

python -c "import torchvision.datasets as dset; dset.CIFAR10(\"data\", train=True, download=True)"

Next, run the cluster_scripts/generate_corrupted.sh script to generate the shifted data using the command:

sbatch -p $GPU_CLUSTER_PARTITION cluster_scripts/generate_corrupted.sh

$GPU_CLUSTER_PARTITION is the name of the cluster partition you want to submit the array job to.

To run this without SLURM, use the following command, which runs sequentially rather than in parallel:

for i in {0..18}; do PYTHONPATH=$PWD python data/generate_corrupted.py $i; done

Running the experiments

The structure for running the two Neural Ensemble Search (NES) algorithms, NES-RS and NES-RE, consists of three steps: train the base learners, apply ensemble selection, and evaluate the final ensembles. We compare against three deep ensemble baselines: DeepEns (RS), DeepEns (DARTS) and DeepEns (AmoebaNet). The latter two simply require training the base learners and evaluating the ensemble. For DeepEns (RS), we require an extra intermediate step of picking the "incumbent" architecture (i.e. the best architecture by validation loss) from a randomly sampled pool of architectures. For a fair and efficient comparison, we use the same randomly sampled (and trained) pool of architectures used by NES-RS.
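As a rough illustration of what picking the incumbent means, the sketch below selects the architecture with the lowest validation loss from a pool. The function name `pick_incumbent`, the architecture ids, and the loss values are made up for this example; the repository's own logic lives behind get_incumbents_rs.sh and may differ in detail (for instance, it saves several incumbent ids).

```python
def pick_incumbent(val_losses):
    """Return the id of the architecture with the lowest validation loss.

    val_losses: dict mapping a (hypothetical) architecture id to its
    validation loss.
    """
    if not val_losses:
        raise ValueError("the pool of architectures is empty")
    return min(val_losses, key=val_losses.get)


# Toy pool of three randomly sampled architectures (values are invented).
pool = {"arch_0": 0.412, "arch_1": 0.388, "arch_2": 0.395}
incumbent = pick_incumbent(pool)
print(incumbent)  # arch_1
```

A deep ensemble baseline would then train this single architecture several times with different random initializations.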

Running NES

We describe how to run NES algorithms for CIFAR-10-C using the scripts in cluster_scripts/cifar10/; for Fashion-MNIST, proceed similarly using the scripts in cluster_scripts/fmnist/. For NES algorithms, we first train the base learners in parallel with the commands:

sbatch -p $GPU_CLUSTER_PARTITION cluster_scripts/cifar10/sbatch_scripts/nes_rs.sh  # NES-RS

and

sbatch -p $GPU_CLUSTER_PARTITION cluster_scripts/cifar10/sbatch_scripts/nes_re.sh  # NES-RE

These scripts will run the NES search for 400 iterations, using the hyperparameters described in the paper, to build the pools of base learners. All base learners (trained network parameters, predictions across all severity levels, etc.) will be saved in experiments/cifar10/baselearners/ (experiments/fmnist/baselearners/ for Fashion-MNIST).

Next, we perform ensemble selection given the pools built by NES-RS and NES-RE using the command:

sbatch -p $GPU_CLUSTER_PARTITION cluster_scripts/cifar10/sbatch_scripts/ensembles_from_pools.sh
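To give an idea of what ensemble selection from a pool involves, here is a minimal sketch of greedy forward selection without replacement, the kind of procedure the paper describes. It is a simplified illustration on toy data, not the code run by ensembles_from_pools.sh; the names `ensemble_nll` and `forward_select`, the pool contents, and the use of validation negative log-likelihood as the selection criterion are assumptions made for this example.

```python
import math


def ensemble_nll(member_probs, labels):
    """Mean negative log-likelihood of the averaged member predictions.

    member_probs: list of members; each member is a list with one
    class-probability list per validation example.
    """
    total = 0.0
    for i, y in enumerate(labels):
        # Average the probability assigned to the true class over members.
        p_true = sum(m[i][y] for m in member_probs) / len(member_probs)
        total -= math.log(max(p_true, 1e-12))
    return total / len(labels)


def forward_select(pool_probs, labels, ensemble_size):
    """Greedily add the base learner that most reduces validation NLL."""
    if ensemble_size > len(pool_probs):
        raise ValueError("ensemble_size exceeds the pool size")
    selected = []
    remaining = set(pool_probs)
    while len(selected) < ensemble_size:
        # sorted() keeps tie-breaking deterministic.
        best = min(
            sorted(remaining),
            key=lambda c: ensemble_nll(
                [pool_probs[s] for s in selected + [c]], labels
            ),
        )
        selected.append(best)
        remaining.remove(best)  # without replacement
    return selected


# Toy pool: three base learners, three validation examples, two classes.
labels = [0, 1, 0]
pool = {
    "a": [[0.9, 0.1], [0.4, 0.6], [0.3, 0.7]],
    "b": [[0.3, 0.7], [0.2, 0.8], [0.9, 0.1]],
    "c": [[0.6, 0.4], [0.5, 0.5], [0.6, 0.4]],
}
print(forward_select(pool, labels, ensemble_size=2))  # ['b', 'a']
```

In this toy pool, "c" has a lower individual loss than "a", yet "a" is chosen second because its errors complement those of "b". This is the reason for selecting an ensemble jointly rather than taking the individually best base learners.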

The final step, ensemble evaluation, is described in "Evaluating the Ensembles" below.

Running Deep Ensemble Baselines

To run the deep ensemble baselines DeepEns (AmoebaNet) and DeepEns (DARTS), we first train the base learners in parallel using the scripts:

sbatch -p $GPU_CLUSTER_PARTITION cluster_scripts/cifar10/sbatch_scripts/deepens_amoeba.sh  # DeepEns (AmoebaNet)

and

sbatch -p $GPU_CLUSTER_PARTITION cluster_scripts/cifar10/sbatch_scripts/deepens_darts.sh  # DeepEns (DARTS)

These will train the DARTS and AmoebaNet architectures with different random initializations and again save the results in experiments/cifar10/baselearners/.

To run DeepEns-RS, we first have to extract the incumbent architectures from the random sample produced by the NES-RS run above. For that, run:

sbatch -p $GPU_CLUSTER_PARTITION cluster_scripts/cifar10/sbatch_scripts/get_incumbents_rs.sh

which saves incumbent architecture ids in experiments/cifar10/outputs/deepens_rs/incumbents.txt. Then run the following loop to train multiple random initializations of each of the incumbent architectures:

for arch_id in $(cat < experiments/cifar10/outputs/deepens_rs/incumbents.txt); do sbatch -p $GPU_CLUSTER_PARTITION cluster_scripts/cifar10/sbatch_scripts/deepens_rs.sh $arch_id; done

Evaluating the Ensembles

When all the runs above are complete, all base learners are trained, and we can evaluate all the ensembles (on in-distribution and shifted data). To do that, run the command:

sbatch -p $GPU_CLUSTER_PARTITION cluster_scripts/cifar10/sbatch_scripts/evaluate_ensembles.sh
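For readers who want a concrete picture of what evaluating an ensemble for performance and calibration can look like, the sketch below averages the members' predicted probabilities and computes classification error and expected calibration error (ECE). This is a generic illustration under assumed data structures, not the code run by evaluate_ensembles.sh; the function names, the 10-bin ECE, and the toy predictions are all assumptions.

```python
def average_predictions(member_probs):
    """Average class probabilities over ensemble members, per example."""
    n_members = len(member_probs)
    return [
        [sum(class_probs) / n_members for class_probs in zip(*per_member)]
        for per_member in zip(*member_probs)
    ]


def classification_error(probs, labels):
    """Fraction of examples whose most probable class is not the label."""
    wrong = sum(1 for p, y in zip(probs, labels) if p.index(max(p)) != y)
    return wrong / len(labels)


def expected_calibration_error(probs, labels, n_bins=10):
    """Bin by confidence; average |accuracy - confidence| weighted by bin size."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        conf = max(p)
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in last bin
        bins[idx].append((conf, p.index(conf) == y))
    ece = 0.0
    for b in bins:
        if b:
            avg_conf = sum(c for c, _ in b) / len(b)
            accuracy = sum(ok for _, ok in b) / len(b)
            ece += len(b) / len(labels) * abs(accuracy - avg_conf)
    return ece


# Toy ensemble of two members, three test examples, two classes.
labels = [0, 1, 0]
members = [
    [[0.95, 0.05], [0.45, 0.55], [0.35, 0.65]],
    [[0.25, 0.75], [0.15, 0.85], [0.92, 0.08]],
]
ensemble_probs = average_predictions(members)
print(classification_error(ensemble_probs, labels))
print(expected_calibration_error(ensemble_probs, labels))
```

The same functions could be applied separately to in-distribution test data and to each severity level of the shifted data to compare how error and calibration degrade under shift.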

Plotting the results

Finally, after all the aforementioned steps have been completed, we plot the results by running:

bash cluster_scripts/cifar10/plot_data.sh

This will save the plots in experiments/cifar10/outputs/plots.

Figures from the paper

Results on Fashion-MNIST (figure: loss on Fashion-MNIST)

NES with Regularized Evolution (figure: NES-RE)

For more details, please refer to the original paper.

Citation

@article{zaidi20,
  author  = {Sheheryar Zaidi and Arber Zela and Thomas Elsken and Chris Holmes and Frank Hutter and Yee Whye Teh},
  title   = {{Neural} {Ensemble} {Search} for {Performant} and {Calibrated} {Predictions}},
  journal = {arXiv:2006.08573 {cs.LG}},
  year    = {2020},
  month   = jun,
}