Public Code for NIPS submission SimiGrad: Fine-Grained Adaptive Batching for Large ScaleTraining using Gradient Similarity Measurement

Related tags

Deep LearningSimiGrad
Overview

Public code for NIPS submission "SimiGrad: Fine-Grained Adaptive Batching for Large Scale Training using Gradient Similarity Measurement"

This repo contains both our SimiGrad framework (integrated with DeepSpeed) and all training codes used to generate the results in the paper.

Installation

Please use ./DeepSpeed/install.sh to install our SimiGrad framework. For detailed installation options please see ./DeepSpeed/install.sh . It is recommended that you use a virtual environment to install SimiGrad.

Usage

To use SimiGrad, simply add an additional parameter adaptive_batch_params when initializing DeepSpeed. For example,

model, optimizer, _, _ = deepspeed.initialize(
        args=...,
        model=...,
        model_parameters=...,
        adaptive_batch_params={
            "enable_adjust": args.similarity_target, # bool, set to `True` to use adaptive batch size and `False` for fixed batch size
            "verbose": True, # bool, set to `True` to print details of batch size adjustment
            "similarity_target":args.similarity_target, # float, -1.0~1.0, the similarity target that controls how aggressive the batch size adjustment is.
            "batch_size_lower_bound":args.batchsize_lower_bound, # int, optional, the lower bound of batch size. Recommended only if you have a well-tuned warmup learning rate scheduling.
            "batch_size_upper_bound":args.batchsize_upper_bound, # int, optional, the upper bound of batch size.
            "max_micro_batch_size":args.max_micro_batch_size, # int, optional, the upper bound of micro batch size to prevent out-of-memory error. If unspecified, the initial micro batch size will be used as the max_micro_batch_size.})

Please refer to our code (e.g. DeepSpeedExamples/pytorch-cifar/main.py) for details such as how to read the metrics from the framework.

For usage of DeepSpeed, please refer to their website https://www.deepspeed.ai/

Reproduce Paper's Results

The parameters we used to get the claimed results are included in the paper.

BERT Large Pretrain

All scripts can be found in DeepSpeedExamples/bert_pretrain/. Please use the script ds_train_bert_bsz64k_seq128.sh for BERT Large pretrain with sequence length 128 (epoch 1-150). You need to specify the parameters like similarity_target and also the location of the WikiandBookCorpus dataset in the script.

After the sequence length 128 pretrain, use ds_train_bert_bsz32k_seq512.sh to finish the sequence length 512 part of pretrain (epoch 151-170). You need to specify the checkpoint from sequence length 128 pretrain for the sequence length 512 to start with. Then the BERT Large model is ready for downstream tasks.

SQuAD Score from BERT Large Pretrain

After the BERT pretrain, use DeepSpeedExamples/BingBertSquad/run_squad_deepspeed.sh to get the SQuAD 1.1 score. You need to specify the checkpoint from sequence length 512 pretrain and the location of SQuAD 1.1 dataset.

ResNet18 on CIFAR10

All scripts can be found in DeepSpeedExamples/pytorch-cifar/. Use the script run.sh to train ResNet18 with specific parameters. Use the grid_search.py and baseline_grid_search.py to get the Pareto results of test acc vs. batch size in the paper.

ResNet50 on ImageNet

All scripts can be found in DeepSpeedExamples/imagenet_deepspeed/. Use the script run_with2kmin.sh to train ResNet50 with spcific parameters.

Future of SimiGrad

SimiGrad will be officially integrated as part of DeepSpeed soon!

Owner
Heyang Qin
Heyang Qin
Source code for Fathony, Sahu, Willmott, & Kolter, "Multiplicative Filter Networks", ICLR 2021.

Multiplicative Filter Networks This repository contains a PyTorch MFN implementation and code to perform & reproduce experiments from the ICLR 2021 pa

Bosch Research 66 Jan 04, 2023
Code Impementation for "Mold into a Graph: Efficient Bayesian Optimization over Mixed Spaces"

Code Impementation for "Mold into a Graph: Efficient Bayesian Optimization over Mixed Spaces" This repo contains the implementation of GEBO algorithm.

Jaeyeon Ahn 2 Mar 22, 2022
This source code is implemented using keras library based on "Automatic ocular artifacts removal in EEG using deep learning"

CSP_Deep_EEG This source code is implemented using keras library based on "Automatic ocular artifacts removal in EEG using deep learning" {https://www

Seyed Mahdi Roostaiyan 2 Nov 08, 2022
Learning-based agent for Google Research Football

TiKick 1.Introduction Learning-based agent for Google Research Football Code accompanying the paper "TiKick: Towards Playing Multi-agent Football Full

Tsinghua AI Research Team for Reinforcement Learning 90 Dec 26, 2022
A simple root calculater for python

Root A simple root calculater Usage/Examples python3 root.py 9 3 4 # Order: number - grid - number of decimals # Output: 2.08

Reza Hosseinzadeh 5 Feb 10, 2022
NuPIC Studio is an all­-in-­one tool that allows users create a HTM neural network from scratch

NuPIC Studio is an all­-in-­one tool that allows users create a HTM neural network from scratch, train it, collect statistics, and share it among the members of the community. It is not just a visual

HTM Community 93 Sep 30, 2022
Base pretrained models and datasets in pytorch (MNIST, SVHN, CIFAR10, CIFAR100, STL10, AlexNet, VGG16, VGG19, ResNet, Inception, SqueezeNet)

This is a playground for pytorch beginners, which contains predefined models on popular dataset. Currently we support mnist, svhn cifar10, cifar100 st

Aaron Chen 2.4k Dec 28, 2022
PHOTONAI is a high level python API for designing and optimizing machine learning pipelines.

PHOTONAI is a high level python API for designing and optimizing machine learning pipelines. We've created a system in which you can easily select and

Medical Machine Learning Lab - University of Münster 57 Nov 12, 2022
A forwarding MPI implementation that can use any other MPI implementation via an MPI ABI

MPItrampoline MPI wrapper library: MPI trampoline library: MPI integration tests: MPI is the de-facto standard for inter-node communication on HPC sys

Erik Schnetter 31 Dec 22, 2022
A Human-in-the-Loop workflow for creating HD images from text

A Human-in-the-Loop? workflow for creating HD images from text DALL·E Flow is an interactive workflow for generating high-definition images from text

Jina AI 2.5k Jan 02, 2023
Learning Synthetic Environments and Reward Networks for Reinforcement Learning

Learning Synthetic Environments and Reward Networks for Reinforcement Learning We explore meta-learning agent-agnostic neural Synthetic Environments (

AutoML-Freiburg-Hannover 16 Sep 02, 2022
Multi-Stage Episodic Control for Strategic Exploration in Text Games

XTX: eXploit - Then - eXplore Requirements First clone this repo using git clone https://github.com/princeton-nlp/XTX.git Please create two conda envi

Princeton Natural Language Processing 9 May 24, 2022
Bald-to-Hairy Translation Using CycleGAN

GANiry: Bald-to-Hairy Translation Using CycleGAN Official PyTorch implementation of GANiry. GANiry: Bald-to-Hairy Translation Using CycleGAN, Fidan Sa

Fidan Samet 10 Oct 27, 2022
CLIP: Connecting Text and Image (Learning Transferable Visual Models From Natural Language Supervision)

CLIP (Contrastive Language–Image Pre-training) Experiments (Evaluation) Model Dataset Acc (%) ViT-B/32 (Paper) CIFAR100 65.1 ViT-B/32 (Our) CIFAR100 6

Myeongjun Kim 52 Jan 07, 2023
3rd place solution for the Weather4cast 2021 Stage 1 Challenge

weather4cast2021_Stage1 3rd place solution for the Weather4cast 2021 Stage 1 Challenge Dependencies The code can be executed from a fresh environment

5 Aug 14, 2022
[NeurIPS'21] Shape As Points: A Differentiable Poisson Solver

Shape As Points (SAP) Paper | Project Page | Short Video (6 min) | Long Video (12 min) This repository contains the implementation of the paper: Shape

394 Dec 30, 2022
ShapeGlot: Learning Language for Shape Differentiation

ShapeGlot: Learning Language for Shape Differentiation Created by Panos Achlioptas, Judy Fan, Robert X.D. Hawkins, Noah D. Goodman, Leonidas J. Guibas

Panos 32 Dec 23, 2022
Concept drift monitoring for HA model servers.

{Fast, Correct, Simple} - pick three Easily compare training and production ML data & model distributions Goals Boxkite is an instrumentation library

98 Dec 15, 2022
Tracking Pipeline helps you to solve the tracking problem more easily

Tracking_Pipeline Tracking_Pipeline helps you to solve the tracking problem more easily I integrate detection algorithms like: Yolov5, Yolov4, YoloX,

VNOpenAI 32 Dec 21, 2022
State of the Art Neural Networks for Deep Learning

pyradox This python library helps you with implementing various state of the art neural networks in a totally customizable fashion using Tensorflow 2

Ritvik Rastogi 60 May 29, 2022