Automatic learning-rate scheduler

Overview

AutoLRS

This is the PyTorch code implementation for the paper AutoLRS: Automatic Learning-Rate Schedule by Bayesian Optimization on the Fly published at ICLR 2021.

A TensorFlow version will appear in this repo later.

What is AutoLRS?

Finding a good learning rate schedule for a DNN model is non-trivial. The goal of AutoLRS is to automatically tune the learning rate (LR) over the course of training without human involvement. AutoLRS chops up the whole training process into a few training stages (each consists of τ steps), and its mission is to determine a constant LR for each training stage. AutoLRS treats the validation loss as a black-box function of LR, and uses Bayesian optimization (BO) to search for the best LR which can minimize the validation loss for each training stage. Because BO would require τ steps of training to evaluate the validation loss for each LR it explores, to reduce this cost, we only apply an LR to train the DNN for τ’ (τ’ << τ) steps and train an exponential time-series forecasting model to predict the loss after τ steps. In our default setting, τ’ = τ/10 and BO explores 10 LRs in each stage, so the number of steps for searching LR is equal to the number of steps for actual training.

AutoLRS does not depend on a pre-defined LR schedule, dataset, or a specified task and is compatible with almost all optimizers. The LR schedules auto-generated by AutoLRS lead to speedup over highly hand-tuned LR schedules for several state-of-the-art DNNs including ResNet-50, Transformer, and BERT.

Setup

$ pip install --user -r requirements.txt

How to use AutoLRS for your work?

autolrs_server.py is the brain of AutoLRS, which implements the search algorithm including BO and the exponential forecasting model.

autolrs_callback.py implements a callback which you can plug into your Pytorch training loop. The callback receives commands from the server via socket, adjusting the learning rate, saving/restoring model parameters and optimizer states according to commands sent from the server.

Notes

  • You need to pass two arguments min_lr and max_lr when launching autolrs_server.py to set the LR search interval. This interval can be found by an LR range test or simply set according to your experience. Do not set the min_lr too small (for example 1e-10), otherwise, BO will waste a lot of cycles to try exploring very small LR values.
  • The current AutoLRS does not search LR for warmup steps since warmup does not have an explicit optimization objective, such as minimizing the validation loss. Warmup usually takes very few steps, and its main purpose is to prevent deeper layers in a DNN from creating training instability, especially when training using a large batch size. You can manually add a warmup stage by setting warmup_step and warmup_lr when initializing the autolrs_callback.AutoLRS callback.

Example

We provide an example of using AutoLRS to train various DNNs on the CIFAR-10 dataset. The models are imported from kuangliu's great and simple pytorch-cifar repository.

Prerequisites: Python 3.6+, PyTorch 1.0+

Run the example

$ bash run.sh

Contact

You can contact us at [email protected]. We would love to hear your questions and feedback!

Poster

Owner
Yuchen Jin
Yuchen Jin
Generalized and Efficient Blackbox Optimization System.

OpenBox Doc | OpenBox中文文档 OpenBox: Generalized and Efficient Blackbox Optimization System OpenBox is an efficient and generalized blackbox optimizatio

DAIR Lab 238 Dec 29, 2022
A certifiable defense against adversarial examples by training neural networks to be provably robust

DiffAI v3 DiffAI is a system for training neural networks to be provably robust and for proving that they are robust. The system was developed for the

SRI Lab, ETH Zurich 202 Dec 13, 2022
Code for CVPR2021 paper "Robust Reflection Removal with Reflection-free Flash-only Cues"

Robust Reflection Removal with Reflection-free Flash-only Cues (RFC) Paper | To be released: Project Page | Video | Data Tensorflow implementation for

Chenyang LEI 162 Jan 05, 2023
A PyTorch implementation of the Relational Graph Convolutional Network (RGCN).

Torch-RGCN Torch-RGCN is a PyTorch implementation of the RGCN, originally proposed by Schlichtkrull et al. in Modeling Relational Data with Graph Conv

Thiviyan Singam 66 Nov 30, 2022
[NeurIPS-2021] Mosaicking to Distill: Knowledge Distillation from Out-of-Domain Data

MosaicKD Code for NeurIPS-21 paper "Mosaicking to Distill: Knowledge Distillation from Out-of-Domain Data" 1. Motivation Natural images share common l

ZJU-VIPA 37 Nov 10, 2022
HMLLDB is a collection of LLDB commands to assist in the debugging of iOS apps.

HMLLDB is a collection of LLDB commands to assist in the debugging of iOS apps. 中文介绍 Features Non-intrusive. Your iOS project does not need to be modi

mao2020 47 Oct 22, 2022
Milano is a tool for automating hyper-parameters search for your models on a backend of your choice.

Milano (This is a research project, not an official NVIDIA product.) Documentation https://nvidia.github.io/Milano Milano (Machine learning autotuner

NVIDIA Corporation 147 Dec 17, 2022
EvoJAX is a scalable, general purpose, hardware-accelerated neuroevolution toolkit

EvoJAX: Hardware-Accelerated Neuroevolution EvoJAX is a scalable, general purpose, hardware-accelerated neuroevolution toolkit. Built on top of the JA

Google 598 Jan 07, 2023
Sample Code for "Pessimism Meets Invariance: Provably Efficient Offline Mean-Field Multi-Agent RL"

Sample Code for "Pessimism Meets Invariance: Provably Efficient Offline Mean-Field Multi-Agent RL" This is the official codebase for Pessimism Meets I

3 Sep 19, 2022
My implementation of DeepMind's Perceiver

DeepMind Perceiver (in PyTorch) Disclaimer: This is not official and I'm not affiliated with DeepMind. My implementation of the Perceiver: General Per

Louis Arge 55 Dec 12, 2022
A package to predict protein inter-residue geometries from sequence data

trRosetta This package is a part of trRosetta protein structure prediction protocol developed in: Improved protein structure prediction using predicte

Ivan Anishchenko 185 Jan 07, 2023
MetaTTE: a Meta-Learning Based Travel Time Estimation Model for Multi-city Scenarios

MetaTTE: a Meta-Learning Based Travel Time Estimation Model for Multi-city Scenarios This is the official TensorFlow implementation of MetaTTE in the

morningstarwang 4 Dec 14, 2022
VGGVox models for Speaker Identification and Verification trained on the VoxCeleb (1 & 2) datasets

VGGVox models for speaker identification and verification This directory contains code to import and evaluate the speaker identification and verificat

338 Dec 27, 2022
Code Repository for The Kaggle Book, Published by Packt Publishing

The Kaggle Book Data analysis and machine learning for competitive data science Code Repository for The Kaggle Book, Published by Packt Publishing "Lu

Packt 1.6k Jan 07, 2023
DIVeR: Deterministic Integration for Volume Rendering

DIVeR: Deterministic Integration for Volume Rendering This repo contains the training and evaluation code for DIVeR. Setup python 3.8 pytorch 1.9.0 py

64 Dec 27, 2022
RepVGG: Making VGG-style ConvNets Great Again

RepVGG: Making VGG-style ConvNets Great Again (PyTorch) This is a super simple ConvNet architecture that achieves over 80% top-1 accuracy on ImageNet

2.8k Jan 04, 2023
Download and preprocess popular sequential recommendation datasets

Sequential Recommendation Datasets This repository collects some commonly used sequential recommendation datasets in recent research papers and provid

125 Dec 06, 2022
DSL for matching Python ASTs

py-ast-rule-engine This library provides a DSL (domain-specific language) to match a pattern inside a Python AST (abstract syntax tree). The library i

1 Dec 18, 2021
Code for the paper "Jukebox: A Generative Model for Music"

Status: Archive (code is provided as-is, no updates expected) Jukebox Code for "Jukebox: A Generative Model for Music" Paper Blog Explorer Colab Insta

OpenAI 6k Jan 02, 2023
The aim of this project is to build an AI bot that can play the Wordle game, or more generally Squabble

Wordle RL The aim of this project is to build an AI bot that can play the Wordle game, or more generally Squabble I know there are more deterministic

Aditya Arora 3 Feb 22, 2022