Ranger - a synergistic optimizer using RAdam (Rectified Adam), Gradient Centralization and LookAhead in one codebase

Overview

Ranger-Deep-Learning-Optimizer


Ranger - a synergistic optimizer combining RAdam (Rectified Adam) and LookAhead, and now GC (gradient centralization) in one optimizer.

quick note - Ranger21 is now in beta and is Ranger with a host of new improvements.

Recommend you compare results with Ranger21: https://github.com/lessw2020/Ranger21

Latest version 20.9.4 - updates Gradient Centralization to GC2 (thanks to GC developer) and removes addcmul_ deprecation warnings in PyTorch 1.60.



*Latest version is in ranger2020.py - looking at a few other additions before integrating into the main ranger.py.

What is Gradient Centralization? = "GC can be viewed as a projected gradient descent method with a constrained loss function. The Lipschitzness of the constrained loss function and its gradient is better so that the training process becomes more efficient and stable." Source paper: https://arxiv.org/abs/2004.01461v2
Ranger now uses Gradient Centralization by default, and applies it to all conv and fc layers by default. However, everything is customizable so you can test with and without on your own datasets. (Turn on off via "use_gc" flag at init).

Best training results - use a 75% flat lr, then step down and run lower lr for 25%, or cosine descend last 25%.


Per extensive testing - It's important to note that simply running one learning rate the entire time will not produce optimal results.
Effectively Ranger will end up 'hovering' around the optimal zone, but can't descend into it unless it has some additional run time at a lower rate to drop down into the optimal valley.

Full customization at init:


Ranger will now print out id and gc settings at init so you can confirm the optimizer settings at train time:

/////////////////////

Medium article with more info:
https://medium.com/@lessw/new-deep-learning-optimizer-ranger-synergistic-combination-of-radam-lookahead-for-the-best-of-2dc83f79a48d

Multiple updates: 1 - Ranger is the optimizer we used to beat the high scores for 12 different categories on the FastAI leaderboards! (Previous records all held with AdamW optimizer).

2 - Highly recommend combining Ranger with: Mish activation function, and flat+ cosine anneal training curve.

3 - Based on that, also found .95 is better than .90 for beta1 (momentum) param (ala betas=(0.95, 0.999)).

Fixes: 1 - Differential Group learning rates now supported. This was fix in RAdam and ported here thanks to @sholderbach. 2 - save and then load may leave first run weights stranded in memory, slowing down future runs = fixed.

Installation

Clone the repo, cd into it and install it in editable mode (-e option). That way, these is no more need to re-install the package after modification.

git clone https://github.com/lessw2020/Ranger-Deep-Learning-Optimizer
cd Ranger-Deep-Learning-Optimizer
pip install -e . 

Usage

from ranger import Ranger  # this is from ranger.py
from ranger import RangerVA  # this is from ranger913A.py
from ranger import RangerQH  # this is from rangerqh.py

# Define your model
model = ...
# Each of the Ranger, RangerVA, RangerQH have different parameters.
optimizer = Ranger(model.parameters(), **kwargs)

Usage and notebook to test are available here: https://github.com/lessw2020/Ranger-Mish-ImageWoof-5

Citing this work

We recommend you use the following to cite Ranger in your publications:

@misc{Ranger,
  author = {Wright, Less},
  title = {Ranger - a synergistic optimizer.},
  year = {2019},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/lessw2020/Ranger-Deep-Learning-Optimizer}}
}
Owner
Less Wright
Principal Software Engineer at Audere PM/Test/Dev at Microsoft Software Architect at X10 Wireless
Less Wright
Computational Pathology Toolbox developed by TIA Centre, University of Warwick.

TIA Toolbox Computational Pathology Toolbox developed at the TIA Centre Getting Started All Users This package is for those interested in digital path

Tissue Image Analytics (TIA) Centre 156 Jan 08, 2023
Space-invaders - Simple Game created using Python & PyGame, as my Beginner Python Project

Space Invaders This is a simple SPACE INVADER game create using PYGAME whihc hav

Gaurav Pandey 2 Jan 08, 2022
Official repo for BMVC2021 paper ASFormer: Transformer for Action Segmentation

ASFormer: Transformer for Action Segmentation This repo provides training & inference code for BMVC 2021 paper: ASFormer: Transformer for Action Segme

42 Dec 23, 2022
DECAF: Deep Extreme Classification with Label Features

DECAF DECAF: Deep Extreme Classification with Label Features @InProceedings{Mittal21, author = "Mittal, A. and Dahiya, K. and Agrawal, S. and Sain

46 Nov 06, 2022
Official implementation of "Can You Spot the Chameleon? Adversarially Camouflaging Images from Co-Salient Object Detection" in CVPR 2022.

Jadena Official implementation of "Can You Spot the Chameleon? Adversarially Camouflaging Images from Co-Salient Object Detection" in CVPR 2022. arXiv

Qing Guo 13 Nov 29, 2022
Matplotlib Image labeller for classifying images

mpl-image-labeller Use Matplotlib to label images for classification. Works anywhere Matplotlib does - from the notebook to a standalone gui! For more

Ian Hunt-Isaak 5 Sep 24, 2022
A Joint Video and Image Encoder for End-to-End Retrieval

Frozen️ in Time ❄️ ️️️️ ⏳ A Joint Video and Image Encoder for End-to-End Retrieval project page | arXiv | webvid-data Repository containing the code,

225 Dec 25, 2022
A hifiasm fork for metagenome assembly using Hifi reads.

hifiasm_meta - de novo metagenome assembler, based on hifiasm, a haplotype-resolved de novo assembler for PacBio Hifi reads.

44 Jul 10, 2022
A framework to train language models to learn invariant representations.

Invariant Language Modeling Implementation of the training for invariant language models. Motivation Modern pretrained language models are critical co

6 Nov 16, 2022
We will see a basic program that is basically a hint to brute force attack to crack passwords. In other words, we will make a program to Crack Any Password Using Python. Show some ❤️ by starring this repository!

Crack Any Password Using Python We will see a basic program that is basically a hint to brute force attack to crack passwords. In other words, we will

Ananya Chatterjee 11 Dec 03, 2022
3D-Reconstruction 基于深度学习方法的单目多视图三维重建

基于深度学习方法的单目多视图三维重建 Part I 三维重建 代码:Part1 技术文档:[Markdown] [PDF] 原始图像:Original Images 点云结果:Point Cloud Results-1

HMT_Curo 19 Dec 26, 2022
A Large Scale Benchmark for Individual Treatment Effect Prediction and Uplift Modeling

large-scale-ITE-UM-benchmark This repository contains code and data to reproduce the results of the paper "A Large Scale Benchmark for Individual Trea

10 Nov 19, 2022
Repository containing detailed experiments related to the paper "Memotion Analysis through the Lens of Joint Embedding".

Memotion Analysis Through The Lens Of Joint Embedding This repository contains the experiments conducted as described in the paper 'Memotion Analysis

Nethra Gunti 1 Mar 16, 2022
a minimal terminal with python 😎😉

Meterm a terminal with python 😎 How to use Clone Project: $ git clone https://github.com/motahharm/meterm.git Run: in Terminal: meterm.exe Or pip ins

Motahhar.Mokfi 5 Jan 28, 2022
[CVPR 2021] Involution: Inverting the Inherence of Convolution for Visual Recognition, a brand new neural operator

involution Official implementation of a neural operator as described in Involution: Inverting the Inherence of Convolution for Visual Recognition (CVP

Duo Li 1.3k Dec 28, 2022
(ICCV'21) Official PyTorch implementation of Relational Embedding for Few-Shot Classification

Relational Embedding for Few-Shot Classification (ICCV 2021) Dahyun Kang, Heeseung Kwon, Juhong Min, Minsu Cho [paper], [project hompage] We propose t

Dahyun Kang 82 Dec 24, 2022
Code associated with the paper "Towards Understanding the Data Dependency of Mixup-style Training".

Mixup-Data-Dependency Code associated with the paper "Towards Understanding the Data Dependency of Mixup-style Training". Running Alternating Line Exp

Muthu Chidambaram 0 Nov 11, 2021
SPLADE: Sparse Lexical and Expansion Model for First Stage Ranking

SPLADE 🍴 + 🥄 = 🔎 This repository contains the weights for four models as well as the code for running inference for our two papers: [v1]: SPLADE: S

NAVER 170 Dec 28, 2022
An University Project of Quera Web Crawling.

WebCrawlerProject An University Project of Quera Web Crawling. خزشگر اینستاگرام در این پروژه شما باید با استفاده از کتابخانه های زیر یک خزشگر اینستاگر

Mahdi 3 Aug 12, 2022