Paper and Code for "Curriculum Learning by Optimizing Learning Dynamics" (AISTATS 2021)


Curriculum Learning by Optimizing Learning Dynamics (DoCL)

AISTATS 2021 paper:

Title: Curriculum Learning by Optimizing Learning Dynamics [pdf] [appendix] [slides]
Authors: Tianyi Zhou, Shengjie Wang, Jeff A. Bilmes
Institute: University of Washington, Seattle

@inproceedings{
    zhou2020docl,
    title={Curriculum Learning by Optimizing Learning Dynamics},
    author={Tianyi Zhou and Shengjie Wang and Jeff A. Bilmes},
    booktitle={Proceedings of The 24th International Conference on Artificial Intelligence and Statistics (AISTATS)},
    year={2021},
}

Abstract
We study a novel curriculum learning scheme where in each round, samples are selected to achieve the greatest progress and fastest learning speed towards the ground-truth on all available samples. Inspired by an analysis of optimization dynamics under gradient flow for both regression and classification, the problem reduces to selecting training samples by a score computed from samples’ residual and linear temporal dynamics. It encourages the model to focus on the samples at learning frontier, i.e., those with large loss but fast learning speed. The scores in discrete time can be estimated via already-available byproducts of training, and thus require a negligible amount of extra computation. We discuss the properties and potential advantages of the proposed dynamics optimization via current deep learning theory and empirical study. By integrating it with cyclical training of neural networks, we introduce "dynamics-optimized curriculum learning (DoCL)", which selects the training set for each step by weighted sampling based on the scores. On nine different datasets, DoCL significantly outperforms random mini-batch SGD and recent curriculum learning methods both in terms of efficiency and final performance.
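The "weighted sampling based on the scores" step can be illustrated with a minimal sketch. It assumes the per-sample scores are already available as non-negative numbers; how DoCL computes them is not reproduced here. The function name `weighted_sample` and the key-based sampling trick (Efraimidis-Spirakis) are illustrative choices, not taken from docl.py.

```python
import heapq
import math
import random


def weighted_sample(scores, k, seed=None):
    """Draw k distinct indices, favoring samples with larger scores.

    Hypothetical helper: each index i gets the random key log(u_i) / score_i
    and the k largest keys are kept (weighted sampling without replacement).
    """
    rng = random.Random(seed)
    keyed = []
    for i, w in enumerate(scores):
        if w < 0:
            raise ValueError("scores must be non-negative")
        u = 1.0 - rng.random()  # uniform in (0, 1], so log(u) is defined
        # Zero-score samples get the lowest possible key and are picked last.
        key = math.log(u) / w if w > 0 else float("-inf")
        keyed.append((key, i))
    return [i for _, i in heapq.nlargest(k, keyed)]


# Example: pick 2 of 4 samples; index 0 has score 0 and is never preferred.
selected = weighted_sample([0.0, 5.0, 1.0, 3.0], k=2, seed=0)
```

Higher-scored samples are more likely to be selected, but lower-scored ones keep a non-zero chance, which is the usual reason to sample by weight rather than take a hard top-k.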

Usage

Prerequisites

Instructions

  • For now, we keep all the DoCL code in docl.py. It supports multiple datasets and models. You can add your own options.
  • Example scripts to run DoCL on CIFAR10/100 for training WideResNet-28-10 can be found in docl_cifar.sh.
  • Training is split into multiple episodes, each following a cosine-annealing learning rate that decreases from --lr_max to --lr_min. Episode boundaries are given as epoch numbers, for example, --epochs 300 --schedule 0 5 10 15 20 30 40 60 90 140 210 300.
  • DoCL reduces the size of the selected subset over the training episodes, starting from n (the total number of training samples). The reduction is controlled by three options, for example, --k 1.0 --dk 0.1 --mk 0.3, which starts from a subset of size (k * n) and multiplies it by (1 - dk) until it reaches (mk * n).
  • To shrink the subset below n in earlier epochs and save more computation, add --use_centrality. This further prunes the DoCL-selected subset to a few diverse and representative samples according to their centrality (defined on the pairwise similarity between samples). Set the selection ratio and how it changes every episode: for example, --select_ratio 0.5 --select_ratio_rate 1.1 halves the DoCL-selected subset in the first non-warm-starting episode and then multiplies the ratio by 1.1 in every later episode until it reaches 1.
  • Centrality is used here as an alternative to the facility location function in the paper to encourage diversity. Facility location requires an external submodular maximization library and extra computation, whereas centrality does not. We may add submodular maximization as an option in the future, but centrality performs well enough on most tested tasks.
  • Self-supervised learning may help in some scenarios. Two types of self-supervision regularizations are supported, i.e., --consistency and --contrastive.
  • To try DoCL on noisy-label learning (though this is not the focus of the paper), add --use_noisylabel and specify the noise type and rate using --label_noise_type and --label_noise_rate.
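The schedules described in the instructions above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code in docl.py: the function names are hypothetical, the learning rate is assumed to restart at --lr_max at every episode boundary and to be updated once per epoch, and the subset size and selection ratio are assumed to change once per episode.

```python
import math


def cosine_lr(epoch, schedule, lr_max, lr_min):
    """Learning rate at `epoch`, with one cosine decay per episode.

    `schedule` lists episode boundaries in epochs, e.g. [0, 5, 10, ...].
    """
    for start, end in zip(schedule[:-1], schedule[1:]):
        if start <= epoch < end:
            progress = (epoch - start) / (end - start)
            return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * progress))
    raise ValueError("epoch lies outside the schedule")


def subset_sizes(n, num_episodes, k=1.0, dk=0.1, mk=0.3):
    """Subset size per episode: start at k * n, shrink by (1 - dk), floor at mk * n."""
    sizes = []
    frac = k
    for _ in range(num_episodes):
        sizes.append(int(round(frac * n)))
        frac = max(frac * (1 - dk), mk)
    return sizes


def select_ratios(num_episodes, select_ratio=0.5, select_ratio_rate=1.1):
    """Centrality selection ratio per episode: grow by the rate, capped at 1."""
    ratios = []
    ratio = select_ratio
    for _ in range(num_episodes):
        ratios.append(min(ratio, 1.0))
        ratio = min(ratio * select_ratio_rate, 1.0)
    return ratios


# Example with the option values quoted above and n = 50000 training samples.
schedule = [0, 5, 10, 15, 20, 30, 40, 60, 90, 140, 210, 300]
sizes = subset_sizes(50000, num_episodes=len(schedule) - 1)
ratios = select_ratios(num_episodes=len(schedule) - 1)
```

With these values the subset size starts at n and the selection ratio starts at 0.5; whether warm-starting episodes are counted in either schedule depends on the actual implementation.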

License
This project is licensed under the terms of the MIT license.

Owner
Tianyi Zhou