Manifold-Mixup implementation for fastai V2

Overview

Manifold Mixup

Unofficial implementation of ManifoldMixup (Proceedings of ICML 19) for fast.ai (V2), based on Shivam Saboo's pytorch implementation of manifold mixup and fastai's input mixup implementation, plus some improvements/variants that I developed with lessw2020.

This package provides four additional callbacks to the fastai learner:

  • ManifoldMixup which implements ManifoldMixup
  • OutputMixup which implements a variant that applies mixup only to the output of the last layer (this was shown to perform better both in a benchmark and in an independent blogpost)
  • DynamicManifoldMixup which lets you use manifold mixup with a schedule to increase difficulty progressively
  • DynamicOutputMixup which lets you use output mixup with a schedule to increase difficulty progressively

Usage

For a minimal demonstration of the various callbacks and their parameters, see the Demo notebook.

Mixup

To use manifold mixup, import manifold_mixup and pass the corresponding callback to the cbs argument of your learner:

from manifold_mixup import ManifoldMixup

# data and model are your usual fastai DataLoaders and pytorch model
learner = Learner(data, model, cbs=ManifoldMixup())
learner.fit(8)

The ManifoldMixup callback takes three parameters:

  • alpha=0.4: parameter of the beta distribution used to sample the interpolation weight
  • use_input_mixup=True: whether to also apply mixup to the inputs
  • module_list=None: can be used to pass an explicit list of target modules

The OutputMixup variant takes only the alpha parameter; both callbacks are illustrated below.
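
For instance, a minimal sketch of passing these parameters explicitly (assuming both callbacks are exported by manifold_mixup; the values shown are just the defaults):

from manifold_mixup import ManifoldMixup, OutputMixup

# mix hidden activations at a randomly picked module (and, sometimes, the raw inputs)
learner = Learner(data, model, cbs=ManifoldMixup(alpha=0.4, use_input_mixup=True, module_list=None))

# or mix only the output of the last layer
learner = Learner(data, model, cbs=OutputMixup(alpha=0.4))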

Dynamic mixup

Dynamic callbacks, which are available via dynamic_mixup, take three parameters instead of the single alpha parameter:

  • alpha_min=0.0: the initial, minimum, value of the parameter of the beta distribution used to sample the interpolation weight (we recommend keeping it at 0)
  • alpha_max=0.6: the final, maximum, value of the parameter of the beta distribution used to sample the interpolation weight
  • scheduler=SchedCos: the scheduling function describing the evolution of alpha from alpha_min to alpha_max

The default schedulers are SchedLin, SchedCos, SchedNo, SchedExp and SchedPoly. See the Annealing section of fastai's documentation for more information on available schedulers, ways to combine them, and how to provide your own.
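
A minimal sketch of a dynamic callback (assuming DynamicManifoldMixup is exported by dynamic_mixup and that SchedCos lives in fastai's fastai.callback.schedule module; the values shown are the defaults):

from fastai.callback.schedule import SchedCos
from dynamic_mixup import DynamicManifoldMixup

# alpha is annealed from alpha_min to alpha_max over training, following a cosine schedule
learner = Learner(data, model, cbs=DynamicManifoldMixup(alpha_min=0.0, alpha_max=0.6, scheduler=SchedCos))
learner.fit(8)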

Notes

Which modules will be instrumented by ManifoldMixup?

ManifoldMixup tries to establish a sensible list of modules on which to apply mixup:

  • it uses a user-provided module_list if one is given
  • otherwise, it uses only the modules wrapped with ManifoldMixupModule
  • if none are found, it defaults to modules with Block or Bottleneck in their name (targeting mostly resblocks)
  • finally, if needed, it defaults to all modules that are not included in the non_mixable_module_types list

The non_mixable_module_types list contains mostly recurrent layers, but you can add elements to it in order to define module classes that should not be used for mixup (do not hesitate to create an issue or start a PR to add common modules to the default list). Both mechanisms are sketched below.
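
A hedged sketch of both mechanisms (assuming ManifoldMixupModule and non_mixable_module_types are importable from manifold_mixup, and that the wrapper takes the wrapped module as its only argument):

from torch import nn
from manifold_mixup import ManifoldMixupModule, non_mixable_module_types

# explicitly mark a module as a mixup target by wrapping it
model = nn.Sequential(
    nn.Linear(10, 50),
    ManifoldMixupModule(nn.ReLU()),  # mixup may be applied to this module's output
    nn.Linear(50, 2))

# declare a module class as unsuitable for mixup
non_mixable_module_types.append(nn.GRU)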

When can I use OutputMixup?

OutputMixup applies the mixup directly to the output of the last layer. This works only when the loss function applies something like a softmax to that output, and not when the output is used directly, as is the case for regression.

Thus, OutputMixup cannot be used for regression.

A note on skip-connections / residual-blocks

ManifoldMixup (this does not apply to OutputMixup) is greatly degraded when applied inside a residual block. This is because the mixed-up values become incoherent with the output of the skip connection, which has not been mixed.

While this implementation is equipped to work around the problem for U-Net and ResNet like architectures, you might run into problems (negligible improvements over the baseline) with other network structures. In that case, the best way to apply manifold mixup is to manually select the modules to be instrumented, as sketched below.
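
For example, a minimal sketch of manual selection (MyResBlock is a hypothetical block class whose outputs sit outside any skip connection):

# hypothetical: collect the modules that are safe to mix (outside residual branches)
target_modules = [m for m in model.modules() if isinstance(m, MyResBlock)]
learner = Learner(data, model, cbs=ManifoldMixup(module_list=target_modules))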

For more unofficial fastai extensions, see the Fastai Extensions Repository.
