Manifold-Mixup implementation for fastai V2

Overview

Manifold Mixup

Unofficial implementation of ManifoldMixup (Proceedings of ICML 19) for fast.ai (V2) based on Shivam Saboo's pytorch implementation of manifold mixup, fastai's input mixup implementation plus some improvements/variants that I developped with lessw2020.

This package provides four additional callbacks to the fastai learner :

  • ManifoldMixup which implements ManifoldMixup
  • OutputMixup which implements a variant that does the mixup only on the output of the last layer (this was shown to be more performant on a benchmark and an independant blogpost)
  • DynamicManifoldMixup which lets you use manifold mixup with a schedule to increase difficulty progressively
  • DynamicOutputMixup which lets you use manifold mixup with a schedule to increase difficulty progressively

Usage

For a minimal demonstration of the various callbacks and their parameters, see the Demo notebook.

Mixup

To use manifold mixup, you need to import manifold_mixup and pass the corresponding callback to the cbs argument of your learner :

learner = Learner(data, model, cbs=ManifoldMixup())
learner.fit(8)

The ManifoldMixup callback takes three parameters :

  • alpha=0.4 parameter of the beta law used to sample the interpolation weight
  • use_input_mixup=True do you want to apply mixup to the inputs
  • module_list=None can be used to pass an explicit list of target modules

The OutputMixup variant takes only the alpha parameters.

Dynamic mixup

Dynamic callbackss, which are available via dynamic_mixup, take three parameters instead of the single alpha parameter :

  • alpha_min=0.0 the initial, minimum, value for the parameter of the beta law used to sample the interpolation weight (we recommend keeping it to 0)
  • alpha_max=0.6 the final, maximum, value for the parameter of the beta law used to sample the interpolation weight
  • scheduler=SchedCos the scheduling function to describe the evolution of alpha from alpha_min to alpha_max

The default schedulers are SchedLin, SchedCos, SchedNo, SchedExp and SchedPoly. See the Annealing section of fastai2's documentation for more informations on available schedulers, ways to combine them and provide your own.

Notes

Which modules will be intrumented by ManifoldMixup ?

ManifoldMixup tries to establish a sensible list of modules on which to apply mixup:

  • it uses a user provided module_list if possible
  • otherwise it uses only the modules wrapped with ManifoldMixupModule
  • if none are found, it defaults to modules with Block or Bottleneck in their name (targetting mostly resblocks)
  • finaly, if needed, it defaults to all modules that are not included in the non_mixable_module_types list

The non_mixable_module_types list contains mostly recurrent layers but you can add elements to it in order to define module classes that should not be used for mixup (do not hesitate to create an issue or start a PR to add common modules to the default list).

When can I use OutputMixup ?

OutputMixup applies the mixup directly to the output of the last layer. This only works if the loss function contains something like a softmax (and not when it is directly used as it is for regression).

Thus, OutputMixup cannot be used for regression.

A note on skip-connections / residual-blocks

ManifoldMixup (this does not apply to OutputMixup) is greatly degraded when applied inside a residual block. This is due to the mixed-up values becoming incoherent with the output of the skip connection (which have not been mixed).

While this implementation is equiped to work around the problem for U-Net and ResNet like architectures, you might run into problems (negligeable improvements over the baseline) with other network structures. In which case, the best way to apply manifold mixup would be to manually select the modules to be instrumented.

For more unofficial fastai extensions, see the Fastai Extensions Repository.

Owner
Nestor Demeure
PhD, Engineer specialized in computer science and applied mathematics.
Nestor Demeure
Target Propagation via Regularized Inversion

Target Propagation via Regularized Inversion The present code implements an ideal formulation of target propagation using regularized inverses compute

Vincent Roulet 0 Dec 02, 2021
This is the official PyTorch implementation of our paper: "Artistic Style Transfer with Internal-external Learning and Contrastive Learning".

Artistic Style Transfer with Internal-external Learning and Contrastive Learning This is the official PyTorch implementation of our paper: "Artistic S

51 Dec 20, 2022
SAAVN - Sound Adversarial Audio-Visual Navigation,ICLR2022 (In PyTorch)

SAAVN SAAVN Code release for paper "Sound Adversarial Audio-Visual Navigation,IC

YinfengYu 10 Aug 30, 2022
IDM: An Intermediate Domain Module for Domain Adaptive Person Re-ID,

Intermediate Domain Module (IDM) This repository is the official implementation for IDM: An Intermediate Domain Module for Domain Adaptive Person Re-I

Yongxing Dai 87 Nov 22, 2022
A blender add-on that automatically re-aligns wrong axis objects.

Auto Align A blender add-on that automatically re-aligns wrong axis objects. Usage There are three options available in the 3D Viewport Sidebar It

29 Nov 25, 2022
Implementing DropPath/StochasticDepth in PyTorch

%load_ext memory_profiler Implementing Stochastic Depth/Drop Path In PyTorch DropPath is available on glasses my computer vision library! Introduction

Francesco Saverio Zuppichini 13 Jan 05, 2023
Determined: Deep Learning Training Platform

Determined: Deep Learning Training Platform Determined is an open-source deep learning training platform that makes building models fast and easy. Det

Determined AI 2k Dec 31, 2022
Codes for [NeurIPS'21] You are caught stealing my winning lottery ticket! Making a lottery ticket claim its ownership.

You are caught stealing my winning lottery ticket! Making a lottery ticket claim its ownership Codes for [NeurIPS'21] You are caught stealing my winni

VITA 8 Nov 01, 2022
CLIP (Contrastive Language–Image Pre-training) trained on Indonesian data

CLIP-Indonesian CLIP (Radford et al., 2021) is a multimodal model that can connect images and text by training a vision encoder and a text encoder joi

Galuh 17 Mar 10, 2022
Python library for tracking human heads with FLAME (a 3D morphable head model)

Video Head Tracker 3D tracking library for human heads based on FLAME (a 3D morphable head model). The tracking algorithm is inspired by face2face. It

61 Dec 25, 2022
Rename Images with Auto Generated Neural Image Captions

Recaption Images with Generated Neural Image Caption Example Usage: Commandline: Recaption all images from folder /home/feng/Downloads/images to folde

feng wang 3 May 01, 2022
Optimized code based on M2 for faster image captioning training

Transformer Captioning This repository contains the code for Transformer-based image captioning. Based on meshed-memory-transformer, we further optimi

lyricpoem 16 Dec 16, 2022
Yolo ros - YOLO-ROS for HUAWEI ATLAS200

YOLO-ROS YOLO-ROS for NVIDIA YOLO-ROS for HUAWEI ATLAS200, please checkout for b

ChrisLiu 5 Oct 18, 2022
Sparse-dense operators implementation for Paddle

Sparse-dense operators implementation for Paddle This module implements coo, csc and csr matrix formats and their inter-ops with dense matrices. Feel

北海若 3 Dec 17, 2022
Writeups for the challenges from DownUnderCTF 2021

cloud Challenge Author Difficulty Release Round Bad Bucket Blue Alder easy round 1 Not as Bad Bucket Blue Alder easy round 1 Lost n Found Blue Alder m

DownUnderCTF 161 Dec 31, 2022
Research Artifact of USENIX Security 2022 Paper: Automated Side Channel Analysis of Media Software with Manifold Learning

Automated Side Channel Analysis of Media Software with Manifold Learning Official implementation of USENIX Security 2022 paper: Automated Side Channel

Yuanyuan Yuan 175 Jan 07, 2023
Simple Linear 2nd ODE Solver GUI - A 2nd constant coefficient linear ODE solver with simple GUI using euler's method

Simple_Linear_2nd_ODE_Solver_GUI Description It is a 2nd constant coefficient li

:) 4 Feb 05, 2022
Datasets and source code for our paper Webly Supervised Fine-Grained Recognition: Benchmark Datasets and An Approach

Introduction Datasets and source code for our paper Webly Supervised Fine-Grained Recognition: Benchmark Datasets and An Approach Datasets: WebFG-496

21 Sep 30, 2022
[Official] Exploring Temporal Coherence for More General Video Face Forgery Detection(ICCV 2021)

Exploring Temporal Coherence for More General Video Face Forgery Detection(FTCN) Yinglin Zheng, Jianmin Bao, Dong Chen, Ming Zeng, Fang Wen Accepted b

57 Dec 28, 2022
Interpretable-contrastive-word-mover-s-embedding

Interpretable-contrastive-word-mover-s-embedding Paper Datasets Here is a Dropbox link to the datasets used in the paper: https://www.dropbox.com/sh/n

0 Nov 02, 2021