Learning where to learn - Gradient sparsity in meta and continual learning

Last update: Dec 09, 2022

Overview

In this paper, we investigate the gradient sparsity found by MAML in various continual and few-shot learning scenarios.
Instead of only learning the initialization of the neural network parameters, we additionally meta-learn a set of parameters that are passed through a step function: wherever such a parameter is smaller than 0, gradient descent on the corresponding network parameter is stopped.
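
The gated update described above can be illustrated with a minimal sketch in plain Python. This is not the code from this repository: the names `step`, `sparse_inner_update`, and `mask_params` are hypothetical, the numbers are made up, and treating a mask parameter of exactly 0 as "update" is an assumption. The sketch covers only a single inner-loop step, not how the mask parameters themselves are meta-learned.

```python
def step(m):
    """Step function: 1.0 if m >= 0, else 0.0 (the value at exactly 0 is an assumption)."""
    return 1.0 if m >= 0.0 else 0.0


def sparse_inner_update(theta, grads, mask_params, lr):
    """One gradient-descent step in which each parameter is gated by step(mask).

    Parameters whose mask parameter is below 0 receive no update at all.
    """
    return [t - lr * step(m) * g for t, g, m in zip(theta, grads, mask_params)]


# Made-up values for illustration only.
theta = [0.5, -1.0, 2.0]        # meta-learned initialization
grads = [0.1, 0.2, -0.4]        # task-loss gradients at theta
mask_params = [0.3, -0.7, 1.2]  # meta-learned parameters under the step function

updated = sparse_inner_update(theta, grads, mask_params, lr=0.5)
print(updated)  # the second entry stays at -1.0 because its mask parameter is negative
```

Because the gate is binary, a parameter is either trained with the full learning rate or not trained at all.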

We term this version Sparse-MAML - Link to the paper here.

Interestingly, we see that structured sparsity emerges in both the classic 4-layer ConvNet and a ResNet-12 for few-shot learning. This is accompanied by improved robustness and generalisation across many hyperparameters.
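
One simple way to inspect such structure is to compute, per layer, the fraction of parameters whose training is switched off. The sketch below assumes the learned mask parameters are available as one flat list per layer. The layer names and values are invented for illustration and are not results from the paper, and `frozen_fraction` is a hypothetical helper, not part of this codebase.

```python
def frozen_fraction(mask_params):
    """Fraction of entries whose mask parameter is below 0, i.e. that are not updated."""
    frozen = sum(1 for m in mask_params if m < 0.0)
    return frozen / len(mask_params)


# Hypothetical per-layer mask parameters (made-up values, not from the paper).
layers = {
    "conv1": [-0.2, -0.5, -0.1, 0.4],
    "conv2": [-0.3, 0.2, -0.9, 0.1],
    "head": [0.6, 0.8, 0.3, 0.5],
}

sparsity = {name: frozen_fraction(m) for name, m in layers.items()}
for name, frac in sparsity.items():
    print(f"{name}: {frac:.0%} of parameters frozen")
```

Sparsity would count as structured if these fractions differ systematically between layers rather than being spread uniformly.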

Note that Sparse-MAML is an extremely simple variant of MAML: it can only switch the training of specific parameters on or off, in contrast to proper gradient modulation.

This codebase implements the few-shot learning experiments presented in the paper. To reproduce the results in the paper, please follow these instructions:

Installation

#1. Install a conda env:

conda create -n sparse-MAML

#2. Activate the env:

source activate sparse-MAML

#3. Install anaconda:

conda install anaconda

#4. Install extra requirements (make sure you use the correct pip3):

pip3 install -r requirements.txt

#5. Make the run script executable:

chmod u+x run_sparse_MAML.sh

#6. Execute:

./run_sparse_MAML.sh

Results

MiniImageNet Few-Shot	MAML	ANIL	BOIL	sparse-MAML	sparse-ReLU-MAML
5-way 5-shot | ConvNet	63.15	61.50	66.45	67.03	64.84
5-way 1-shot | ConvNet	48.07	46.70	49.61	50.35	50.39
5-way 5-shot | ResNet12	69.36	70.03	70.50	70.02	73.01
5-way 1-shot | ResNet12	53.91	55.25	-	55.02	56.39

BOIL results are taken from the original paper.
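
To read the table at a glance, the short sketch below copies the numbers above into Python and reports the best method per setting as well as the difference between sparse-MAML and MAML. The variable and function names are illustrative only, and the missing BOIL entry is represented as `None`.

```python
methods = ["MAML", "ANIL", "BOIL", "sparse-MAML", "sparse-ReLU-MAML"]

# Values copied from the results table; None marks the missing BOIL entry.
results = {
    "5-way 5-shot | ConvNet": [63.15, 61.50, 66.45, 67.03, 64.84],
    "5-way 1-shot | ConvNet": [48.07, 46.70, 49.61, 50.35, 50.39],
    "5-way 5-shot | ResNet12": [69.36, 70.03, 70.50, 70.02, 73.01],
    "5-way 1-shot | ResNet12": [53.91, 55.25, None, 55.02, 56.39],
}


def best_method(scores):
    """Return the name of the method with the highest reported score."""
    pairs = [(s, m) for s, m in zip(scores, methods) if s is not None]
    return max(pairs)[1]


best = {setting: best_method(scores) for setting, scores in results.items()}

# Difference between sparse-MAML (column 3) and plain MAML (column 0).
gain_over_maml = {
    setting: round(scores[3] - scores[0], 2) for setting, scores in results.items()
}

for setting in results:
    print(f"{setting}: best = {best[setting]}, "
          f"sparse-MAML vs. MAML = {gain_over_maml[setting]:+.2f}")
```

In every listed setting the best score belongs to one of the two sparse variants, and sparse-MAML scores higher than plain MAML.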

This codebase is heavily built on top of torchmeta.

Owner

Johannes Oswald
