Meta Learning Backpropagation And Improving It (VSML)

Overview

Meta Learning Backpropagation And Improving It (VSML)

This is research code for the NeurIPS 2021 publication Kirsch & Schmidhuber 2021.

Many concepts have been proposed for meta learning with neural networks (NNs), e.g., NNs that learn to reprogram fast weights, Hebbian plasticity, learned learning rules, and meta recurrent NNs. Our Variable Shared Meta Learning (VSML) unifies the above and demonstrates that simple weight-sharing and sparsity in an NN is sufficient to express powerful learning algorithms (LAs) in a reusable fashion. A simple implementation of VSML where the weights of a neural network are replaced by tiny LSTMs allows for implementing the backpropagation LA solely by running in forward-mode. It can even meta learn new LAs that differ from online backpropagation and generalize to datasets outside of the meta training distribution without explicit gradient calculation. Introspection reveals that our meta learned LAs learn through fast association in a way that is qualitatively different from gradient descent.

Installation

Create a virtual env

python3 -m venv venv
. venv/bin/activate

Install pip dependencies

pip3 install --upgrade pip wheel setuptools
pip3 install -r requirements.txt

Initialize weights and biases

wandb init

Inspect your results at https://wandb.ai/.

Run instructions

Non distributed

For any algorithm that does not require multiple workers.

python3 launch.py --config_files CONFIG_FILES --config arg1=val1 arg2=val2

Distributed

For any algorithm that does require multiple workers

GPU_COUNT=4 mpirun -n NUM_WORKERS python3 assign_gpu.py python3 launch.py

where NUM_WORKERS is the number of workers to run. The assign_gpu python script distributes the mpi workers evenly over the specified GPUs

Alternatively, specify the CUDA_VISIBLE_DEVICES instead of GPU_COUNT env variable:

CUDA_VISIBLE_DEVICES=0,2,3 mpirun -n NUM_WORKERS python3 assign_gpu.py python3 launch.py

Slurm-based cluster

Modify slurm/schedule.sh and slurm/job.sh to suit your environment.

bash slurm/schedule.sh --nodes=7 --ntasks-per-node=12 -- python3 launch.py --config_files CONFIG_FILES

If only a single worker is required (non-distributed), set --nodes=1 and --ntasks-per-node=1.

Remote (via ssh)

Modify ssh/schedule.sh to suit your environment. Requires gpustat in .local/bin/gpustat, via pip3 install --user gpustat. Also install tmux and mpirun.

bash ssh/schedule.sh --host HOST_NAME --nodes=7 --ntasks-per-node=12 -- python3 launch.py --config_files CONFIG_FILES

Example training runs

Section 4.2 Figure 6

VSML

slurm/schedule.py --nodes=128 --time 04:00:00 -- python3 launch.py --config_files configs/rand_proj.yaml

You can also try fewer nodes and use --config training.population_size=128. Or use backpropagation-based meta optimization --config_files configs/{rand_proj,backprop}.yaml.

Section 4.4 Figure 8

VSML

slurm/schedule.py --array=1-11 --nodes=128 --time 04:00:00 -- python3 launch.py --array configs/array/datasets.yaml

Meta RNN (Hochreiter 2001)

slurm/schedule.py --array=1-11 --nodes=32 --time 04:00:00 -- python3 launch.py --array configs/array/datasets.yaml --config_files configs/{metarnn,pad}.yaml --tags metarnn

Fast weight memory

slurm/schedule.py --array=1-11 --nodes=32 --time 04:00:00 -- python3 launch.py --array configs/array/datasets.yaml --config_files configs/{fwmemory,pad}.yaml --tags fwmemory

SGD

slurm/schedule.py --array=1-4 --nodes=2 --time 00:15:00 -- python3 launch.py --array configs/array/sgd.yaml --config_files configs/sgd.yaml --tags sgd

Hebbian

slurm/schedule.py --array=1-11 --nodes=32 --time 04:00:00 -- python3 launch.py --array configs/array/datasets.yaml --config_files configs/{hebbian,pad}.yaml --tags hebbian
Owner
Louis Kirsch
Building RL agents that meta-learn their own learning algorithm. Currently pursuing a PhD in AI at IDSIA with Jürgen Schmidhuber. Previous DeepMind intern.
Louis Kirsch
Try out deep learning models online on Google Colab

Try out deep learning models online on Google Colab

Erdene-Ochir Tuguldur 1.5k Dec 27, 2022
Malware Analysis Neural Network project.

MalanaNeuralNetwork Description Malware Analysis Neural Network project. Table of Contents Getting Started Requirements Installation Clone Set-Up VENV

2 Nov 13, 2021
Code release for Local Light Field Fusion at SIGGRAPH 2019

Local Light Field Fusion Project | Video | Paper Tensorflow implementation for novel view synthesis from sparse input images. Local Light Field Fusion

1.1k Dec 27, 2022
Credit fraud detection in Python using a Jupyter Notebook

Credit-Fraud-Detection - Credit fraud detection in Python using a Jupyter Notebook , using three classification models (Random Forest, Gaussian Naive Bayes, Logistic Regression) from the sklearn libr

Ali Akram 4 Dec 28, 2021
Learning to Communicate with Deep Multi-Agent Reinforcement Learning in PyTorch

Learning to Communicate with Deep Multi-Agent Reinforcement Learning This is a PyTorch implementation of the original Lua code release. Overview This

Minqi 297 Dec 12, 2022
Collaborative forensic timeline analysis

Timesketch Table of Contents About Timesketch Getting started Community Contributing About Timesketch Timesketch is an open-source tool for collaborat

Google 2.1k Dec 28, 2022
Pytorch implementation of RED-SDS (NeurIPS 2021).

Recurrent Explicit Duration Switching Dynamical Systems (RED-SDS) This repository contains a reference implementation of RED-SDS, a non-linear state s

Abdul Fatir 10 Dec 02, 2022
PyTorch implementation of paper: AdaAttN: Revisit Attention Mechanism in Arbitrary Neural Style Transfer, ICCV 2021.

AdaAttN: Revisit Attention Mechanism in Arbitrary Neural Style Transfer [Paper] [PyTorch Implementation] [Paddle Implementation] Overview This reposit

148 Dec 30, 2022
Object tracking implemented with YOLOv4, DeepSort, and TensorFlow.

Object tracking implemented with YOLOv4, DeepSort, and TensorFlow. YOLOv4 is a state of the art algorithm that uses deep convolutional neural networks to perform object detections. We can take the ou

The AI Guy 1.1k Dec 29, 2022
ZEBRA: Zero Evidence Biometric Recognition Assessment

ZEBRA: Zero Evidence Biometric Recognition Assessment license: LGPLv3 - please reference our paper version: 2020-06-11 author: Andreas Nautsch (EURECO

Voice Privacy Challenge 2 Dec 12, 2021
Code samples for my book "Neural Networks and Deep Learning"

Code samples for "Neural Networks and Deep Learning" This repository contains code samples for my book on "Neural Networks and Deep Learning". The cod

Michael Nielsen 13.9k Dec 26, 2022
CC-GENERATOR - A python script for generating CC

CC-GENERATOR A python script for generating CC NOTE: This tool is for Educationa

Lêkzï 6 Oct 14, 2022
A collection of pre-trained StyleGAN2 models trained on different datasets at different resolution.

Awesome Pretrained StyleGAN2 A collection of pre-trained StyleGAN2 models trained on different datasets at different resolution. Note the readme is a

Justin 1.1k Dec 24, 2022
A PaddlePaddle implementation of Time Interval Aware Self-Attentive Sequential Recommendation.

TiSASRec.paddle A PaddlePaddle implementation of Time Interval Aware Self-Attentive Sequential Recommendation. Introduction 论文:Time Interval Aware Sel

Paddorch 2 Nov 28, 2021
PyTorch code for JEREX: Joint Entity-Level Relation Extractor

JEREX: "Joint Entity-Level Relation Extractor" PyTorch code for JEREX: "Joint Entity-Level Relation Extractor". For a description of the model and exp

LAVIS - NLP Working Group 50 Dec 01, 2022
Implementation of a Transformer that Ponders, using the scheme from the PonderNet paper

Ponder(ing) Transformer Implementation of a Transformer that learns to adapt the number of computational steps it takes depending on the difficulty of

Phil Wang 65 Oct 04, 2022
GANSketchingJittor - Implementation of Sketch Your Own GAN in Jittor

GANSketching in Jittor Implementation of (Sketch Your Own GAN) in Jittor(计图). Or

Bernard Tan 10 Jul 02, 2022
5 Jan 05, 2023
Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more

Apache MXNet (incubating) for Deep Learning Apache MXNet is a deep learning framework designed for both efficiency and flexibility. It allows you to m

The Apache Software Foundation 20.2k Jan 05, 2023
Graph Transformer Architecture. Source code for

Graph Transformer Architecture Source code for the paper "A Generalization of Transformer Networks to Graphs" by Vijay Prakash Dwivedi and Xavier Bres

NTU Graph Deep Learning Lab 561 Jan 08, 2023