Meta Learning Backpropagation And Improving It (VSML)

Overview

Meta Learning Backpropagation And Improving It (VSML)

This is research code for the NeurIPS 2021 publication Kirsch & Schmidhuber 2021.

Many concepts have been proposed for meta learning with neural networks (NNs), e.g., NNs that learn to reprogram fast weights, Hebbian plasticity, learned learning rules, and meta recurrent NNs. Our Variable Shared Meta Learning (VSML) unifies the above and demonstrates that simple weight-sharing and sparsity in an NN is sufficient to express powerful learning algorithms (LAs) in a reusable fashion. A simple implementation of VSML where the weights of a neural network are replaced by tiny LSTMs allows for implementing the backpropagation LA solely by running in forward-mode. It can even meta learn new LAs that differ from online backpropagation and generalize to datasets outside of the meta training distribution without explicit gradient calculation. Introspection reveals that our meta learned LAs learn through fast association in a way that is qualitatively different from gradient descent.

Installation

Create a virtual env

python3 -m venv venv
. venv/bin/activate

Install pip dependencies

pip3 install --upgrade pip wheel setuptools
pip3 install -r requirements.txt

Initialize weights and biases

wandb init

Inspect your results at https://wandb.ai/.

Run instructions

Non distributed

For any algorithm that does not require multiple workers.

python3 launch.py --config_files CONFIG_FILES --config arg1=val1 arg2=val2

Distributed

For any algorithm that does require multiple workers

GPU_COUNT=4 mpirun -n NUM_WORKERS python3 assign_gpu.py python3 launch.py

where NUM_WORKERS is the number of workers to run. The assign_gpu python script distributes the mpi workers evenly over the specified GPUs

Alternatively, specify the CUDA_VISIBLE_DEVICES instead of GPU_COUNT env variable:

CUDA_VISIBLE_DEVICES=0,2,3 mpirun -n NUM_WORKERS python3 assign_gpu.py python3 launch.py

Slurm-based cluster

Modify slurm/schedule.sh and slurm/job.sh to suit your environment.

bash slurm/schedule.sh --nodes=7 --ntasks-per-node=12 -- python3 launch.py --config_files CONFIG_FILES

If only a single worker is required (non-distributed), set --nodes=1 and --ntasks-per-node=1.

Remote (via ssh)

Modify ssh/schedule.sh to suit your environment. Requires gpustat in .local/bin/gpustat, via pip3 install --user gpustat. Also install tmux and mpirun.

bash ssh/schedule.sh --host HOST_NAME --nodes=7 --ntasks-per-node=12 -- python3 launch.py --config_files CONFIG_FILES

Example training runs

Section 4.2 Figure 6

VSML

slurm/schedule.py --nodes=128 --time 04:00:00 -- python3 launch.py --config_files configs/rand_proj.yaml

You can also try fewer nodes and use --config training.population_size=128. Or use backpropagation-based meta optimization --config_files configs/{rand_proj,backprop}.yaml.

Section 4.4 Figure 8

VSML

slurm/schedule.py --array=1-11 --nodes=128 --time 04:00:00 -- python3 launch.py --array configs/array/datasets.yaml

Meta RNN (Hochreiter 2001)

slurm/schedule.py --array=1-11 --nodes=32 --time 04:00:00 -- python3 launch.py --array configs/array/datasets.yaml --config_files configs/{metarnn,pad}.yaml --tags metarnn

Fast weight memory

slurm/schedule.py --array=1-11 --nodes=32 --time 04:00:00 -- python3 launch.py --array configs/array/datasets.yaml --config_files configs/{fwmemory,pad}.yaml --tags fwmemory

SGD

slurm/schedule.py --array=1-4 --nodes=2 --time 00:15:00 -- python3 launch.py --array configs/array/sgd.yaml --config_files configs/sgd.yaml --tags sgd

Hebbian

slurm/schedule.py --array=1-11 --nodes=32 --time 04:00:00 -- python3 launch.py --array configs/array/datasets.yaml --config_files configs/{hebbian,pad}.yaml --tags hebbian
Owner
Louis Kirsch
Building RL agents that meta-learn their own learning algorithm. Currently pursuing a PhD in AI at IDSIA with Jürgen Schmidhuber. Previous DeepMind intern.
Louis Kirsch
Continual Learning of Electronic Health Records (EHR).

Continual Learning of Longitudinal Health Records Repo for reproducing the experiments in Continual Learning of Longitudinal Health Records (2021). Re

Jacob 7 Oct 21, 2022
This repo contains source code and materials for the TEmporally COherent GAN SIGGRAPH project.

TecoGAN This repository contains source code and materials for the TecoGAN project, i.e. code for a TEmporally COherent GAN for video super-resolution

Nils Thuerey 5.2k Jan 02, 2023
PyTorch implementation for our paper "Deep Facial Synthesis: A New Challenge"

FSGAN Here is the official PyTorch implementation for our paper "Deep Facial Synthesis: A New Challenge". This project achieve the translation between

Deng-Ping Fan 32 Oct 10, 2022
The official implementation of VAENAR-TTS, a VAE based non-autoregressive TTS model.

VAENAR-TTS This repo contains code accompanying the paper "VAENAR-TTS: Variational Auto-Encoder based Non-AutoRegressive Text-to-Speech Synthesis". Sa

THUHCSI 138 Oct 28, 2022
K-FACE Analysis Project on Pytorch

Installation Setup with Conda # create a new environment conda create --name insightKface python=3.7 # or over conda activate insightKface #install t

Jung Jun Uk 7 Nov 10, 2022
Code for our paper "Interactive Analysis of CNN Robustness"

Perturber Code for our paper "Interactive Analysis of CNN Robustness" Datasets Feature visualizations: Google Drive Fine-tuning checkpoints as saved m

Stefan Sietzen 0 Aug 17, 2021
Learning-based agent for Google Research Football

TiKick 1.Introduction Learning-based agent for Google Research Football Code accompanying the paper "TiKick: Towards Playing Multi-agent Football Full

Tsinghua AI Research Team for Reinforcement Learning 90 Dec 26, 2022
The implementation of CVPR2021 paper Temporal Query Networks for Fine-grained Video Understanding, by Chuhan Zhang, Ankush Gupta and Andrew Zisserman.

Temporal Query Networks for Fine-grained Video Understanding 📋 This repository contains the implementation of CVPR2021 paper Temporal_Query_Networks

55 Dec 21, 2022
The official implementation of paper Siamese Transformer Pyramid Networks for Real-Time UAV Tracking, accepted by WACV22

SiamTPN Introduction This is the official implementation of the SiamTPN (WACV2022). The tracker intergrates pyramid feature network and transformer in

Robotics and Intelligent Systems Control @ NYUAD 28 Nov 25, 2022
Radar-to-Lidar: Heterogeneous Place Recognition via Joint Learning

radar-to-lidar-place-recognition This page is the coder of a pre-print, implemented by PyTorch. If you have some questions on this project, please fee

Huan Yin 37 Oct 09, 2022
PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation

PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation Created by Charles R. Qi, Hao Su, Kaichun Mo, Leonidas J. Guibas from Sta

Charles R. Qi 4k Dec 30, 2022
Detectron2-FC a fast construction platform of neural network algorithm based on detectron2

What is Detectron2-FC Detectron2-FC a fast construction platform of neural network algorithm based on detectron2. We have been working hard in two dir

董晋宗 9 Jun 06, 2022
Advanced Deep Learning with TensorFlow 2 and Keras (Updated for 2nd Edition)

Advanced Deep Learning with TensorFlow 2 and Keras (Updated for 2nd Edition)

Packt 1.5k Jan 03, 2023
Visualization toolkit for neural networks in PyTorch! Demo -->

FlashTorch A Python visualization toolkit, built with PyTorch, for neural networks in PyTorch. Neural networks are often described as "black box". The

Misa Ogura 692 Dec 29, 2022
A lane detection integrated Real-time Instance Segmentation based on YOLACT (You Only Look At CoefficienTs)

Real-time Instance Segmentation and Lane Detection This is a lane detection integrated Real-time Instance Segmentation based on YOLACT (You Only Look

Jin 4 Dec 30, 2022
Neural Message Passing for Computer Vision

Neural Message Passing for Quantum Chemistry Implementation of different models of Neural Networks on graphs as explained in the article proposed by G

Pau Riba 310 Nov 07, 2022
Pytorch implementation of Nueral Style transfer

Nueral Style Transfer Pytorch implementation of Nueral style transfer algorithm , it is used to apply artistic styles to content images . Content is t

Abhinav 9 Oct 15, 2022
PyExplainer: A Local Rule-Based Model-Agnostic Technique (Explainable AI)

PyExplainer PyExplainer is a local rule-based model-agnostic technique for generating explanations (i.e., why a commit is predicted as defective) of J

AI Wizards for Software Management (AWSM) Research Group 14 Nov 13, 2022
Dynamic Graph Event Detection

DyGED Dynamic Graph Event Detection Get Started pip install -r requirements.txt TODO Paper link to arxiv, and how to cite. Twitter Weather dataset tra

Mert Koşan 3 May 09, 2022
Deep universal probabilistic programming with Python and PyTorch

Getting Started | Documentation | Community | Contributing Pyro is a flexible, scalable deep probabilistic programming library built on PyTorch. Notab

7.7k Dec 30, 2022