ViViT: Curvature access through the generalized Gauss-Newton's low-rank structure

Last update: Dec 08, 2022

Overview

[ 👷 🏗 👷 🏗 Coming soon! Official release with improved docs. Stay tuned. 👷 🏗 👷 🏗 ]

ViViT is a collection of numerical tricks for efficiently accessing curvature from the generalized Gauss-Newton (GGN) matrix by exploiting its low-rank structure. It can compute:

GGN eigenvalues
GGN eigenpairs (eigenvalues + eigenvectors)
1ˢᵗ- and 2ⁿᵈ-order directional derivatives along GGN eigenvectors
Newton steps

To reduce cost, these operations can additionally approximate the GGN via sub-sampling, Monte-Carlo approximation, and block-diagonal approximation.
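The low-rank structure can be illustrated with a small NumPy sketch. It assumes the GGN factorizes as G = V Vᵀ with a tall matrix V of shape [D, NC] (D parameters, NC columns); the names `V`, `G`, and `gram` are illustrative and not part of ViViT's API. Under that assumption, the nonzero eigenvalues of G coincide with those of the much smaller Gram matrix Vᵀ V.

```python
import numpy as np

rng = np.random.default_rng(0)
D, NC = 50, 6  # assumed sizes: D parameters, NC columns in the factor V
V = rng.standard_normal((D, NC))  # random stand-in for the GGN factor

G = V @ V.T     # full GGN, shape [D, D] (built here only for comparison)
gram = V.T @ V  # Gram matrix, shape [NC, NC]

gram_evals = np.linalg.eigvalsh(gram)    # eigenvalues in ascending order
ggn_evals = np.linalg.eigvalsh(G)[-NC:]  # the NC largest eigenvalues of G

# The small Gram matrix carries the same nonzero spectrum as the large GGN.
print(np.allclose(gram_evals, ggn_evals))
```

The eigendecomposition of the Gram matrix costs O(NC³) instead of O(D³), which is where the savings come from when D is much larger than NC.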

How does it work? ViViT uses and extends BackPACK for PyTorch. The described functionality is realized through a combination of existing and new BackPACK extensions and hooks into its backpropagation.

Installation

👷 🏗 👷 🏗 The PyPI release is coming soon. 👷 🏗 👷 🏗

For now, you need to install from GitHub via

pip install vivit-for-pytorch@git+https://github.com/f-dangel/vivit.git#egg=vivit-for-pytorch

Examples

👷 🏗 👷 🏗 Coming soon! 👷 🏗 👷 🏗

How to cite

If you use ViViT, please consider citing the paper:

@misc{dangel2022vivit,
      title={{ViViT}: Curvature access through the generalized Gauss-Newton's low-rank structure},
      author={Felix Dangel and Lukas Tatzel and Philipp Hennig},
      year={2022},
      eprint={2106.02624},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

Comments

[ADD] Warn about instabilities if eigenvalues are small

The directional gradient computation and transformation of the Newton step from Gram space into parameter space require division by the square root of the direction's eigenvalue. This is unstable if the eigenvalue is close to zero.

opened by f-dangel 1
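One possible guard against this instability is sketched below. It is not ViViT's implementation: `stable_directions` and its `rel_tol` threshold are hypothetical, and the threshold is taken relative to the largest eigenvalue as an assumption. The helper warns and returns a mask so that the division by the square root is only carried out for well-conditioned directions.

```python
import warnings

import numpy as np


def stable_directions(evals, rel_tol=1e-6):
    """Return a boolean mask of eigenvalues that are safe to divide by.

    Hypothetical helper: `rel_tol` is relative to the largest eigenvalue.
    """
    evals = np.asarray(evals, dtype=float)
    mask = evals > rel_tol * evals.max()
    if not mask.all():
        warnings.warn(
            f"{int((~mask).sum())} eigenvalue(s) are close to zero; "
            "dividing by their square root would be unstable."
        )
    return mask


evals = np.array([3.0, 0.5, 1e-12])
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    mask = stable_directions(evals)

# The scaling 1 / sqrt(eigenvalue) is only evaluated for the kept directions.
scaling = 1.0 / np.sqrt(evals[mask])
```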
[ADD] Clean `DirectionalDampedNewtonComputation`
Adds directionally damped Newton step computation with cleaned up API.

Fixes a bug in the eigenvalue criterion in the tests. It always picked one more eigenvalue than specified.
opened by f-dangel 1
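As a rough illustration of a directionally damped Newton step, the NumPy sketch below works with an explicit GGN factor `V` and gradient `g`, both random stand-ins. It assumes the step has the form -Σₖ γₖ / (λₖ + δₖ) eₖ, where eₖ are GGN eigenvectors, λₖ their eigenvalues, γₖ = eₖᵀ g the directional gradients, and δₖ per-direction damping values chosen here arbitrarily. None of the names belong to ViViT's API. The index selection picks exactly the `k` largest eigenvalues, which is the behavior the fixed test criterion is meant to check.

```python
import numpy as np

rng = np.random.default_rng(0)
D, NC, k = 30, 5, 3
V = rng.standard_normal((D, NC))  # stand-in for the GGN factor, G = V V^T
g = rng.standard_normal(D)        # stand-in for the gradient

gram_evals, gram_evecs = np.linalg.eigh(V.T @ V)  # ascending eigenvalues
top = np.arange(NC - k, NC)  # indices of exactly the k largest eigenvalues

evals = gram_evals[top]
evecs = V @ gram_evecs[:, top] / np.sqrt(evals)  # parameter-space directions

gammas = evecs.T @ g      # first-order directional derivatives
deltas = np.full(k, 0.1)  # hypothetical damping, one value per direction

# Damped Newton step restricted to the span of the selected directions.
step = -evecs @ (gammas / (evals + deltas))
```

With constant damping δ, this step solves (G + δI) s = -P g, where P projects onto the selected eigenvectors.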
[DOC] Add NTK example

Adds an example inspired by the functorch tutorial on NTKs. It demonstrates how to use vivit to compute empirical NTK matrices and makes a comparison with the functorch implementation.

opened by f-dangel 1
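For orientation, the empirical NTK between two input batches is the product of the model's parameter Jacobians, K(x₁, x₂) = J(x₁) J(x₂)ᵀ. The sketch below computes it for a toy two-parameter model with a hand-written Jacobian. It uses neither vivit nor functorch, and the model, `jacobian`, and all values are made up for illustration.

```python
import numpy as np


def model(x, theta):
    """Toy scalar model f(x) = a * tanh(b * x) with parameters theta = (a, b)."""
    a, b = theta
    return a * np.tanh(b * x)


def jacobian(x, theta):
    """Jacobian of the model outputs w.r.t. (a, b), shape [N, 2]."""
    a, b = theta
    t = np.tanh(b * x)
    return np.stack([t, a * x * (1.0 - t**2)], axis=1)


theta = np.array([0.7, -1.3])
x1 = np.array([-1.0, 0.2, 0.9])
x2 = np.array([0.5, 1.5])

# Empirical NTK between the two batches, shape [len(x1), len(x2)].
ntk = jacobian(x1, theta) @ jacobian(x2, theta).T
```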
[ADD] Simplify `DirectionalDerivatives` API
Exotic features, like using different GGNs to compute directions and directional curvatures, as well as full control of which intermediate buffers to keep, have been deprecated in favor of a simpler API.

Remove Newton step computation for now as it was internally relying on DirectionalDerivatives

Remove many utilities and associated tests from the exotic features

Forbid duplicate indices in subsampling

Always delete intermediate buffers other than the target quantities
opened by f-dangel 1
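A check for duplicate sub-sampling indices could look like the following sketch. `check_subsampling` is a hypothetical helper, not the library's code; it assumes that `subsampling` is either `None` (use all samples) or a list of integer indices into the mini-batch.

```python
def check_subsampling(subsampling, batch_size):
    """Validate sample indices used for sub-sampling (hypothetical helper)."""
    if subsampling is None:
        return  # no sub-sampling requested, all samples are used

    if len(set(subsampling)) != len(subsampling):
        raise ValueError(f"Duplicate indices in subsampling: {subsampling}")

    if any(i < 0 or i >= batch_size for i in subsampling):
        raise ValueError(f"Indices must lie in [0, {batch_size}): {subsampling}")


check_subsampling([0, 2], batch_size=4)  # valid, passes silently
```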
[DOC] Set up `sphinx` and RTD

This PR adds a scaffold for the doc at https://vivit.readthedocs.io/en/latest/. Code examples are integrated via sphinx-gallery (I added a preliminary logo). Pull requests are built by the CI.

To build the docs, run `make docs`. You need to install the dependencies first, for example using `pip install -e .[docs]`.

opened by f-dangel 1
Calculate Parameter Space Values of GGN Eigenvectors

The docs show how to calculate the Gram matrix eigenvectors, and the paper explains that translating from 'Gram space' to parameter space only requires multiplying by the 'V' matrix.

What's the easiest way of implementing this?
question

opened by lk-wq 1
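On a toy problem where the factor V is available as an explicit matrix, the transformation can be sketched in NumPy as below. This is an illustration of the math under the assumption G = V Vᵀ, not of the library's API; for real networks V is typically too large to build explicitly. A Gram-space eigenvector ẽ with eigenvalue λ maps to the normalized parameter-space eigenvector e = V ẽ / √λ.

```python
import numpy as np

rng = np.random.default_rng(1)
D, NC = 40, 4
V = rng.standard_normal((D, NC))  # stand-in for the GGN factor, G = V V^T

# Eigenpairs of the small Gram matrix (columns of gram_evecs are eigenvectors).
gram_evals, gram_evecs = np.linalg.eigh(V.T @ V)

# Multiply by V and normalize each column by the square root of its eigenvalue.
evecs = (V @ gram_evecs) / np.sqrt(gram_evals)

G = V @ V.T
print(np.allclose(G @ evecs, evecs * gram_evals))  # eigenvector equation holds
print(np.allclose(evecs.T @ evecs, np.eye(NC)))    # columns are orthonormal
```

The division by √λ is what makes the result unit-norm; it is only safe for eigenvalues that are not close to zero.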
Detect loss function's `reduction`, error if unsupported
For now, the library only supports `reduction='mean'`. We rely on the user to use this reduction and point this out in the documentation. It would be better for the library to detect the reduction automatically and raise an error if it is unsupported.

This can be done via a hook into BackPACK.

[ ] Implement hook that determines the loss function reduction during backpropagation

[ ] Integrate the above hook into the `*Computation` classes and raise an exception if the reduction is not supported

[ ] Remove the comments about supported reductions in the documentation

enhancement
opened by f-dangel 0
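The issue proposes a BackPACK hook that runs during backpropagation. As a simpler stand-in for that idea, the sketch below inspects the `reduction` attribute of the loss module directly, which PyTorch loss modules such as `torch.nn.MSELoss` expose. `check_reduction`, `SUPPORTED_REDUCTIONS`, and `DummyLoss` are hypothetical names; the dummy class only keeps the example free of a PyTorch dependency.

```python
SUPPORTED_REDUCTIONS = ("mean",)


def check_reduction(loss_func):
    """Raise if the loss function's reduction is unsupported (hypothetical helper)."""
    reduction = getattr(loss_func, "reduction", None)
    if reduction not in SUPPORTED_REDUCTIONS:
        raise NotImplementedError(
            f"Unsupported reduction {reduction!r}; "
            f"supported: {SUPPORTED_REDUCTIONS}"
        )
    return reduction


class DummyLoss:
    """Stand-in for a loss module with a `reduction` attribute."""

    def __init__(self, reduction):
        self.reduction = reduction


check_reduction(DummyLoss("mean"))  # passes and returns "mean"
```

Unlike the hook proposed in the issue, this check requires access to the loss module before the backward pass.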

Releases (1.0.0)

1.0.0 (Jun 22, 2022)

First public release. Details about future releases will be documented in the changelog.

Owner

Felix Dangel

Machine Learning PhD student at the University of Tübingen and the Max Planck Institute for Intelligent Systems.

GitHub Repository https://arxiv.org/abs/2106.02624

