Memoized coduals - Shows that it is possible to implement reverse mode autodiff using a variation on the dual numbers called the codual numbers

Overview

The dual numbers can do efficient autodiff!

The codual numbers are a simple method of doing automatic differentiation in reverse mode. They contrast with the dual numbers which provide an easy way of doing automatic differentiation in forward mode. The difference between the two modes is that sometimes one is faster than the other.

The folklore appears to be that forward mode autodiff is easy to implement because it can be done using the beautiful algebra of dual numbers, while the same is assumed to not be the case for reverse mode. This repository presents a counterargument that a variant of the dual numbers – called the codual numbers – can be used to represent an implementation of reverse mode autodiff that is just as elegant and terse as can be done for forward mode. This idea was first suggested by Sandro Magi (pseudonym: Naasking).

This implementation of the codual numbers differs from Sandro Magi’s by using simple memoisation to eliminate the exponential worst-case behaviour he encountered. In Magi’s original implementation, this idea seems obscured, largely because the code was more effectful and therefore the opportunity for memoisation was less apparent. The memoisation is achieved using only one additional line of code!

This implementation should be simpler and more transparent than Magi’s, I hope. It also suggests that Magi’s reasoning behind the term “codual numbers” is perhaps misleading.

Definition of dual number and codual number

The codual numbers are the set

\mathbb R \times \mathbb R,

while the codual numbers are a subset of

\mathbb R \times \mathbb R ^ {\mathbb R}

where the second component is always a linear map.

A notation that’s used to write a dual number is a + b \varepsilon, which stands for (a,b). Formally, \varepsilon^2 = 0 while \varepsilon \neq 0.

The codual numbers may be represented using exactly the same notation as the dual numbers. They are no different than the dual numbers, except in how they’re represented on a computer! Using lambda calculus notation (which I assume you are familiar with) any dual number (a,b) can be turned into the codual number (a, \lambda k. \,kb), and conversely every codual number (a,B) can be turned into the dual number (a,B(1)). The difference is merely one of data structure; we need a closure to represent the codual numbers.

The definition of an operation on the codual numbers can be inferred from its definition on the dual numbers. We demonstrate this using multiplication. For dual numbers, we may define multiplication by:

(a,a') \times (b,b') = (ab, ab' + ba').

For the codual numbers, we may use the correspondence (a,b') \mapsto (a, \lambda k. \,kb) to get:

(a,A) \times (b,B) = (ab, \lambda k. \,k\cdot(a\cdot B(1) + b\cdot A(1))),

where by “\cdot”, we mean multiplication of real numbers. Using the fact that A and B are linear maps, we can rearrange this to:

(a,A) \times (b,B) = (ab, \lambda k. \,B(ak) + A(bk))).

This is precisely how we define multiplication of codual numbers in the code.

Relationship with other autodiff strategies

It appears that there are three ways of doing reverse-mode autodiff, which correspond directly to the three “stages” of solving a problem using dynamic programming. See the table below:

Idea Example Corresponding autodiff algorithm
Unmemoised recursion Exhibit A Unmemoised coduals
Memoised recursion, or
top-down dynamic programming
Exhibit B Memoised coduals
Bottom-up dynamic programming Exhibit C Tape-based autodiff

This suggests that the tape-based approach can be derived from the coduals.

Exhibit A:

def fib(n):
    if n == 0 or n == 1:
        return n
    else:
        return fib(n-1) + fib(n-2)

Exhibit B:

from functools import cache

@cache
def fib(n):
    if n == 0 or n == 1:
        return n
    else:
        return fib(n-1) + fib(n-2)

Exhibit C:

def fib(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a
Owner
wlad
wlad
Code for the paper "Graph Attention Tracking". (CVPR2021)

SiamGAT 1. Environment setup This code has been tested on Ubuntu 16.04, Python 3.5, Pytorch 1.2.0, CUDA 9.0. Please install related libraries before r

122 Dec 24, 2022
Retinal Vessel Segmentation with Pixel-wise Adaptive Filters (ISBI 2022)

Retinal Vessel Segmentation with Pixel-wise Adaptive Filters (ISBI 2022) Introdu

anonymous 14 Oct 27, 2022
Official Code for VideoLT: Large-scale Long-tailed Video Recognition (ICCV 2021)

Pytorch Code for VideoLT [Website][Paper] Updates [10/29/2021] Features uploaded to Google Drive, for access please send us an e-mail: zhangxing18 at

Skye 26 Sep 18, 2022
Official implementation of Deep Reparametrization of Multi-Frame Super-Resolution and Denoising

Deep-Rep-MFIR Official implementation of Deep Reparametrization of Multi-Frame Super-Resolution and Denoising Publication: Deep Reparametrization of M

Goutam Bhat 39 Jan 04, 2023
MANO hand model porting for the GraspIt simulator

Learning Joint Reconstruction of Hands and Manipulated Objects - ManoGrasp Porting the MANO hand model to GraspIt! simulator Yana Hasson, Gül Varol, D

Lucas Wohlhart 10 Feb 08, 2022
Code for pre-training CharacterBERT models (as well as BERT models).

Pre-training CharacterBERT (and BERT) This is a repository for pre-training BERT and CharacterBERT. DISCLAIMER: The code was largely adapted from an o

Hicham EL BOUKKOURI 31 Dec 05, 2022
Perform zero-order Hankel Transform for an 1D array (float or real valued).

perform zero-order Hankel Transform for an 1D array (float or real valued). An discrete form of Parseval theorem is guaranteed. Suit for iterative problems.

1 Jan 17, 2022
Tensorflow implementation of the paper "HumanGPS: Geodesic PreServing Feature for Dense Human Correspondences", CVPR 2021.

HumanGPS: Geodesic PreServing Feature for Dense Human Correspondences Tensorflow implementation of the paper "HumanGPS: Geodesic PreServing Feature fo

Google Interns 50 Dec 21, 2022
From a body shape, infer the anatomic skeleton.

OSSO: Obtaining Skeletal Shape from Outside (CVPR 2022) This repository contains the official implementation of the skeleton inference from: OSSO: Obt

Marilyn Keller 166 Dec 28, 2022
Modified prey-predator system - Modified prey–predator model describes the rate of change for each species by adding coupling terms.

Modified prey-predator system We aim to study the behaviors of the modified prey–predator model and establish the effects of several parameters that p

Seoyoung Oh 1 Jan 02, 2022
A python implementation of Physics-informed Spline Learning for nonlinear dynamics discovery

PiSL A python implementation of Physics-informed Spline Learning for nonlinear dynamics discovery. Sun, F., Liu, Y. and Sun, H., 2021. Physics-informe

Fangzheng (Andy) Sun 8 Jul 13, 2022
Train a state-of-the-art yolov3 object detector from scratch!

TrainYourOwnYOLO: Building a Custom Object Detector from Scratch This repo let's you train a custom image detector using the state-of-the-art YOLOv3 c

AntonMu 616 Jan 08, 2023
2D Time independent Schrodinger equation solver for arbitrary shape of well

Schrodinger Well Python Python solver for timeless Schrodinger equation for well with arbitrary shape https://imgur.com/a/jlhK7OZ Pictures of circular

WeightAn 24 Nov 18, 2022
Painting app using Python machine learning and vision technology.

AI Painting App We are making an app that will track our hand and helps us to draw from that. We will be using the advance knowledge of Machine Learni

Badsha Laskar 3 Oct 03, 2022
PyTorch implementation of MulMON

MulMON This repository contains a PyTorch implementation of the paper: Learning Object-Centric Representations of Multi-object Scenes from Multiple Vi

NanboLi 16 Nov 03, 2022
Code for "Discovering Non-monotonic Autoregressive Orderings with Variational Inference" (paper and code updated from ICLR 2021)

Discovering Non-monotonic Autoregressive Orderings with Variational Inference Description This package contains the source code implementation of the

Xuanlin (Simon) Li 10 Dec 29, 2022
This is an official implementation for "DeciWatch: A Simple Baseline for 10x Efficient 2D and 3D Pose Estimation"

DeciWatch: A Simple Baseline for 10× Efficient 2D and 3D Pose Estimation This repo is the official implementation of "DeciWatch: A Simple Baseline for

117 Dec 24, 2022
Multi-modal Content Creation Model Training Infrastructure including the FACT model (AI Choreographer) implementation.

AI Choreographer: Music Conditioned 3D Dance Generation with AIST++ [ICCV-2021]. Overview This package contains the model implementation and training

Google Research 365 Dec 30, 2022
SmartSim Infrastructure Library.

Home Install Documentation Slack Invite Cray Labs SmartSim SmartSim makes it easier to use common Machine Learning (ML) libraries like PyTorch and Ten

Cray Labs 139 Jan 01, 2023
Official pytorch implementation of the IrwGAN for unaligned image-to-image translation

IrwGAN (ICCV2021) Unaligned Image-to-Image Translation by Learning to Reweight [Update] 12/15/2021 All dataset are released, trained models and genera

37 Nov 09, 2022