Efficiently computes derivatives of numpy code.

Overview

Note: Autograd is still being maintained but is no longer actively developed. The main developers (Dougal Maclaurin, David Duvenaud, Matt Johnson, and Jamie Townsend) are now working on JAX, with Dougal and Matt working on it full-time. JAX combines a new version of Autograd with extra features such as jit compilation.

Autograd can automatically differentiate native Python and Numpy code. It can handle a large subset of Python's features, including loops, ifs, recursion and closures, and it can even take derivatives of derivatives of derivatives. It supports reverse-mode differentiation (a.k.a. backpropagation), which means it can efficiently take gradients of scalar-valued functions with respect to array-valued arguments, as well as forward-mode differentiation, and the two can be composed arbitrarily. The main intended application of Autograd is gradient-based optimization. For more information, check out the tutorial and the examples directory.

Example use:

>>> import autograd.numpy as np  # Thinly-wrapped numpy
>>> from autograd import grad    # The only autograd function you may ever need
>>>
>>> def tanh(x):                 # Define a function
...     y = np.exp(-2.0 * x)
...     return (1.0 - y) / (1.0 + y)
...
>>> grad_tanh = grad(tanh)       # Obtain its gradient function
>>> grad_tanh(1.0)               # Evaluate the gradient at x = 1.0
0.41997434161402603
>>> (tanh(1.0001) - tanh(0.9999)) / 0.0002  # Compare to finite differences
0.41997434264973155

We can continue to differentiate as many times as we like, and use numpy's vectorization of scalar-valued functions across many different input values:

>>> from autograd import elementwise_grad as egrad  # for functions that vectorize over inputs
>>> import matplotlib.pyplot as plt
>>> x = np.linspace(-7, 7, 200)
>>> plt.plot(x, tanh(x),
...          x, egrad(tanh)(x),                                     # first  derivative
...          x, egrad(egrad(tanh))(x),                              # second derivative
...          x, egrad(egrad(egrad(tanh)))(x),                       # third  derivative
...          x, egrad(egrad(egrad(egrad(tanh))))(x),                # fourth derivative
...          x, egrad(egrad(egrad(egrad(egrad(tanh)))))(x),         # fifth  derivative
...          x, egrad(egrad(egrad(egrad(egrad(egrad(tanh))))))(x))  # sixth  derivative
>>> plt.show()

See the tanh example file for the code.
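
Beyond grad and elementwise_grad, autograd provides convenience wrappers such as jacobian and hessian built on the same machinery. A quick sketch, continuing the session above (shapes shown rather than full numerical output):

>>> from autograd import jacobian, hessian
>>> def f(x):
...     return np.sum(np.sin(x) ** 2)
...
>>> x = np.array([0.1, 0.2, 0.3])
>>> hessian(f)(x).shape          # Hessian of a scalar-valued function
(3, 3)
>>> def g(x):
...     return np.array([x[0] * x[1], np.exp(x[2])])
...
>>> jacobian(g)(x).shape         # Jacobian of a vector-valued function
(2, 3)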

Documentation

You can find a tutorial here.

End-to-end examples

How to install

Just run pip install autograd

Authors

Autograd was written by Dougal Maclaurin, David Duvenaud, Matt Johnson, Jamie Townsend and many other contributors. The package is currently still being maintained, but is no longer actively developed. Please feel free to submit any bugs or feature requests. We'd also love to hear about your experiences with autograd in general. Drop us an email!

We want to thank Jasper Snoek and the rest of the HIPS group (led by Prof. Ryan P. Adams) for helpful contributions and advice; Barak Pearlmutter for foundational work on automatic differentiation and for guidance on our implementation; and Analog Devices Inc. (Lyric Labs) and Samsung Advanced Institute of Technology for their generous support.

Comments
  • Forward mode

    This probably isn't ready to be pulled into the master branch yet, but I thought I'd submit a pr in case you want to track progress.

    TODO:

    • [x] Implement the rest of the numpy grads
    • [x] Tests for remaining grads
    • [x] Write a jacobian_vector_product convenience wrapper
    • [x] Update the hessian_vector_product wrapper to use forward mode
    • [x] Ensure that nodes with only forward mode grads don't refer to their parents (so that garbage collection can work)
    • [ ] Implement a jacobian matrix product
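
    For context, the hessian_vector_product wrapper being updated here can also be written reverse-over-reverse with grad alone; a minimal sketch of that baseline (not the code in this PR):

    import autograd.numpy as np
    from autograd import grad

    def hvp(f):
        # Hessian-vector product as the gradient of x -> dot(grad(f)(x), v)
        def hvp_fn(x, v):
            return grad(lambda x: np.dot(grad(f)(x), v))(x)
        return hvp_fn

    f = lambda x: np.sum(np.sin(x))
    x = np.array([0.5, 1.0, 1.5])
    v = np.array([1.0, 0.0, 0.0])
    print(hvp(f)(x, v))   # equals H(x) @ v, here -sin(x) * v
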
    opened by j-towns 83
  • Documenting CuPy wrapper progress

    Starting this issue to document progress on wrapping CuPy.

    • [x] import autograd.cupy as cp
    • [x] instantiate arrays from scalars, lists, and tuples.
    cp.array(1)
    cp.array([1, 2])
    cp.array([1, 3]) + cp.array([1, 1])
    
    • [x] check that gradients work
    import autograd.cupy as cp
    from autograd import elementwise_grad as egrad
    
    def f(x):
        return cp.sin(x)
    
    def g(x):
        return x + 2
    
    df = egrad(f)
    dg = egrad(g)
    
    a = cp.array([1, 1])
    
    print(f(a))
    print(df(a))
    
    print(g(a))
    print(dg(a))
    
    
    • [x] Check that higher derivatives work.
    import autograd.cupy as cp
    from autograd import elementwise_grad as egrad
    import numpy as np
    
    a = cp.arange(-2 * np.pi, 2 * np.pi, 0.01)
    
    def sin(x):
        return cp.sin(x)
    
    dsin = egrad(sin)
    ddsin = egrad(dsin)
    
    sin(a)
    dsin(a)
    ddsin(a)
    
    • [ ] Fix ValueError: object __array__ method not producing an array.
    • [ ] Run tests for all of the CuPy wrapped functions.
    opened by ericmjl 25
  • Decreasing autograd memory usage

    I don't mean "memory leak" in terms of unreachable memory after the Python process quits, I mean memory that is being allocated in the backwards pass, when it should be being freed. I mentioned this problem in #199 , but I think it should be opened as an issue.

    For a simple function

    import autograd.numpy as np
    from autograd import grad
    
    def F(x,z):
        for i in range(100):
            z = np.dot(x,z)
        return np.sum(z)
    dF = grad(F)
    

    and a procedure to measure memory usage

    from memory_profiler import memory_usage
    def make_data():
        np.random.seed(0)
        D = 1000
        x = np.random.randn(D,D)
        x = np.dot(x,x.T)
        z = np.random.randn(D,D)
        return x,z
    
    def m():
        from time import sleep
        x,z = make_data()
        gx = dF(x,z)
        sleep(0.1)
        return gx
    
    mem_usage = np.array(memory_usage(m,interval=0.01))
    mem_usage -= mem_usage[0]
    

    and a manual gradient of the same function

    def grad_dot_A(g,A,B):
        ga = np.dot(g,B.T)
        ga = np.reshape(ga,np.shape(A))
        return ga
    
    def grad_dot_B(g,A,B):
        gb = np.dot(A.T,g)
        gb = np.reshape(gb, np.shape(B))
        return gb
    
    def dF(x, z):
        z_stack = []
        for i in list(range(100)):
            z_stack.append(z)
            z = np.dot(x, z)
        retval = np.sum(z)
    
        # Begin Backward Pass
        g_retval = 1
        g_x = 0
    
        # Reverse of: retval = np.sum(z)
        g_z = np.full(np.shape(z), g_retval)  # autograd's repeat_to_match_shape helper, inlined here
        for i in reversed(list(range(100))):
    
            # Reverse of: z = np.dot(x, z)
            z = z_stack.pop()
            tmp_g0 = grad_dot_A(g_z, x, z)
            tmp_g1 = grad_dot_B(g_z, x, z)
            g_z = 0
            g_x += tmp_g0
            g_z += tmp_g1
        return g_x
    

    I get the following memory usage profile:

    [memory usage plot]

    If I replace the dot gradient with the ones used in the manual code, I get the same memory profile, nothing improves.

    If I replace the dot product with element-wise multiply, I get a different memory profile, but still not what I would expect:

    [memory usage plot]

    I would love to help figure this out, but I'm not sure where to start. First thing is of course to document the problem.

    opened by alexbw 21
  • Memory issue?

    I've run into an issue with large matrices and memory. There seem to be two problems:

    1. Memory isn't being released on successive calls of grad. e.g.
    import autograd.numpy as np
    from autograd import grad

    na = np.newaxis  # shorthand used below

    a = 10000
    b = 10000
    A = np.random.randn(a)
    B = np.random.randn(b)
    
    def fn(x):
        M = A[:, na] + x[na, :]
        return M[0, 0]
    
    g = grad(fn)
    
    for i in range(100):
        g(B)
    

    is ramping up memory on each iteration.

    2. Memory isn't being released during the backwards pass, e.g.
    k = 10
    def fn(x):
        res = 0
        for i in range(k):
            res = res + np.sum(x)
        return res
    g = grad(fn)
    b = 200000
    g(np.random.randn(b))
    

    This seems to scale in memory (for each call) as O(k), which I don't think is the desired behaviour. For b=150000 this effect does not happen, however.

    opened by hughsalimbeni 16
  • Experimental reorganization

    This is mostly just a cosmetic reorganization. The main motivation was to expose a well-defined API for extending Autograd in a single module, extend.py. The only functions/classes that we should need for wrapping a numerical library are primitive/defvjp*/defjvp* for defining new primitive functions, Box/VSpace for defining new types, and SparseObject. In practice, we also use functions like vspace, getval and isbox, but we should try to avoid them.

    This PR includes the commits from #293, so we need to be happy with the performance before merging with dev-1.2. I made a small optimization to defvjp which might help.

    While we're renaming things, I wouldn't mind finally changing our JVP/VJP convention to something more obvious. JO/JTO? fwd_op/rev_op?
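
    For a concrete picture of that surface, defining a new primitive and its vector-Jacobian product would look roughly like this (a sketch using primitive and defvjp from autograd.extend; logsumexp is just an illustrative example, not part of this PR):

    import autograd.numpy as np
    from autograd.extend import primitive, defvjp
    from autograd import grad

    @primitive
    def logsumexp(x):
        # Treated as a black box by autograd; only the VJP below matters for gradients.
        return np.log(np.sum(np.exp(x)))

    defvjp(logsumexp, lambda ans, x: lambda g: g * np.exp(x - ans))

    print(grad(lambda x: logsumexp(x) ** 2)(np.array([0.0, 1.0, 2.0])))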

    opened by dougalm 14
  • Experiment: combo VJPs

    I'm not recommending we merge this yet! It's just an experiment for now, and I'm opening this PR to track progress.

    Check out the change to backward_pass. It seems like a good idea to allow users to write VJP functions that evaluate the VJP wrt multiple positional arguments simultaneously, mainly because that can allow for work sharing (instead of always having separate calls).

    However, the implementation mechanism here seems to hurt performance a lot:

       before     after       ratio
      [fb7eccf6] [c163e986]
    +  170.82μs   266.18μs      1.56  bench_core.time_long_backward_pass
    +  536.46μs   827.32μs      1.54  bench_core.time_long_grad
    +    5.69μs     8.66μs      1.52  bench_core.time_exp_primitive_call_boxed
    +  312.14μs   446.58μs      1.43  bench_core.time_long_forward_pass
    +     2.02y      2.87y      1.42  bench_rnn.RNNSuite.peakmem_manual_rnn_grad
    +  129.84μs   178.60μs      1.38  bench_numpy_vjps.time_tensordot_1_1
    +   10.75μs    14.28μs      1.33  bench_core.time_short_backward_pass
    +  101.26μs   133.52μs      1.32  bench_numpy_vjps.time_tensordot_0_0
    +  274.72ms   349.04ms      1.27  bench_core.time_fan_out_fan_in_forward_pass
    +   67.22μs    82.32μs      1.22  bench_numpy_vjps.time_tensordot_0
    +   64.15μs    78.47μs      1.22  bench_numpy_vjps.time_dot_0
    +  447.36ms   545.34ms      1.22  bench_core.time_fan_out_fan_in_grad
    +  116.58μs   136.20μs      1.17  bench_numpy_vjps.time_dot_1_2
    +  126.47μs   143.47μs      1.13  bench_numpy_vjps.time_tensordot_1_2
    +  281.75ms   319.34ms      1.13  bench_core.time_fan_out_fan_in_backward_pass
    +  127.47μs   144.33μs      1.13  bench_numpy_vjps.time_tensordot_1_0
    +  118.83μs   134.47μs      1.13  bench_numpy_vjps.time_dot_1_0
    +   23.82μs    26.71μs      1.12  bench_core.time_short_forward_pass
    +  121.13μs   135.21μs      1.12  bench_numpy_vjps.time_dot_1_1
    +  102.82μs   114.14μs      1.11  bench_numpy_vjps.time_tensordot_0_2
    +  102.39μs   113.46μs      1.11  bench_numpy_vjps.time_tensordot_0_1
    +     1.93y      2.13y      1.11  bench_rnn.RNNSuite.peakmem_rnn_grad
    +   69.91μs    77.02μs      1.10  bench_numpy_vjps.time_tensordot_1
    -     2.67s      2.23s      0.83  bench_rnn.RNNSuite.time_rnn_grad
    
    SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
    
    opened by mattjj 14
  • Simplify util.flatten

    Use the vspace flatten functionality for util.flatten. This enables flattening of complex values which wasn't previously possible.

    Am I missing some reason why this isn't ok? The only difference in functionality which I can see is that with this change, calling flatten on scalars will give an unflatten which returns scalar values wrapped in an array. In particular:

    On master

    In [1]: from autograd.util import flatten
    
    In [2]: v, unflatten = flatten(3.)
    
    In [3]: v
    Out[3]: array([ 3.])
    
    In [4]: unflatten(v)
    Out[4]: 3.0
    

    With this change:

    In [1]: from autograd.util import flatten
    
    In [2]: v, unflatten = flatten(3.)
    
    In [3]: v
    Out[3]: array([ 3.])
    
    In [4]: unflatten(v)
    Out[4]: array(3.0)
    

    Is this a big deal? This type of behaviour occurs in other situations where vspace is applied to scalars, for example:

    In [7]: from autograd import grad
    
    In [8]: def f(x):
       ...:     return 3.
       ...:
    
    In [9]: grad(f)(2.)
    /Users/jamietownsend/dev/autograd/autograd/core.py:16: UserWarning: Output seems independent of input.
      warnings.warn("Output seems independent of input.")
    Out[9]: array(0.0)
    

    and I don't think that's really a problem.

    opened by j-towns 14
  • Dynd support

    Dynd is a next-generation array library for Python, with lots of cool features like JIT compilation, heterogeneous data, user-defined data types, missing-data support, type checking, etc. https://speakerdeck.com/izaid/dynd

    I wonder if there is interest from either the HIPS or Dynd devs @izaid , @insertinterestingnamehere or @mwiebe in integrating this with Dynd (or at least leaving in hooks for future functionality or cooperation later on?)

    Both are cool libraries, and I would hate for the Python community to have to choose between them for projects.

    enhancement 
    opened by datnamer 14
  • Inconsistent handling of complex functions

    My complex analysis is a bit rusty, and I'm getting confused by the handling of complex functions. Many functions are differentiable as complex functions (i.e. they are holomorphic) and their complex derivatives are implemented in autograd.

    However, there also seem to be functions, like abs, var, angle and others, which are not differentiable as complex functions but also have derivatives implemented. I'm assuming these derivatives treat complex inputs as if they were 2D real-valued vectors? This seems inconsistent to me.

    Users can fairly easily replicate the second behaviour without these derivatives being defined, by manually decomposing their numbers into real and imaginary parts, so I would tentatively propose removing these pseudo-derivatives...

    Apologies if I'm making some dumb mistake here...

    opened by j-towns 13
  • Further correcting grad_eigh to support hermitian matrices and the UPLO kwarg properly

    Edit: as discussed in the comments below, the issue with the complex eigenvectors is the gauge, which is arbitrary. However, this updated code should work for complex-valued matrices and functions that do not depend on the gauge. So for example, the test for the complex case uses np.abs(v).

    What this update does:

    • fix the vjp computation for numpy.linalg.eigh in accordance with the behavior of the function, which always takes only the upper/lower part of the matrix
    • fix the tests to take random matrices as opposed to random symmetric matrices
    • fix the computation to work for Hermitian matrices as per this pull request, on which I've built

    However:

    • the gradient for Hermitian matrices works only for the eigenvalues and not (always) for the eigenvectors
    • so I've added a test, but I take a random complex matrix and check only the eigenvalue gradient flow
    • the problem with eigenvectors probably has to do with their being complex; this has not been dealt with anywhere that I looked (PyTorch, TensorFlow, or the original reference https://people.maths.ox.ac.uk/gilesm/files/NA-08-01.pdf , where the eigenvectors are also assumed to be real in the tests)

    The gradient for the eigenvectors does not pass a general test. However, it works in some cases. For example, this code

    import autograd.numpy as npa
    from autograd import grad
    
    def fn(a):
        # Define an array with some random operations 
        mat = npa.array([[(1+1j)*a, 2, a], 
                        [1j*a, 2 + npa.abs(a + 1j), 1], 
                        [npa.conj(a), npa.exp(a), npa.abs(a)]])
        [eigs, vs] = npa.linalg.eigh(mat)
        return npa.abs(vs[0, 0])
    
    a = 2.1 + 1.1j # Some random test value
    
    # Compute the numerical gradient of fn(a)
    grad_num = (fn(a + 1e-5)-fn(a))/1e-5 - 1j*(fn(a + 1e-5*1j)-fn(a))/1e-5
    
    print('Autograd gradient:  ', grad(fn)(a))
    print('Numerical gradient: ', grad_num)
    print('Difference:         ', npa.linalg.norm(grad(fn)(a)-grad_num))
    

    returns a difference smaller than 1e-6 for any individual component of vs that is put in the return statement. However, it breaks for a more complicated function, e.g. return npa.abs(vs[0, 0] + vs[1, 1]).

    It would be great if someone can address this further. Still, for now this PR is a significant improvement in the behavior of the linalg.eigh function.

    opened by momchilmm 12
  • Calling `np.array(..., np.float64)` can fail

    This works:

    import autograd.numpy as np
    from autograd import jacobian
    
    th = np.array([1., 2., 3., 4.])
    A = lambda th: [ [th[0], th[1]], [th[2], th[3]] ]
    B = lambda th: np.array(A(th), np.float64)
    jacobian(B)(th)
    

    But this does not:

    import autograd.numpy as np
    from autograd import jacobian
    
    th = np.array([1., 2., 3., 4])
    A = lambda th: [ [th[0], th[1]], [th[2], th[3]] ]
    B = lambda f: lambda th: np.array(f(th), np.float64)
    C = B(A)
    jacobian(C)(th)
    

    throwing AutogradHint: This error *might* be caused by assigning into arrays, which autograd doesn't support.

    I have a list of functions such as A() that return lists. What I want to do is wrap each of them in a function such as B() so that I get a numpy array back. Can I achieve this another way?
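
    One possible workaround (a sketch, not from this thread): build the array with autograd-friendly operations such as np.reshape or np.concatenate applied to th directly, instead of passing a nested Python list through np.array inside the wrapper. For the 2x2 case above, for example:

    import autograd.numpy as np
    from autograd import jacobian

    th = np.array([1., 2., 3., 4.])

    # Hypothetical alternative to B(A): reshape th into the 2x2 matrix directly.
    C = lambda th: np.reshape(th, (2, 2))
    print(jacobian(C)(th))   # shape (2, 2, 4)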

    opened by konstunn 12
  • Is it possible to see gradient function?

    Hi, when I use autograd, is it possible to see its gradient function? Or, in other words, is it possible to see the derivative of that function? Or is it possible to see the computational graph?

    For example, I want to see the grad_tanh function:

    import autograd.numpy as np  # Thinly-wrapped numpy
    from autograd import grad    # The only autograd function you may ever need
    
    def tanh(x):                 # Define a function
           y = np.exp(-2.0 * x)
           return (1.0 - y) / (1.0 + y)
    
    grad_tanh = grad(tanh)       # Obtain its gradient function
    

    Thank you

    opened by Samuel-Bachorik 0
  • Gradient becomes NaN for zero-valued input

    The following code is not working:

    import autograd.numpy as np
    from autograd import grad

    def loss(x):
        return np.linalg.norm(x) ** 2

    x = np.zeros([3])
    a = grad(loss)(x)
    print(a)

    Error Message:

    /Users/.local/lib/python3.6/site-packages/autograd/numpy/linalg.py:100: RuntimeWarning: invalid value encountered in double_scalars
      return expand(g / ans) * x
    array([nan, nan, nan])
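
    A possible workaround (a sketch, not from this thread): compute the squared norm without going through np.linalg.norm, whose VJP divides by the norm and hence produces NaN at zero:

    import autograd.numpy as np
    from autograd import grad

    def loss(x):
        return np.sum(x ** 2)   # same value as np.linalg.norm(x) ** 2, but smooth at x = 0

    x = np.zeros([3])
    print(grad(loss)(x))        # [0. 0. 0.]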

    opened by Yuhang-7 1
  • support for Jax-like custom forward pass definition?

    Is there a way to define a custom forward pass, like in jax, where one can output a residual that may be used by the backward pass?

    For example, is the following example (from the Jax docs) implementable in autograd?

    import jax.numpy as jnp
    from jax import custom_vjp
    
    @custom_vjp
    def f(x, y):
      return jnp.sin(x) * y
    
    def f_fwd(x, y):
      # Returns primal output and residuals to be used in backward pass by f_bwd.
      return f(x, y), (jnp.cos(x), jnp.sin(x), y)
    
    def f_bwd(res, g):
      cos_x, sin_x, y = res # Gets residuals computed in f_fwd
      return (cos_x * g * y, sin_x * g)
    
    f.defvjp(f_fwd, f_bwd)
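
    For reference, roughly the same effect is available in autograd through primitive and defvjp from autograd.extend: each VJP maker closes over the primal inputs (and the answer), which plays the role of JAX's residuals. A sketch, not an exact equivalent:

    import autograd.numpy as np
    from autograd.extend import primitive, defvjp
    from autograd import grad

    @primitive
    def f(x, y):
        return np.sin(x) * y

    # One VJP per positional argument; anything needed in the backward pass
    # (cos(x), sin(x), y) is closed over or recomputed here.
    defvjp(f,
           lambda ans, x, y: lambda g: g * np.cos(x) * y,   # w.r.t. x
           lambda ans, x, y: lambda g: g * np.sin(x))       # w.r.t. y

    print(grad(f, 0)(0.3, 2.0))   # 2 * cos(0.3)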
    
    opened by tylerflex 0
  • Evaluating a section of a jacobian

    Let's say that I have some function f(x) that takes a vector x and returns a vector. I can evaluate the Jacobian of this function fairly simply, as demonstrated below.

    from autograd import numpy as np
    import autograd as ag
    def f(x):
        return np.array([x.sum(),(x[:3]**2).sum(),np.log(np.exp(x).sum())])
    xtest=np.array([0,.5,.3,.2])
    print(f(xtest))
    
    print(ag.jacobian(f)(xtest))
    
    
    

    My question is whether there's some way of evaluating only some columns of this Jacobian. For example, let's say I only wanted the first and last columns of it. So far I haven't found any way of evaluating this more efficiently than just evaluating the whole Jacobian and throwing some away. If anyone can help, please let me know!
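
    One way to get just particular columns (a sketch with a hypothetical wrapper f_sub, not a built-in feature) is to hold the other inputs fixed and differentiate only with respect to the entries you care about. Note that with reverse-mode jacobian the savings are modest, since the cost scales with the number of outputs rather than inputs:

    from autograd import numpy as np
    import autograd as ag

    # Reusing f and xtest as defined above; columns 0 and 3 of the full 3x4 Jacobian.
    def f_sub(x_sub):
        x_full = np.concatenate([x_sub[:1], xtest[1:3], x_sub[1:]])
        return f(x_full)

    print(ag.jacobian(f_sub)(xtest[[0, 3]]))   # shape (3, 2)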

    opened by darcykenworthy 0
  • Bug when raising zero to powers?

    Consider these two seemingly equivalent functions:

    def fa(x):
        return x ** 1.5
    
    def fb(x):
        return x * x ** 0.5
    

    We can see that one of them is differentiated fine, while the other produces warnings and NaNs at zero:

    >>> [ga, gb] = map(autograd.grad, [fa, fb])
    >>> print(ga(2.), ga(0.))
    2.121320343559643 0.0
    >>> print(gb(2.), gb(0.))
    .../autograd/numpy/numpy_vjps.py:59: RuntimeWarning: divide by zero encountered in power
      lambda ans, x, y : unbroadcast_f(x, lambda g: g * y * x ** anp.where(y, y - 1, 1.)),
    .../autograd/numpy/numpy_vjps.py:59: RuntimeWarning: invalid value encountered in double_scalars
      lambda ans, x, y : unbroadcast_f(x, lambda g: g * y * x ** anp.where(y, y - 1, 1.)),
    2.121320343559643 nan
    

    I'm not sure whether this is a proper fix, but if in the defvjp call of anp.power we change anp.where(y, y - 1, 1.) to anp.where(x, anp.where(y, y - 1, 1.), 1.) then gb(0.) does produce the same result as ga(0.)
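
    Concretely, the suggested change amounts to redefining the VJPs of anp.power with an extra guard on x, along these lines (a sketch of the proposal, assuming defvjp and unbroadcast_f can be imported as shown; the y-argument VJP is restated as well so both arguments remain covered):

    import autograd.numpy as anp
    from autograd.extend import defvjp
    from autograd.numpy.numpy_vjps import unbroadcast_f

    # Guard on x as well, so that 0 ** (y - 1) is never formed.
    defvjp(anp.power,
           lambda ans, x, y: unbroadcast_f(x, lambda g:
               g * y * x ** anp.where(x, anp.where(y, y - 1, 1.), 1.)),
           lambda ans, x, y: unbroadcast_f(y, lambda g:
               g * anp.log(anp.where(x, x, 1.)) * x ** y))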

    I concede that 0 ** -0.5 and ZeroDivisionError in general is a delicate topic. But my fix still seems consistent to me. 🤷‍♂️

    opened by yairchu 0
  • ModuleNotFoundError

    Hi, I failed to run autograd's test cases due to a ModuleNotFoundError.

    [screenshot of the ModuleNotFoundError traceback]

    I tried to install all the requirements files, but some packages were still missing. Could you complete the requirements file or suggest some other solutions?

    Thanks for your help. Best, SmartPycg

    opened by SmartPycg 0
Releases: 1.0
Owner
Formerly: Harvard Intelligent Probabilistic Systems Group -- Now at Princeton
Ryan Adams' research group. Formerly at Harvard, now at Princeton. New Github repositories here: https://github.com/PrincetonLIPS