Pytorch port of Google Research's LEAF Audio paper

Overview

leaf-audio-pytorch

Pytorch port of Google Research's LEAF Audio paper published at ICLR 2021.

This port is not completely finished, but the Leaf() frontend is fully ported over, functional and validated to have similar outputs to the original tensorflow implementation. A few small things are missing, such as the SincNet and SincNet+ implementations, a few different pooling layers, etc.

PLEASE leave issues, pull requests, comments, or anything you find in using this repository that may be of value to others who will try to use this.

Installation

From the root directory of this repo, run:

pip install -e .

Usage

leaf_audio_pytorch mirrors it's original respository; imports and arguments are the same.

import leaf_audio_pytorch.frontend as frontend

leaf = frontend.Leaf()

Installation for Developing

If you are looking to develop on this repo, the requirements.txt contains everything needed to run the torch and tf implementations of leaf audio simultaneously.

NOTE: There is some weird dependency stuff going on with the original leaf-audio repo. Seems like its a dependency issue with lingvo and waymo-open-dataset. These below commands are a workaround.

Install the packages required:

pip install -r requirements.txt --no-deps

Install the leaf-audio repo from Git SSH:

pip install git+ssh://[email protected]/google-research/leaf-audio.git --no-deps

Then add the leaf_audio_pytorch package as well

python setup.py develop

At this point everything should be good to go! The scripts in test/ contains some testing code to validate the torch implementation mirrors tf.

Some Things to Keep in Mind (PLEASE READ)

  • When writing this port, I ran a debugger of the torch and tf implementations side by side and validated that each layer and operation mirrors the tensorflow implementation (to within a few significant digits, i.e. a tensor's values may variate by 0.001). There is one notable exception: The depthwise convolution within the GaussianLowpass pooling layer has a larger variation in tensor values, but the ported operation still produces similar outputs. I'm not sure why this operation is producing different values, but i'm currently looking into it. Please do your own due diligence in using this port and making sure this works as expected.

  • As of March 29, I finished the initial version of the port, but I have not tested Leaf() in a traning setting yet. Calling .backward() on Leaf() throws no errors, meaning backprop works as expected. However, I do not yet know how this will function during training.

  • As PyTorch and Tensorflow follow different tensor ordering conventions, Leaf() does all of its operations and outputs tensors with channels first.

Reference

All credit and attribution goes to Neil Zeghidour and the Google Research team who wrote the paper and created the Tensorflow implementation.

Please visit their GitHub repository and review their ICLR publication.

Owner
Dennis Fedorishin
UB | Computer Science PhD Candidate
Dennis Fedorishin
Constraint-based geometry sketcher for blender

Constraint-based sketcher addon for Blender that allows to create precise 2d shapes by defining a set of geometric constraints like tangent, distance,

1.7k Dec 31, 2022
PyTorch trainer and model for Sequence Classification

PyTorch-trainer-and-model-for-Sequence-Classification After cloning the repository, modify your training data so that the training data is a .csv file

NhanTieu 2 Dec 09, 2022
Official code for UnICORNN (ICML 2021)

UnICORNN (Undamped Independent Controlled Oscillatory RNN) [ICML 2021] This repository contains the implementation to reproduce the numerical experime

Konstantin Rusch 21 Dec 22, 2022
Code repository for "Reducing Underflow in Mixed Precision Training by Gradient Scaling" presented at IJCAI '20

Reducing Underflow in Mixed Precision Training by Gradient Scaling This project implements the gradient scaling method to improve the performance of m

Ruizhe Zhao 5 Apr 14, 2022
Cmsc11 arcade - Final Project for CMSC11

cmsc11_arcade Final Project for CMSC11 Developers: Limson, Mark Vincent Peñafiel

Gregory 1 Jan 18, 2022
ByteTrack(Multi-Object Tracking by Associating Every Detection Box)のPythonでのONNX推論サンプル

ByteTrack-ONNX-Sample ByteTrack(Multi-Object Tracking by Associating Every Detection Box)のPythonでのONNX推論サンプルです。 ONNXに変換したモデルも同梱しています。 変換自体を試したい方はByteT

KazuhitoTakahashi 16 Oct 26, 2022
Amazing-Python-Scripts - 🚀 Curated collection of Amazing Python scripts from Basics to Advance with automation task scripts.

📑 Introduction A curated collection of Amazing Python scripts from Basics to Advance with automation task scripts. This is your Personal space to fin

Avinash Ranjan 1.1k Dec 29, 2022
Official Pytorch implementation for 2021 ICCV paper "Learning Motion Priors for 4D Human Body Capture in 3D Scenes" and trained models / data

Learning Motion Priors for 4D Human Body Capture in 3D Scenes (LEMO) Official Pytorch implementation for 2021 ICCV (oral) paper "Learning Motion Prior

165 Dec 19, 2022
[arXiv22] Disentangled Representation Learning for Text-Video Retrieval

Disentangled Representation Learning for Text-Video Retrieval This is a PyTorch implementation of the paper Disentangled Representation Learning for T

Qiang Wang 49 Dec 18, 2022
Code of PVTv2 is released! PVTv2 largely improves PVTv1 and works better than Swin Transformer with ImageNet-1K pre-training.

Updates (2020/06/21) Code of PVTv2 is released! PVTv2 largely improves PVTv1 and works better than Swin Transformer with ImageNet-1K pre-training. Pyr

1.3k Jan 04, 2023
This is the official implementation of 3D-CVF: Generating Joint Camera and LiDAR Features Using Cross-View Spatial Feature Fusion for 3D Object Detection, built on SECOND.

3D-CVF This is the official implementation of 3D-CVF: Generating Joint Camera and LiDAR Features Using Cross-View Spatial Feature Fusion for 3D Object

YecheolKim 97 Dec 20, 2022
data/code repository of "C2F-FWN: Coarse-to-Fine Flow Warping Network for Spatial-Temporal Consistent Motion Transfer"

C2F-FWN data/code repository of "C2F-FWN: Coarse-to-Fine Flow Warping Network for Spatial-Temporal Consistent Motion Transfer" (https://arxiv.org/abs/

EKILI 46 Dec 14, 2022
Supporting code for the Neograd algorithm

Neograd This repo supports the paper Neograd: Gradient Descent with a Near-Ideal Learning Rate, which introduces the algorithm "Neograd". The paper an

Michael Zimmer 12 May 01, 2022
KaziText is a tool for modelling common human errors.

KaziText KaziText is a tool for modelling common human errors. It estimates probabilities of individual error types (so called aspects) from grammatic

ÚFAL 3 Nov 24, 2022
Basit bir burç modülü.

Bu modulu burclar hakkinda gundelik bir sekilde bilgi alin diye yaptim ve sizler icin kullanima sunuyorum. Modulun kullanimi asiri basit: Ornek Kullan

Special 17 Jun 08, 2022
In generative deep geometry learning, we often get many obj files remain to be rendered

a python prompt cli script for blender batch render In deep generative geometry learning, we always get many .obj files to be rendered. Our rendered i

Tian-yi Liang 1 Mar 20, 2022
Classifying audio using Wavelet transform and deep learning

Audio Classification using Wavelet Transform and Deep Learning A step-by-step tutorial to classify audio signals using continuous wavelet transform (C

Aditya Dutt 17 Nov 29, 2022
Pytorch and Torch testing code of CartoonGAN

CartoonGAN-Test-Pytorch-Torch Pytorch and Torch testing code of CartoonGAN [Chen et al., CVPR18]. With the released pretrained models by the authors,

Yijun Li 642 Dec 27, 2022
Combine Tacotron2 and Hifi GAN to generate speech from text

EndToEndTextToSpeech Combine Tacotron2 and Hifi GAN to generate speech from text Download weights Hifi GAN - hifi_gan/checkpoint/ : pretrain 2.5M ste

Phạm Quốc Huy 1 Dec 18, 2021
Unofficial implementation of the Involution operation from CVPR 2021

involution_pytorch Unofficial PyTorch implementation of "Involution: Inverting the Inherence of Convolution for Visual Recognition" by Li et al. prese

Rishabh Anand 46 Dec 07, 2022