The official implementation of the Interspeech 2021 paper WSRGlow: A Glow-based Waveform Generative Model for Audio Super-Resolution.

Related tags

Deep LearningWSRGlow
Overview

WSRGlow

The official implementation of the Interspeech 2021 paper WSRGlow: A Glow-based Waveform Generative Model for Audio Super-Resolution. Audio samples can be found here.

Feel free to create issues or send an email to [email protected] if you have problems running the code.

Before running the code, you need to install the dependicies by pip install -r requirements.txt.

The configs for model architecture and training scheme is saved in config.yaml. You can overwrite some of the attributes by adding the --hparams flag when running a command. The general way to run a python script is

python $SRC$ --config $CONFIG$ --hparams $KEY1$=$VALUE1$,$KEY2$=$VALUE2$,...

See hparams.py for more details.

To prepare data

Before training, you need to binarize the data first. The raw wav files should be put in the hparams['raw_data_path']. The binarized data would be put in the hparams['binary_data_path'].

Specifically, for the VCTK corpus, the file structure should be like

.
|--data
    |--raw
        |--VCTK-Corpus
            |--wav48
                |--$WAVS
|--checkpoints
    |--wsrglow
    

where the model checkpoints are in checkpoints/wsrglow.

The command to binarize is

python binarizer.py --config config.yaml

To modify the architecture of the model

The current WSRGlow model in model.py is designed for x4 super-resolution and takes waveform, spectrogram and phase information as input.

To train

Run python train.py --config config.yaml on a GPU.

To infer

Change the code in infer.py to specify the checkpoint you want to load and the sample inputs you want to use for inference. Run python infer.py --config config.yaml on a GPU, modify the code for the correct path of checkpoints and wav files.

Owner
Kexun Zhang
Interested in linguistics. Former participant in programming contests.
Kexun Zhang
Jupyter Dock is a set of Jupyter Notebooks for performing molecular docking protocols interactively, as well as visualizing, converting file formats and analyzing the results.

Molecular Docking integrated in Jupyter Notebooks Description | Citation | Installation | Examples | Limitations | License Table of content Descriptio

Angel J. Ruiz Moreno 173 Dec 25, 2022
Distributed Deep learning with Keras & Spark

Elephas: Distributed Deep Learning with Keras & Spark Elephas is an extension of Keras, which allows you to run distributed deep learning models at sc

Max Pumperla 1.6k Jan 05, 2023
Use AI to generate a optimized stock portfolio

Use AI, Modern Portfolio Theory, and Monte Carlo simulation's to generate a optimized stock portfolio that minimizes risk while maximizing returns. Ho

Greg James 30 Dec 22, 2022
This repository contains a Ruby API for utilizing TensorFlow.

tensorflow.rb Description This repository contains a Ruby API for utilizing TensorFlow. Linux CPU Linux GPU PIP Mac OS CPU Not Configured Not Configur

somatic labs 825 Dec 26, 2022
Skyformer: Remodel Self-Attention with Gaussian Kernel and Nystr\"om Method (NeurIPS 2021)

Skyformer This repository is the official implementation of Skyformer: Remodel Self-Attention with Gaussian Kernel and Nystr"om Method (NeurIPS 2021).

Qi Zeng 46 Sep 20, 2022
PyTorch implementation for STIN

STIN This repository contains PyTorch implementation for STIN. Abstract: In single-photon LiDAR, photon-efficient imaging captures the 3D structure of

Yiweins 2 Nov 22, 2022
Python project to take sound as input and output as RGB + Brightness values suitable for DMX

sound-to-light Python project to take sound as input and output as RGB + Brightness values suitable for DMX Current goals: Get one pixel working: Vary

Bobby Cox 1 Nov 17, 2021
RNN Predict Street Commercial Vitality

RNN-for-Predicting-Street-Vitality Code and dataset for Predicting the Vitality of Stores along the Street based on Business Type Sequence via Recurre

Zidong LIU 1 Dec 15, 2021
A tool for calculating distortion parameters in coordination complexes.

OctaDist Octahedral distortion calculator: A tool for calculating distortion parameters in coordination complexes. https://octadist.github.io/ Registe

OctaDist 12 Oct 04, 2022
ConformalLayers: A non-linear sequential neural network with associative layers

ConformalLayers: A non-linear sequential neural network with associative layers ConformalLayers is a conformal embedding of sequential layers of Convo

Prograf-UFF 5 Sep 28, 2022
Car Price Predictor App used to predict the price of the car based on certain input parameters created using python's scikit-learn, fastapi, numpy and joblib packages.

Pricefy Car Price Predictor App used to predict the price of the car based on certain input parameters created using python's scikit-learn, fastapi, n

Siva Prakash 1 May 10, 2022
Code for "Unsupervised State Representation Learning in Atari"

Unsupervised State Representation Learning in Atari Ankesh Anand*, Evan Racah*, Sherjil Ozair*, Yoshua Bengio, Marc-Alexandre Côté, R Devon Hjelm This

Mila 217 Jan 03, 2023
A custom-designed Spider Robot trained to walk using Deep RL in a PyBullet Simulation

SpiderBot_DeepRL Title: Implementation of Single and Multi-Agent Deep Reinforcement Learning Algorithms for a Walking Spider Robot Authors(s): Arijit

Arijit Dasgupta 9 Jul 28, 2022
Over-the-Air Ensemble Inference with Model Privacy

Over-the-Air Ensemble Inference with Model Privacy This repository contains simulations for our private ensemble inference method. Installation Instal

Selim Firat Yilmaz 1 Jun 29, 2022
Few-shot NLP benchmark for unified, rigorous eval

FLEX FLEX is a benchmark and framework for unified, rigorous few-shot NLP evaluation. FLEX enables: First-class NLP support Support for meta-training

AI2 85 Dec 03, 2022
Soft actor-critic is a deep reinforcement learning framework for training maximum entropy policies in continuous domains.

This repository is no longer maintained. Please use our new Softlearning package instead. Soft Actor-Critic Soft actor-critic is a deep reinforcement

Tuomas Haarnoja 752 Jan 07, 2023
Investigating Attention Mechanism in 3D Point Cloud Object Detection (arXiv 2021)

Investigating Attention Mechanism in 3D Point Cloud Object Detection (arXiv 2021) This repository is for the following paper: "Investigating Attention

52 Nov 19, 2022
CC-GENERATOR - A python script for generating CC

CC-GENERATOR A python script for generating CC NOTE: This tool is for Educationa

Lêkzï 6 Oct 14, 2022
Probabilistic-Monocular-3D-Human-Pose-Estimation-with-Normalizing-Flows

Probabilistic-Monocular-3D-Human-Pose-Estimation-with-Normalizing-Flows This is the official implementation of the ICCV 2021 Paper "Probabilistic Mono

62 Nov 23, 2022
Sequence modeling benchmarks and temporal convolutional networks

Sequence Modeling Benchmarks and Temporal Convolutional Networks (TCN) This repository contains the experiments done in the work An Empirical Evaluati

CMU Locus Lab 3.5k Jan 01, 2023