The official implementation of the Interspeech 2021 paper WSRGlow: A Glow-based Waveform Generative Model for Audio Super-Resolution.

Last update: Jan 03, 2023

WSRGlow

Audio samples can be found here.

Feel free to create an issue or email the repository owner if you have problems running the code.

Before running the code, install the dependencies with pip install -r requirements.txt.

The configs for the model architecture and training scheme are saved in config.yaml. You can override individual attributes by adding the --hparams flag when running a command. The general way to run a Python script is

python $SRC$ --config $CONFIG$ --hparams $KEY1$=$VALUE1$,$KEY2$=$VALUE2$,...

See hparams.py for more details.
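The override format above can be illustrated with a short sketch. This is not the code in hparams.py; it is a minimal, assumed implementation of how a KEY=VALUE,KEY=VALUE string could be merged into a config dictionary. The function names parse_overrides and apply_overrides, and the batch_size key, are hypothetical.

```python
import ast


def parse_overrides(spec):
    """Parse a 'key1=value1,key2=value2' string into a dict.

    Hypothetical helper: the real parsing lives in hparams.py and may differ.
    """
    overrides = {}
    if not spec:
        return overrides
    for item in spec.split(","):
        key, sep, raw = item.partition("=")
        key, raw = key.strip(), raw.strip()
        if not sep or not key:
            raise ValueError(f"expected KEY=VALUE, got {item!r}")
        try:
            # Interpret numbers, booleans, etc. as Python literals.
            value = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            # Anything that is not a literal (e.g. a path) stays a string.
            value = raw
        overrides[key] = value
    return overrides


def apply_overrides(config, spec):
    """Return a copy of config with the overrides from spec applied."""
    merged = dict(config)
    merged.update(parse_overrides(spec))
    return merged


# 'batch_size' is an illustrative key, not necessarily one in config.yaml.
base = {"raw_data_path": "data/raw", "batch_size": 8}
print(apply_overrides(base, "batch_size=16,raw_data_path=data/other"))
```

Under this sketch, values that look like Python literals are converted to their types and everything else is kept as a string, so paths can be passed without quoting.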

To prepare data

Before training, you need to binarize the data. Put the raw wav files in hparams['raw_data_path']; the binarized data is written to hparams['binary_data_path'].

Specifically, for the VCTK corpus, the file structure should look like

.
|--data
    |--raw
        |--VCTK-Corpus
            |--wav48
                |--$WAVS
|--checkpoints
    |--wsrglow

where the model checkpoints are in checkpoints/wsrglow.
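Before binarizing, it can help to confirm that the wav files sit where the layout above expects them. The following is a small sketch, not part of the repository: the function find_wavs is hypothetical, and it assumes that hparams['raw_data_path'] points at data/raw and that the wav files have a .wav extension somewhere under wav48.

```python
from pathlib import Path


def find_wavs(raw_data_path):
    """List the wav files under <raw_data_path>/VCTK-Corpus/wav48.

    Hypothetical helper for checking the directory layout; binarizer.py
    may locate its inputs differently.
    """
    wav_root = Path(raw_data_path) / "VCTK-Corpus" / "wav48"
    if not wav_root.is_dir():
        raise FileNotFoundError(f"expected directory {wav_root}")
    # Search recursively so files in sub-folders are found as well.
    return sorted(wav_root.rglob("*.wav"))


if __name__ == "__main__":
    try:
        # "data/raw" is assumed from the tree above.
        print(f"found {len(find_wavs('data/raw'))} wav files")
    except FileNotFoundError as err:
        print(err)
```

If the directory is missing, the sketch reports the path it expected, which makes a misplaced corpus easy to spot before running the binarizer.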

The command to binarize is

python binarizer.py --config config.yaml

To modify the architecture of the model

The current WSRGlow model in model.py is designed for x4 super-resolution and takes waveform, spectrogram and phase information as input.

To train

Run python train.py --config config.yaml on a GPU.

To infer

Edit infer.py to set the path of the checkpoint you want to load and the paths of the wav files you want to use as input, then run python infer.py --config config.yaml on a GPU.

Owner

Kexun Zhang
