AdaSpeech 2: Adaptive Text to Speech with Untranscribed Data

Overview

AdaSpeech 2: Adaptive Text to Speech with Untranscribed Data [WIP]

Unofficial Pytorch implementation of AdaSpeech 2.

Requirements :

All code written in Python 3.6.2 .

  • Install Pytorch

Before installing pytorch please check your Cuda version by running following command : nvcc --version

pip install torch torchvision

In this repo I have used Pytorch 1.6.0 for torch.bucketize feature which is not present in previous versions of PyTorch.

  • Installing other requirements :
pip install -r requirements.txt
  • To use Tensorboard install tensorboard version 1.14.0 seperatly with supported tensorflow (1.14.0)

For Preprocessing :

filelists folder contains MFA (Motreal Force aligner) processed LJSpeech dataset files so you don't need to align text with audio (for extract duration) for LJSpeech dataset. For other dataset follow instruction here. For other pre-processing run following command :

python nvidia_preprocessing.py -d path_of_wavs

For finding the min and max of F0 and Energy

python compute_statistics.py

Update the following in hparams.py by min and max of F0 and Energy

p_min = Min F0/pitch
p_max = Max F0
e_min = Min energy
e_max = Max energy

Training :

[WIP]

Citations :

@misc{chen2021adaspeech,
      title={AdaSpeech: Adaptive Text to Speech for Custom Voice}, 
      author={Mingjian Chen and Xu Tan and Bohan Li and Yanqing Liu and Tao Qin and Sheng Zhao and Tie-Yan Liu},
      year={2021},
      eprint={2103.00993},
      archivePrefix={arXiv},
      primaryClass={eess.AS}
}
@misc{yan2021adaspeech,
      title={AdaSpeech 2: Adaptive Text to Speech with Untranscribed Data}, 
      author={Yuzi Yan and Xu Tan and Bohan Li and Tao Qin and Sheng Zhao and Yuan Shen and Tie-Yan Liu},
      year={2021},
      eprint={2104.09715},
      archivePrefix={arXiv},
      primaryClass={cs.SD}
}
Owner
Rishikesh (ऋषिकेश)
Deep Learning/ AI Researcher | Open Source enthusiast | Text to Speech | Speech Synthesis | Generative Models | Object detection | Language Understanding
Rishikesh (ऋषिकेश)
Adapter-BERT: Parameter-Efficient Transfer Learning for NLP.

Adapter-BERT: Parameter-Efficient Transfer Learning for NLP.

Google Research 340 Jan 03, 2023
Generalized Data Weighting via Class-level Gradient Manipulation

Generalized Data Weighting via Class-level Gradient Manipulation This repository is the official implementation of Generalized Data Weighting via Clas

18 Nov 12, 2022
Official implementation of NeurIPS'21: Implicit SVD for Graph Representation Learning

isvd Official implementation of NeurIPS'21: Implicit SVD for Graph Representation Learning If you find this code useful, you may cite us as: @inprocee

Sami Abu-El-Haija 16 Jan 08, 2023
Code for "Human Pose Regression with Residual Log-likelihood Estimation", ICCV 2021 Oral

Human Pose Regression with Residual Log-likelihood Estimation [Paper] [arXiv] [Project Page] Human Pose Regression with Residual Log-likelihood Estima

JeffLi 347 Dec 24, 2022
Competitive Programming Club, Clinify's Official repository for CP problems hosting by club members.

Clinify-CPC_Programs This repository holds the record of the competitive programming club where the competitive coding aspirants are thriving hard and

Clinify Open Sauce 4 Aug 22, 2022
Dilated RNNs in pytorch

PyTorch Dilated Recurrent Neural Networks PyTorch implementation of Dilated Recurrent Neural Networks (DilatedRNN). Getting Started Installation: $ pi

Zalando Research 200 Nov 17, 2022
FastFCN: Rethinking Dilated Convolution in the Backbone for Semantic Segmentation.

FastFCN: Rethinking Dilated Convolution in the Backbone for Semantic Segmentation [Project] [Paper] [arXiv] [Home] Official implementation of FastFCN:

Wu Huikai 815 Dec 29, 2022
Mail classification with tensorflow and MS Exchange Server (ham or spam).

Mail classification with tensorflow and MS Exchange Server (ham or spam).

Metin Karatas 1 Sep 11, 2021
Code implementation of "Sparsity Probe: Analysis tool for Deep Learning Models"

Sparsity Probe: Analysis tool for Deep Learning Models This repository is a limited implementation of Sparsity Probe: Analysis tool for Deep Learning

3 Jun 09, 2021
The official implementation of the Interspeech 2021 paper WSRGlow: A Glow-based Waveform Generative Model for Audio Super-Resolution.

WSRGlow The official implementation of the Interspeech 2021 paper WSRGlow: A Glow-based Waveform Generative Model for Audio Super-Resolution. Audio sa

Kexun Zhang 96 Jan 03, 2023
Hybrid Neural Fusion for Full-frame Video Stabilization

FuSta: Hybrid Neural Fusion for Full-frame Video Stabilization Project Page | Video | Paper | Google Colab Setup Setup environment for [Yu and Ramamoo

Yu-Lun Liu 430 Jan 04, 2023
Code for our ALiBi method for transformer language models.

Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation This repository contains the code and models for our paper Tra

Ofir Press 211 Dec 31, 2022
Repo for CReST: A Class-Rebalancing Self-Training Framework for Imbalanced Semi-Supervised Learning

CReST in Tensorflow 2 Code for the paper: "CReST: A Class-Rebalancing Self-Training Framework for Imbalanced Semi-Supervised Learning" by Chen Wei, Ki

Google Research 75 Nov 01, 2022
Dungeons and Dragons randomized content generator

Component based Dungeons and Dragons generator Supports Entity/Monster Generation NPC Generation Weapon Generation Encounter Generation Environment Ge

Zac 3 Dec 04, 2021
Nested cross-validation is necessary to avoid biased model performance in embedded feature selection in high-dimensional data with tiny sample sizes

Pruner for nested cross-validation - Sphinx-Doc Nested cross-validation is necessary to avoid biased model performance in embedded feature selection i

1 Dec 15, 2021
This repo provides the base code for pytorch-lightning and weight and biases simultaneous integration.

Write your model faster with pytorch-lightning-wadb-code-backbone This repository provides the base code for pytorch-lightning and weight and biases s

9 Mar 29, 2022
Evaluation toolkit of the informative tracking benchmark comprising 9 scenarios, 180 diverse videos, and new challenges.

Informative-tracking-benchmark Informative tracking benchmark (ITB) higher diversity. It contains 9 representative scenarios and 180 diverse videos. m

Xin Li 15 Nov 26, 2022
Real-time analysis of intracranial neurophysiology recordings.

py_neuromodulation Click this button to run the "Tutorial ML with py_neuro" notebooks: The py_neuromodulation toolbox allows for real time capable pro

Interventional Cognitive Neuromodulation - Neumann Lab Berlin 15 Nov 03, 2022
[NeurIPS 2020] Semi-Supervision (Unlabeled Data) & Self-Supervision Improve Class-Imbalanced / Long-Tailed Learning

Rethinking the Value of Labels for Improving Class-Imbalanced Learning This repository contains the implementation code for paper: Rethinking the Valu

Yuzhe Yang 656 Dec 28, 2022
[AAAI2021] The source code for our paper 《Enhancing Unsupervised Video Representation Learning by Decoupling the Scene and the Motion》.

DSM The source code for paper Enhancing Unsupervised Video Representation Learning by Decoupling the Scene and the Motion Project Website; Datasets li

Jinpeng Wang 114 Oct 16, 2022