EfficientTTS

Unofficial PyTorch implementation of "EfficientTTS: An Efficient and High-Quality Text-to-Speech Architecture" (arXiv).

Disclaimer: Some people mistakenly think I am one of the authors. I am not: my name is not on the paper's author list, and I am just a TTS enthusiast. The paper leaves out some information that is important for the implementation, so some model parameters in the current version are based on my own understanding and experiments and may not be consistent with those used by the authors.

Updates

2020/12/23: Mandarin Chinese samples uploaded. The experimental setting is exactly the same as for the LJSpeech example. A complete description of the usage will be uploaded soon.

2020/12/20: Using a HifiGAN fine-tuned on Tacotron2 GTA mel spectrograms can improve the quality of the generated samples; please see the newly generated samples.

Current status

Implementation of EFTS-CNN + HifiGAN

Setup with virtualenv

$ cd tools
$ make
# If you want to use distributed training, run the following
# command to install apex.
$ make apex

Note: To specify the Python, CUDA, or PyTorch version, run, for example:

$ make PYTHON=3.7 CUDA_VERSION=10.1 PYTORCH_VERSION=1.6

Training

Go to the egs/lj folder and see run.sh for example usage.

Acknowledgement

The code framework is from https://github.com/kan-bayashi/ParallelWaveGAN

Owner

Liu Songxiang
