Official implementation of FCL-taco2: Fast, Controllable and Lightweight version of Tacotron2 @ ICASSP 2021

Last update: Sep 28, 2022

Overview

FCL-Taco2: Towards Fast, Controllable and Lightweight Text-to-Speech synthesis (ICASSP 2021) Paper | Demo

Block diagram of FCL-taco2, where the decoder generates mel-spectrograms in AR mode within each phoneme and is shared for all phonemes.

💬 Huawei Noah's Ark Lab is recruiting interns on speech processing fields, if you're interested, you're welcome to contact Dr. Deng: [email protected]

Training and inference scripts for FCL-taco2

Environment

python 3.6.10
torch 1.3.1
chainer 6.0.0
espnet 8.0.0
apex 0.1
numpy 1.19.1
kaldiio 2.15.1
librosa 0.8.0

Training and inference:

Step1. Data preparation & preprocessing

Download LJSpeech
Unpack downloaded LJSpeech-1.1.tar.bz2 to /xx/LJSpeech-1.1
Obtain the forced alignment information by using Montreal forced aligner tool. Or you can download our alignment results, then unpack it to /xx/TextGrid
Preprocess the dataset to extract mel-spectrograms, phoneme duration, pitch, energy and phoneme sequence by:
```
 python preprocessing.py --data-root /xx/LJSpeech-1.1 --textgrid-root /xx/TextGrid
```

Step2. Model training

Training teacher model FCL-taco2-T:
```
 ./teacher_model_training.sh
```
Training student model FCL-taco2-S:
```
 ./student_model_training.sh
```
Parallel-WaveGAN vocoder training: follow instructions at here. You can also download the pre-trained PWG vocoder, and put the PWG model under the directory "vocoder".

Step3. Model evaluation

FCL-taco2-T evaluation:
```
 ./inference_teacher.sh
```
FCL-taco2-S evaluation:
```
 ./inference_student.sh
```

Citation

If the code is used in your research, please star our repo and cite our paper:

@inproceedings{wang2021fcl,
  title={Fcl-Taco2: Towards Fast, Controllable and Lightweight Text-to-Speech Synthesis},
  author={Wang, Disong and Deng, Liqun and Zhang, Yang and Zheng, Nianzu and Yeung, Yu Ting and Chen, Xiao and Liu, Xunying and Meng, Helen},
  booktitle={ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  pages={5714--5718},
  year={2021},
  organization={IEEE}
}

Official implementation of FCL-taco2: Fast, Controllable and Lightweight version of Tacotron2 @ ICASSP 2021

Related tags

Overview

FCL-Taco2: Towards Fast, Controllable and Lightweight Text-to-Speech synthesis (ICASSP 2021) Paper | Demo

💬 Huawei Noah's Ark Lab is recruiting interns on speech processing fields, if you're interested, you're welcome to contact Dr. Deng: [email protected]

Training and inference scripts for FCL-taco2

Environment

Training and inference:

Citation

Owner

Disong Wang

A toolkit for document-level event extraction, containing some SOTA model implementations

Python library containing BART query generation and BERT-based Siamese models for neural retrieval.

Notes taking website build with Docker + Django + React.

Unified Interface for Constructing and Managing Workflows on different workflow engines, such as Argo Workflows, Tekton Pipelines, and Apache Airflow.

Unofficial implementation of MLP-Mixer: An all-MLP Architecture for Vision

WiFi-based Multi-task Sensing

[ICML 2020] Prediction-Guided Multi-Objective Reinforcement Learning for Continuous Robot Control

Pytorch implementation of ProjectedGAN

[ICCV 2021] Official PyTorch implementation for Deep Relational Metric Learning.

🔥 Cogitare - A Modern, Fast, and Modular Deep Learning and Machine Learning framework for Python

CC-GENERATOR - A python script for generating CC

PyTorch implementation of paper “Unbiased Scene Graph Generation from Biased Training”

code for paper -- "Seamless Satellite-image Synthesis"

Here I will explain the flow to deploy your custom deep learning models on Ultra96V2.

Epidemiology analysis package

Pytorch implementation of VAEs for heterogeneous likelihoods.

This project is a re-implementation of MASTER: Multi-Aspect Non-local Network for Scene Text Recognition by MMOCR

Code release for Universal Domain Adaptation(CVPR 2019)

I decide to sync up this repo and self-critical.pytorch. (The old master is in old master branch for archive)

EZ graph is an easy to use AI solution that allows you to make and train your neural networks without a single line of code.