LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search

Last update: Dec 03, 2022

Overview

LightSpeech

Unofficial PyTorch implementation of LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search. This repo uses the ESPnet implementation of FastSpeech 2 as its base. It implements only the final LightSpeech model, not the Neural Architecture Search described in the paper.

However, I was able to compress the model only about 3x (from 27 M to 7.99 M trainable parameters), not the 15x reported in the paper.
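As a rough check of the figures above, the following sketch shows how a trainable-parameter count is usually obtained for a PyTorch model and how the compression ratio follows from the two quoted numbers. The helper names `count_trainable_parameters` and `compression_ratio` are hypothetical and not part of this repo.

```python
def count_trainable_parameters(model):
    """Sum the element counts of all parameters that require gradients.

    `model` is expected to behave like a torch.nn.Module, i.e. expose
    .parameters() yielding tensors with .numel() and .requires_grad.
    """
    return sum(p.numel() for p in model.parameters() if p.requires_grad)


def compression_ratio(baseline_params, compressed_params):
    """How many times smaller the compressed model is than the baseline."""
    return baseline_params / compressed_params


# Figures quoted above: 27 M baseline vs 7.99 M for this implementation.
ratio = compression_ratio(27e6, 7.99e6)
print(f"compression: {ratio:.2f}x")
```

Passing a real `torch.nn.Module` to `count_trainable_parameters` should give the count that the 7.99 M figure refers to, assuming it was measured the same way.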

Requirements:

All code is written in Python 3.6.2.

Install PyTorch

Before installing PyTorch, check your CUDA version by running the following command:

nvcc --version

pip install torch torchvision

This repo uses PyTorch 1.6.0 for the torch.bucketize function, which is not available in earlier versions of PyTorch.
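To illustrate what torch.bucketize does, here is a pure-Python sketch that mirrors its default behavior (right=False) using the standard library. FastSpeech 2 style models commonly use this kind of binning to quantize pitch and energy before an embedding lookup; whether this repo uses it in exactly that way is an assumption. The helper `linear_boundaries`, the pitch range, and the bin count are made up for illustration.

```python
from bisect import bisect_left


def bucketize(values, boundaries):
    """Pure-Python analogue of torch.bucketize(values, boundaries), right=False.

    For each value v, returns the index i such that
    boundaries[i-1] < v <= boundaries[i]. `boundaries` must be sorted ascending.
    """
    return [bisect_left(boundaries, v) for v in values]


def linear_boundaries(lo, hi, n_bins):
    """n_bins - 1 evenly spaced interior boundaries between lo and hi."""
    step = (hi - lo) / n_bins
    return [lo + step * i for i in range(1, n_bins)]


# Hypothetical pitch range (Hz) and bin count, for illustration only.
boundaries = linear_boundaries(80.0, 400.0, 4)
bins = bucketize([75.0, 160.0, 200.0, 399.0], boundaries)
print(boundaries)  # [160.0, 240.0, 320.0]
print(bins)        # [0, 0, 1, 3]
```

Values below the first boundary land in bin 0 and values above the last boundary land in the final bin, so n_bins - 1 boundaries yield n_bins bin indices.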

Installing other requirements:

pip install -r requirements.txt

To use TensorBoard, install tensorboard version 1.14.0 separately, along with a supported TensorFlow version (1.14.0).

For Preprocessing:

The filelists folder contains LJSpeech dataset files already processed with MFA (Montreal Forced Aligner), so you don't need to align text with audio (to extract durations) for the LJSpeech dataset. For other datasets, follow the instructions here. For the remaining preprocessing, run the following command:

python .\nvidia_preprocessing.py -d path_of_wavs -c configs/default.yaml

To find the min and max of F0 and energy, run:

python .\compute_statistics.py

Update the following values in hparams.py with the computed min and max of F0 and energy:

p_min = Min F0/pitch
p_max = Max F0
e_min = Min energy
e_max = Max energy
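For orientation, the four values above are global extrema taken over every frame of every utterance. The sketch below shows that idea on toy data; it is not the logic of compute_statistics.py, the function name `global_min_max` is hypothetical, and the numbers are made up. Whether unvoiced frames (F0 = 0) should be excluded is left open here; this sketch includes them.

```python
def global_min_max(sequences):
    """Return (min, max) over every frame of every utterance."""
    flat = [v for seq in sequences for v in seq]
    if not flat:
        raise ValueError("no frames to compute statistics from")
    return min(flat), max(flat)


# Toy per-utterance contours (made-up numbers, not LJSpeech statistics).
f0_per_utt = [[110.0, 125.5, 0.0], [98.2, 240.1]]
energy_per_utt = [[0.02, 0.91], [0.15, 0.47, 0.33]]

p_min, p_max = global_min_max(f0_per_utt)
e_min, e_max = global_min_max(energy_per_utt)

# Print in the same layout as the hparams.py entries.
print(f"p_min = {p_min}")
print(f"p_max = {p_max}")
print(f"e_min = {e_min}")
print(f"e_max = {e_max}")
```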

For training

python train_lightspeech.py --outdir etc -c configs/default.yaml -n "name"

For inference

WIP

python .\inference.py -c .\configs\default.yaml -p .\checkpoints\first_1\xyz.pyt --out output --text "ModuleList can be indexed like a regular Python list but modules it contains are properly registered."

For TorchScript Export

python export_torchscript.py -c configs/default.yaml -n fastspeech_scrip --outdir etc

Checkpoint and samples:

WIP

Owner

Rishikesh (ऋषिकेश)
