SC-GlowTTS: an Efficient Zero-Shot Multi-Speaker Text-To-Speech Model

Overview

SC-GlowTTS: an Efficient Zero-Shot Multi-Speaker Text-To-Speech Model

Edresson Casanova, Christopher Shulby, Eren Gölge, Nicolas Michael Müller, Frederico Santos de Oliveira, Arnaldo Candido Junior, Anderson da Silva Soares, Sandra Maria Aluisio, Moacir Antonelli Ponti

In our recent paper we propose SC-GlowTTS: an efficient zero-shot multi-speaker text-to-speech model that improves similarity for speakers unseen in training. We propose a speaker conditional architecture that explores a flow-based decoder that can work in a zero-shot scenario. As text encoders, we explored a dilated residual convolutional-based encoder, gated convolutional-based encoder, and transformer-based encoder. Additionally, we have shown that adjusting a GAN-based vocoder for the spectrograms predicted by the TTS model on the training dataset can significantly improve the similarity and speech quality for new speakers. We showed that our model can converge in training, using only 11 speakers, reaching state-of-the-art results for similarity with new speakers and speech quality.

Audios samples

Visit our website for audio samples.

Implementation

All of our experiments were implemented at Coqui TTS.

Checkpoints

Model URL
Speaker Encoder by @mueller91 link
Tacotron 2 link
SC-GlowTTS-Trans link
SC-GlowTTS-Res link
SC-GlowTTS-Gated link
SC-GlowTTS-Trans 11 speakers link
HiFi-GAN link
All checkpoints link

Colab demos

SC-GlowTTS-Trans

SC-GlowTTS-Res

SC-GlowTTS-Gated

SC-GlowTTS-Trans trained with 11 speakers

Preprocessed datasets

VCTK Removed Silences

MOS details

MOS Sentences

MOS samples

Owner
Edresson Casanova
Computer Science PhD Student
Edresson Casanova
Efficiently Disentangle Causal Representations

Efficiently Disentangle Causal Representations Install dependency pip install -r requirements.txt Main experiments Causality direction prediction cd

4 Apr 01, 2022
HyperaPy: An automatic hyperparameter optimization framework ⚡🚀

hyperpy HyperPy: An automatic hyperparameter optimization framework Description HyperPy: Library for automatic hyperparameter optimization. Build on t

Sergio Mora 7 Sep 06, 2022
Multivariate Time Series Forecasting with efficient Transformers. Code for the paper "Long-Range Transformers for Dynamic Spatiotemporal Forecasting."

Spacetimeformer Multivariate Forecasting This repository contains the code for the paper, "Long-Range Transformers for Dynamic Spatiotemporal Forecast

QData 440 Jan 02, 2023
SelfRemaster: SSL Speech Restoration

SelfRemaster: Self-Supervised Speech Restoration Official implementation of SelfRemaster: Self-Supervised Speech Restoration with Analysis-by-Synthesi

Takaaki Saeki 46 Jan 07, 2023
Code for the SIGGRAPH 2022 paper "DeltaConv: Anisotropic Operators for Geometric Deep Learning on Point Clouds."

DeltaConv [Paper] [Project page] Code for the SIGGRAPH 2022 paper "DeltaConv: Anisotropic Operators for Geometric Deep Learning on Point Clouds" by Ru

98 Nov 26, 2022
Real-Time High-Resolution Background Matting

Real-Time High-Resolution Background Matting Official repository for the paper Real-Time High-Resolution Background Matting. Our model requires captur

Peter Lin 6.1k Jan 03, 2023
基于Paddle框架的fcanet复现

fcanet-Paddle 基于Paddle框架的fcanet复现 fcanet 本项目基于paddlepaddle框架复现fcanet,并参加百度第三届论文复现赛,将在2021年5月15日比赛完后提供AIStudio链接~敬请期待 参考项目: frazerlin-fcanet 数据准备 本项目已挂

QuanHao Guo 7 Mar 07, 2022
Code release of paper "Deep Multi-View Stereo gone wild"

Deep MVS gone wild Pytorch implementation of "Deep MVS gone wild" (Paper | website) This repository provides the code to reproduce the experiments of

François Darmon 53 Dec 24, 2022
Robot Hacking Manual (RHM). From robotics to cybersecurity. Papers, notes and writeups from a journey into robot cybersecurity.

RHM: Robot Hacking Manual Download in PDF RHM v0.4 ┃ Read online The Robot Hacking Manual (RHM) is an introductory series about cybersecurity for robo

Víctor Mayoral Vilches 233 Dec 30, 2022
Multilingual Image Captioning

Multilingual Image Captioning Authors: Bhavitvya Malik, Gunjan Chhablani Demo Link: https://huggingface.co/spaces/flax-community/multilingual-image-ca

Gunjan Chhablani 32 Nov 25, 2022
Pytorch for Segmentation

Pytorch for Semantic Segmentation This repo has been deprecated currently and I will not maintain it. Meanwhile, I strongly recommend you can refer to

ycszen 411 Nov 22, 2022
Zeyuan Chen, Yangchao Wang, Yang Yang and Dong Liu.

Principled S2R Dehazing This repository contains the official implementation for PSD Framework introduced in the following paper: PSD: Principled Synt

zychen 78 Dec 30, 2022
The official PyTorch implementation of Curriculum by Smoothing (NeurIPS 2020, Spotlight).

Curriculum by Smoothing (NeurIPS 2020) The official PyTorch implementation of Curriculum by Smoothing (NeurIPS 2020, Spotlight). For any questions reg

PAIR Lab 36 Nov 23, 2022
Emotion Recognition from Facial Images

Reconhecimento de Emoções a partir de imagens faciais Este projeto implementa um classificador simples que utiliza técncias de deep learning e transfe

Gabriel 2 Feb 09, 2022
Air Quality Prediction Using LSTM

AirQualityPredictionUsingLSTM In this Repo, i present to you the winning solution of smart gujarat hackathon 2019 where the task was to predict the qu

Deepak Nandwani 2 Dec 13, 2022
A super lightweight Lagrangian model for calculating millions of trajectories using ERA5 data

Easy-ERA5-Trck Easy-ERA5-Trck Galleries Install Usage Repository Structure Module Files Version iteration Easy-ERA5-Trck is a super lightweight Lagran

Zhenning Li 26 Nov 19, 2022
An implementation of "Optimal Textures: Fast and Robust Texture Synthesis and Style Transfer through Optimal Transport"

Optex An implementation of Optimal Textures: Fast and Robust Texture Synthesis and Style Transfer through Optimal Transport for TU Delft CS4240. You c

Hans Brouwer 33 Jan 05, 2023
Code & Data for Enhancing Photorealism Enhancement

Enhancing Photorealism Enhancement Stephan R. Richter, Hassan Abu AlHaija, Vladlen Koltun Paper | Website (with side-by-side comparisons) | Video (Pap

Intelligent Systems Lab Org 1.1k Dec 31, 2022
Implementation of CVPR 2020 Dual Super-Resolution Learning for Semantic Segmentation

Dual super-resolution learning for semantic segmentation 2021-01-02 Subpixel Update Happy new year! The 2020-12-29 update of SISR with subpixel conv p

Sam 79 Nov 24, 2022
An unofficial styleguide and best practices summary for PyTorch

A PyTorch Tools, best practices & Styleguide This is not an official style guide for PyTorch. This document summarizes best practices from more than a

IgorSusmelj 1.5k Jan 05, 2023