iSTFTNet : Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier Transform

Last update: Jan 02, 2023

Overview

iSTFTNet : Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier Transform

This repo try to implement iSTFTNet : Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier Transform specifically model C8C8I. Disclaimer : This repo is build for testing purpose. The code is not optimized for performance.

Training :

python train.py --config config_v1.json

Note:

We are able to get good quality of audio with 30 % less training compared to original hifigan.
This model approx 60 % faster than counterpart hifigan.

Citations :

@inproceedings{kaneko2022istftnet,
title={{iSTFTNet}: Fast and Lightweight Mel-Spectrogram Vocoder Incorporating Inverse Short-Time Fourier Transform},
author={Takuhiro Kaneko and Kou Tanaka and Hirokazu Kameoka and Shogo Seki},
booktitle={ICASSP},
year={2022},
}

References:

https://github.com/jik876/hifi-gan

Owner

Rishikesh (ऋषिकेश)

GitHub Repository

multi-label，classifier，text classification，多标签文本分类，文本分类，BERT，ALBERT，multi-label-classification，seq2seq，attention，beam search

30 Dec 12, 2022

Scene Text Retrieval via Joint Text Detection and Similarity Learning

This is the code of "Scene Text Retrieval via Joint Text Detection and Similarity Learning". For more details, please refer to our CVPR2021 paper.

79 Nov 29, 2022

Dope Wars game engine on StarkNet L2 roll-up

RYO Dope Wars game engine on StarkNet L2 roll-up. What TI-83 drug wars built as smart contract system. Background mechanism design notion here. Initia

104 Dec 04, 2022

Automated question generation and question answering from Turkish texts using text-to-text transformers

Turkish Question Generation Offical source code for "Automated question generation & question answering from Turkish texts using text-to-text transfor

29 Dec 14, 2022

Ptorch NLU, a Chinese text classification and sequence annotation toolkit, supports multi class and multi label classification tasks of Chinese long text and short text, and supports sequence annotation tasks such as Chinese named entity recognition, part of speech tagging and word segmentation.

Pytorch-NLU，一个中文文本分类、序列标注工具包，支持中文长文本、短文本的多类、多标签分类任务，支持中文命名实体识别、词性标注、分词等序列标注任务。 Ptorch NLU, a Chinese text classification and sequence annotation toolkit, supports multi class and multi label classifi

186 Dec 24, 2022

PocketSphinx is a lightweight speech recognition engine, specifically tuned for handheld and mobile devices, though it works equally well on the desktop

molten A minimal, extensible, fast and productive API framework for Python 3. Changelog: https://moltenframework.com/changelog.html Community: https:/

3.2k Dec 28, 2022

Code for Findings at EMNLP 2021 paper: "Learn Continually, Generalize Rapidly: Lifelong Knowledge Accumulation for Few-shot Learning"

Learn Continually, Generalize Rapidly: Lifelong Knowledge Accumulation for Few-shot Learning This repo is for Findings at EMNLP 2021 paper: Learn Cont

6 Sep 02, 2022

BERT Attention Analysis

BERT Attention Analysis This repository contains code for What Does BERT Look At? An Analysis of BERT's Attention. It includes code for getting attent

401 Dec 11, 2022

100+ Chinese Word Vectors 上百种预训练中文词向量

Chinese Word Vectors 中文词向量中文 This project provides 100+ Chinese Word Vectors (embeddings) trained with different representations (dense and sparse),

10.4k Jan 09, 2023

To create a deep learning model which can explain the content of an image in the form of speech through caption generation with attention mechanism on Flickr8K dataset.

0 Feb 08, 2022

Language-Agnostic SEntence Representations

LASER Language-Agnostic SEntence Representations LASER is a library to calculate and use multilingual sentence embeddings. NEWS 2019/11/08 CCMatrix is

3.2k Jan 04, 2023

Associated Repository for "Translation between Molecules and Natural Language"

MolT5: Translation between Molecules and Natural Language Associated repository for "Translation between Molecules and Natural Language". Table of Con

67 Dec 15, 2022

A2T: Towards Improving Adversarial Training of NLP Models (EMNLP 2021 Findings)

A2T: Towards Improving Adversarial Training of NLP Models This is the source code for the EMNLP 2021 (Findings) paper "Towards Improving Adversarial T

17 Oct 15, 2022

Code for hyperboloid embeddings for knowledge graph entities

Implementation for the papers: Self-Supervised Hyperboloid Representations from Logical Queries over Knowledge Graphs, Nurendra Choudhary, Nikhil Rao,

30 Dec 10, 2022

Almost State-of-the-art Text Generation library

Ps: we are adding transformer model soon Text Gen 🐐 Almost State-of-the-art Text Generation library Text gen is a python library that allow you build

63 Jun 24, 2022

Binary LSTM model for text classification

Text Classification The purpose of this repository is to create a neural network model of NLP with deep learning for binary classification of texts re

1 Mar 11, 2022

Basic yet complete Machine Learning pipeline for NLP tasks

Basic yet complete Machine Learning pipeline for NLP tasks This repository accompanies the article on building basic yet complete ML pipelines for sol

20 Aug 22, 2022

Machine Learning Course Project, IMDB movie review sentiment analysis by lstm, cnn, and transformer

IMDB Sentiment Analysis This is the final project of Machine Learning Courses in Huazhong University of Science and Technology, School of Artificial I

0 Dec 27, 2021

Simple text to phones converter for multiple languages

Phonemizer -- foʊnmaɪzɚ The phonemizer allows simple phonemization of words and texts in many languages. Provides both the phonemize command-line tool

762 Dec 29, 2022

無料で使える中品質なテキスト読み上げソフトウェア、VOICEVOXの音声合成エンジン

VOICEVOX ENGINE VOICEVOXの音声合成エンジン。実態は HTTP サーバーなので、リクエストを送信すればテキスト音声合成できます。 API ドキュメント VOICEVOX ソフトウェアを起動した状態で、ブラウザから

3 Jul 05, 2022

iSTFTNet : Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier Transform

Related tags

Overview

iSTFTNet : Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier Transform

Training :

Note:

Citations :

References:

Owner

Rishikesh (ऋषिकेश)

multi-label，classifier，text classification，多标签文本分类，文本分类，BERT，ALBERT，multi-label-classification，seq2seq，attention，beam search

Scene Text Retrieval via Joint Text Detection and Similarity Learning

Dope Wars game engine on StarkNet L2 roll-up

Automated question generation and question answering from Turkish texts using text-to-text transformers

Ptorch NLU, a Chinese text classification and sequence annotation toolkit, supports multi class and multi label classification tasks of Chinese long text and short text, and supports sequence annotation tasks such as Chinese named entity recognition, part of speech tagging and word segmentation.

PocketSphinx is a lightweight speech recognition engine, specifically tuned for handheld and mobile devices, though it works equally well on the desktop

Code for Findings at EMNLP 2021 paper: "Learn Continually, Generalize Rapidly: Lifelong Knowledge Accumulation for Few-shot Learning"

BERT Attention Analysis

100+ Chinese Word Vectors 上百种预训练中文词向量

To create a deep learning model which can explain the content of an image in the form of speech through caption generation with attention mechanism on Flickr8K dataset.

Language-Agnostic SEntence Representations

Associated Repository for "Translation between Molecules and Natural Language"

A2T: Towards Improving Adversarial Training of NLP Models (EMNLP 2021 Findings)

Code for hyperboloid embeddings for knowledge graph entities

Almost State-of-the-art Text Generation library

Binary LSTM model for text classification

Basic yet complete Machine Learning pipeline for NLP tasks

Machine Learning Course Project, IMDB movie review sentiment analysis by lstm, cnn, and transformer

Simple text to phones converter for multiple languages

無料で使える中品質なテキスト読み上げソフトウェア、VOICEVOXの音声合成エンジン