A PyTorch Implementation of End-to-End Models for Speech-to-Text

Last update: Dec 25, 2022

speech

Speech is an open-source package to build end-to-end models for automatic speech recognition. Sequence-to-sequence models with attention, Connectionist Temporal Classification and the RNN Sequence Transducer are currently supported.
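As a small illustration of what Connectionist Temporal Classification produces, the sketch below shows the standard CTC collapsing rule: merge repeated per-frame labels, then drop blanks. It is a generic, stdlib-only example and not this package's implementation. The function name `ctc_collapse` and the choice of index 0 for the blank symbol are assumptions made for the example.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank symbol (hypothetical choice)


def ctc_collapse(frame_labels, blank=BLANK):
    """Collapse a per-frame label sequence into an output label sequence."""
    # Step 1: merge runs of identical consecutive labels.
    merged = [label for label, _ in groupby(frame_labels)]
    # Step 2: drop the blank symbol.
    return [label for label in merged if label != blank]


if __name__ == "__main__":
    # The blank between the two 3s keeps them as separate output labels.
    print(ctc_collapse([0, 3, 3, 0, 3, 5, 5, 0]))  # prints [3, 3, 5]
```

Note that a repeated output label survives only when a blank separates the two runs, which is why CTC needs the blank symbol in the first place.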

The goal of this software is to facilitate research in end-to-end models for speech recognition. The models are implemented in PyTorch.

The software has only been tested with Python 3.6.

We will not be providing backward compatibility for Python 2.7.

Install

We recommend creating a virtual environment and installing the Python requirements there.

virtualenv <path_to_your_env>
source <path_to_your_env>/bin/activate
pip install -r requirements.txt

Then follow the installation instructions for a version of PyTorch which works for your machine.
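One way to confirm that PyTorch ended up in the active environment is a quick check like the one below. This is a minimal sketch, not part of the package; the helper name `is_installed` is hypothetical. It only uses the standard library to look for the module, and imports `torch` only if it is found.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found in this environment."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if is_installed("torch"):
        import torch

        # Report the installed version and whether a CUDA device is usable.
        print("PyTorch", torch.__version__)
        print("CUDA available:", torch.cuda.is_available())
    else:
        print("PyTorch was not found in this environment.")
```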

After all the Python requirements are installed, run the following from the top-level directory:

make

The build process requires CMake as well as Make.

After that, source setup.sh from the repo root:

source setup.sh

Consider adding this line to your ~/.bashrc so that it runs in every new shell.

You can verify that the install was successful by running the tests from the tests directory:

cd tests
pytest

Run

To train a model, run:

python train.py <path_to_config>

After the model is done training, you can evaluate it with:

python eval.py <path_to_model> <path_to_data_json>

To see the available options for each script use -h:

python train.py -h
python eval.py -h
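If you want to chain training and evaluation from a single script, the two commands above can be wrapped with `subprocess`. The sketch below is only an illustration: the helper names are hypothetical, and it assumes you run it from the repo root and already know where the trained model will be saved, since the commands above do not say.

```python
import subprocess
import sys


def train_command(config_path):
    """Build the command line for: python train.py <path_to_config>."""
    return [sys.executable, "train.py", config_path]


def eval_command(model_path, data_json):
    """Build the command line for: python eval.py <path_to_model> <path_to_data_json>."""
    return [sys.executable, "eval.py", model_path, data_json]


def train_then_eval(config_path, model_path, data_json):
    """Run training, then evaluation; stop if either step fails."""
    # check=True raises CalledProcessError on a non-zero exit code.
    subprocess.run(train_command(config_path), check=True)
    subprocess.run(eval_command(model_path, data_json), check=True)
```

Using `sys.executable` keeps both steps inside the interpreter of the active virtual environment.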

Examples

For examples of model configurations and datasets, visit the examples directory. Each example dataset should have instructions and/or scripts for downloading and preparing the data. There should also be one or more model configurations available. The results for each configuration will be documented in each example's corresponding README.md.

Owner

Awni Hannun