
Overview

The SpeechBrain Toolkit


SpeechBrain is an open-source and all-in-one speech toolkit based on PyTorch.

The goal is to create a single, flexible, and user-friendly toolkit that can be used to easily develop state-of-the-art speech technologies, including systems for speech recognition, speaker recognition, speech enhancement, multi-microphone signal processing and many others.

SpeechBrain is currently in beta.

News: the call for new sponsors (2022) is open. Take a look here if you are interested!

| Discourse | Tutorials | Website | Documentation | Contributing | HuggingFace |

Key features

SpeechBrain provides various useful tools to speed up and facilitate research on speech technologies:

  • Various pretrained models nicely integrated with HuggingFace in our official organization account. These models come with an interface to easily run inference, facilitating integration. If a HuggingFace model isn't available, we usually provide at least a Google Drive folder containing all the corresponding experimental results.
  • The Brain class, a fully-customizable tool for managing training and evaluation loops over data. The annoying details of training loops are handled for you while retaining complete flexibility to override any part of the process when needed.
  • A YAML-based hyperparameter specification language that describes all types of hyperparameters, from individual numbers (e.g., learning rate) to complete objects (e.g., custom models). This dramatically simplifies recipe code by separating the configuration from the basic algorithmic components (a minimal sketch combining it with the Brain class follows this list).
  • Multi-GPU training and inference with PyTorch Data-Parallel or Distributed Data-Parallel.
  • Mixed-precision for faster training.
  • A transparent and entirely customizable data input and output pipeline. SpeechBrain follows the PyTorch data loader and dataset style and enables users to customize the i/o pipelines (e.g., adding on-the-fly downsampling, BPE tokenization, sorting, thresholding, ...).
  • A nice integration of sharded data with WebDataset, optimized for very large datasets stored on network file systems (NFS).
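
To make the last two points concrete, here is a minimal sketch that combines the Brain class with a HyperPyYAML hyperparameter specification. It is an illustrative toy example (the linear model, the loss, and the random data are assumptions, not an official recipe), but it runs end-to-end:

import torch
import speechbrain as sb
from hyperpyyaml import load_hyperpyyaml

# Hyperparameters in HyperPyYAML: complete objects (the model, the optimizer
# constructor) are declared in YAML and instantiated when the file is loaded.
hparams_yaml = """
model: !new:torch.nn.Linear
    in_features: 40
    out_features: 10
opt_class: !name:torch.optim.Adam
    lr: 0.001
"""
hparams = load_hyperpyyaml(hparams_yaml)

class SimpleBrain(sb.Brain):
    def compute_forward(self, batch, stage):
        # batch = [inputs, targets]; run the model on the inputs.
        return self.modules.model(batch[0])

    def compute_objectives(self, predictions, batch, stage):
        # Toy regression loss; real recipes use CTC, NLL, transducer losses, etc.
        return torch.nn.functional.mse_loss(predictions, batch[1])

brain = SimpleBrain(
    modules={"model": hparams["model"]},
    opt_class=hparams["opt_class"],
)

# A "dataset" here is just an iterable of batches: one random batch of
# 8 examples with 40 input features and 10 targets.
train_data = [[torch.rand(8, 40), torch.rand(8, 10)]]
brain.fit(epoch_counter=range(5), train_set=train_data)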

Speech recognition

SpeechBrain supports state-of-the-art methods for end-to-end speech recognition:

  • Support for wav2vec 2.0 pretrained models with finetuning.
  • State-of-the-art or comparable performance with respect to other existing toolkits on several ASR benchmarks.
  • Easily customizable neural language models, including RNNLM and TransformerLM. We also provide a few pre-trained models to save you computation (more to come!). We support Hugging Face datasets to facilitate training over large text corpora.
  • Hybrid CTC/Attention end-to-end ASR:
    • Many available encoders: CRDNN (VGG + {LSTM,GRU,LiGRU} + DNN), ResNet, SincNet, vanilla transformers, ContextNet-based transformers, or conformers. Thanks to the flexibility of SpeechBrain, any fully customized encoder can be connected to the CTC/attention decoder and trained in a few hours of work. The decoder is fully customizable as well: LSTM, GRU, LiGRU, transformer, or your own neural network!
    • Optimised and fast beam search on both CPUs and GPUs.
  • Transducer end-to-end ASR with a custom Numba loss to accelerate the training. Any encoder or decoder can be plugged into the transducer ranging from VGG+RNN+DNN to conformers.
  • Pre-trained ASR models for transcribing an audio file or extracting features for a downstream task (see the example below).
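
For example, transcribing a file with one of the pretrained LibriSpeech models hosted on HuggingFace takes a few lines (the model name below is the official speechbrain/asr-crdnn-rnnlm-librispeech repository; the audio path is a placeholder):

from speechbrain.pretrained import EncoderDecoderASR

# Download (or reuse from savedir) the pretrained CRDNN + RNNLM LibriSpeech model.
asr_model = EncoderDecoderASR.from_hparams(
    source="speechbrain/asr-crdnn-rnnlm-librispeech",
    savedir="pretrained_models/asr-crdnn-rnnlm-librispeech",
)

# Transcribe a local audio file and print the hypothesis.
print(asr_model.transcribe_file("path/to/your_audio.wav"))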

Feature extraction and augmentation

SpeechBrain provides efficient and GPU-friendly speech augmentation pipelines and acoustic feature extraction:

  • On-the-fly and fully-differentiable acoustic feature extraction: filter banks can be learned. This simplifies the training pipeline (you don't have to dump features on disk); a short sketch follows this list.
  • On-the-fly feature normalization (global, sentence, batch, or speaker level).
  • On-the-fly environmental corruptions based on noise, reverberation, and babble for robust model training.
  • On-the-fly frequency and time domain SpecAugment.
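
As a quick sketch of the feature pipeline, the snippet below computes mel filter-bank features on the fly from a batch of raw waveforms (the batch of random signals is a stand-in for real audio; see the Fbank arguments for learnable filter banks, deltas, context, etc.):

import torch
from speechbrain.lobes.features import Fbank

# A batch of 4 one-second "waveforms" at 16 kHz (random data as a placeholder).
signals = torch.rand(4, 16000)

# Differentiable mel filter-bank extraction, computed on the fly on CPU or GPU.
fbank = Fbank(n_mels=40)
features = fbank(signals)
print(features.shape)  # [batch, time_frames, n_mels]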

Speaker recognition, identification and diarization

SpeechBrain provides different models for speaker recognition, identification, and diarization on different datasets:

  • State-of-the-art performance on speaker recognition and diarization based on ECAPA-TDNN models.
  • Original Xvectors implementation (inspired by Kaldi) with PLDA.
  • Spectral clustering for speaker diarization (combined with speaker embeddings).
  • Libraries to extract speaker embeddings from your own data with a pre-trained model (see the example below).
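
For instance, speaker verification with the pretrained ECAPA-TDNN model hosted on HuggingFace looks like this (the two audio paths are placeholders):

from speechbrain.pretrained import SpeakerRecognition

verification = SpeakerRecognition.from_hparams(
    source="speechbrain/spkrec-ecapa-voxceleb",
    savedir="pretrained_models/spkrec-ecapa-voxceleb",
)

# Compare two utterances: returns a similarity score and a same-speaker decision.
score, prediction = verification.verify_files("speaker1_utt.wav", "speaker2_utt.wav")
print(score, prediction)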

Speech Translation

  • Recipes for transformer and conformer-based end-to-end speech translation.
  • Possibility to choose between standard training (attention), multi-objective training (CTC+attention), and multi-task training (ST + ASR).

Speech enhancement and separation

  • Recipes for spectral masking, spectral mapping, and time-domain speech enhancement.
  • Multiple sophisticated enhancement losses, including differentiable STOI loss, MetricGAN, and mimic loss.
  • State-of-the-art performance on speech separation with Conv-TasNet, DualPath RNN, and SepFormer (see the example below).
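
For example, separating a two-speaker mixture with the pretrained SepFormer (WSJ0-2mix, 8 kHz) model from HuggingFace can be sketched as follows (the mixture path is a placeholder):

import torchaudio
from speechbrain.pretrained import SepformerSeparation

model = SepformerSeparation.from_hparams(
    source="speechbrain/sepformer-wsj02mix",
    savedir="pretrained_models/sepformer-wsj02mix",
)

# Returns the estimated sources with shape [batch, time, n_sources].
est_sources = model.separate_file(path="two_speaker_mixture.wav")
torchaudio.save("source1.wav", est_sources[:, :, 0].detach().cpu(), 8000)
torchaudio.save("source2.wav", est_sources[:, :, 1].detach().cpu(), 8000)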

Multi-microphone processing

Combining multiple microphones is a powerful approach to achieve robustness in adverse acoustic environments:

  • Delay-and-sum, MVDR, and GeV beamforming (a delay-and-sum sketch follows this list).
  • Speaker localization.
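
The snippet below is a rough sketch of GCC-PHAT localization followed by delay-and-sum beamforming with the multi-microphone modules; the module names follow speechbrain.processing.multi_mic, the exact signatures may differ slightly across versions, and multi_mic.flac stands for any multi-channel recording:

from speechbrain.dataio.dataio import read_audio
from speechbrain.processing.features import STFT, ISTFT
from speechbrain.processing.multi_mic import Covariance, GccPhat, DelaySum

# Multi-channel signal with shape [time, channels]; add a batch dimension.
xs = read_audio("multi_mic.flac").unsqueeze(0)

stft = STFT(sample_rate=16000)
cov = Covariance()
gccphat = GccPhat()        # estimates time differences of arrival (TDOAs)
delaysum = DelaySum()
istft = ISTFT(sample_rate=16000)

Xs = stft(xs)              # [batch, time, 2, freq, channels]
tdoas = gccphat(cov(Xs))   # localize the dominant source
Ys = delaysum(Xs, tdoas)   # steer and sum the channels in the STFT domain
ys = istft(Ys)             # enhanced single-channel waveform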

Performance

The recipes released with SpeechBrain implement speech processing systems with competitive or state-of-the-art performance. In the following, we report the best performance achieved on some popular benchmarks:

| Dataset | Task | System | Performance |
|---------|------|--------|-------------|
| LibriSpeech | Speech Recognition | CNN + Transformer | WER=2.46% (test-clean) |
| TIMIT | Speech Recognition | CRDNN + distillation | PER=13.1% (test) |
| TIMIT | Speech Recognition | wav2vec2 + CTC/Att. | PER=8.04% (test) |
| CommonVoice (English) | Speech Recognition | wav2vec2 + CTC | WER=15.69% (test) |
| CommonVoice (French) | Speech Recognition | wav2vec2 + CTC | WER=9.96% (test) |
| CommonVoice (Italian) | Speech Recognition | wav2vec2 + seq2seq | WER=9.86% (test) |
| CommonVoice (Kinyarwanda) | Speech Recognition | wav2vec2 + seq2seq | WER=18.91% (test) |
| AISHELL (Mandarin) | Speech Recognition | wav2vec2 + seq2seq | CER=5.58% (test) |
| Fisher-Callhome (Spanish) | Speech Translation | conformer (ST + ASR) | BLEU=48.04 (test) |
| VoxCeleb2 | Speaker Verification | ECAPA-TDNN | EER=0.69% (vox1-test) |
| AMI | Speaker Diarization | ECAPA-TDNN | DER=3.01% (eval) |
| VoiceBank | Speech Enhancement | MetricGAN+ | PESQ=3.08 (test) |
| WSJ2MIX | Speech Separation | SepFormer | SDRi=22.6 dB (test) |
| WSJ3MIX | Speech Separation | SepFormer | SDRi=20.0 dB (test) |
| WHAM! | Speech Separation | SepFormer | SDRi=16.4 dB (test) |
| WHAMR! | Speech Separation | SepFormer | SDRi=14.0 dB (test) |
| Libri2Mix | Speech Separation | SepFormer | SDRi=20.6 dB (test-clean) |
| Libri3Mix | Speech Separation | SepFormer | SDRi=18.7 dB (test-clean) |
| LibriParty | Voice Activity Detection | CRDNN | F-score=0.9477 (test) |
| IEMOCAP | Emotion Recognition | wav2vec | Accuracy=79.8% (test) |
| CommonLanguage | Language Recognition | ECAPA-TDNN | Accuracy=84.9% (test) |
| Timers and Such | Spoken Language Understanding | CRDNN | Sentence Accuracy=89.2% (test) |

For more details, take a look at the corresponding implementation in recipes/<dataset>/.

Pretrained Models

Beyond providing recipes for training the models from scratch, SpeechBrain shares several pre-trained models (coupled with easy-inference functions) on HuggingFace. In the following, we report some of them:

| Task | Dataset | Model |
|------|---------|-------|
| Speech Recognition | LibriSpeech | CNN + Transformer |
| Speech Recognition | LibriSpeech | CRDNN |
| Speech Recognition | CommonVoice (English) | wav2vec + CTC |
| Speech Recognition | CommonVoice (French) | wav2vec + CTC |
| Speech Recognition | CommonVoice (Italian) | wav2vec + CTC |
| Speech Recognition | CommonVoice (Kinyarwanda) | wav2vec + CTC |
| Speech Recognition | AISHELL (Mandarin) | wav2vec + CTC |
| Speaker Recognition | VoxCeleb | ECAPA-TDNN |
| Speech Separation | WHAMR! | SepFormer |
| Speech Enhancement | VoiceBank | MetricGAN+ |
| Spoken Language Understanding | Timers and Such | CRDNN |
| Language Identification | CommonLanguage | ECAPA-TDNN |
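
All of these models follow the same easy-inference pattern. For instance, the MetricGAN+ enhancement model listed above can be used as follows, mirroring its HuggingFace model card (the noisy audio path is a placeholder):

import torch
import torchaudio
from speechbrain.pretrained import SpectralMaskEnhancement

enhancer = SpectralMaskEnhancement.from_hparams(
    source="speechbrain/metricgan-plus-voicebank",
    savedir="pretrained_models/metricgan-plus-voicebank",
)

# Load a noisy recording, add a batch dimension, and enhance it (16 kHz model).
noisy = enhancer.load_audio("noisy_example.wav").unsqueeze(0)
enhanced = enhancer.enhance_batch(noisy, lengths=torch.tensor([1.0]))
torchaudio.save("enhanced_example.wav", enhanced.cpu(), 16000)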

Documentation & Tutorials

SpeechBrain is designed to speed up the research and development of speech technologies. Hence, our code is backed by three different levels of documentation:

  • Low-level: during the review process of pull requests, we pay close attention to the quality of the comments. Hence, any complex functionality or long pipeline is supported by helpful comments, enabling users to handily customize the code.
  • Functional-level: all classes in SpeechBrain contain a detailed docstring that specifies the input and output formats, the different arguments, the usage of the function, the related bibliography (when relevant), and an example that is run as a test during pull requests. Such examples can also be used to experiment with a class or a function and understand exactly what is happening (a small illustration follows this list).
  • Educational-level: we provide various Google Colab (i.e. interactive) tutorials describing all the building-blocks of SpeechBrain ranging from the core of the toolkit to a specific model designed for a particular task. The number of available tutorials is expected to increase over time.
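
To give a flavor of the functional-level documentation, here is a hypothetical helper written in the SpeechBrain docstring style, with an Example block that doubles as a doctest:

def amplify(wav, gain=2.0):
    """Scales a batch of waveforms by a constant gain (hypothetical helper).

    Arguments
    ---------
    wav : torch.Tensor
        Batch of waveforms with shape [batch, time].
    gain : float
        Multiplicative amplification factor.

    Example
    -------
    >>> import torch
    >>> amplify(torch.ones(1, 4), gain=3.0)
    tensor([[3., 3., 3., 3.]])
    """
    return wav * gain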

Under development

We are currently working towards integrating DNN-HMM systems for speech recognition and machine translation.

Quick installation

SpeechBrain is constantly evolving. New features, tutorials, and documentation will appear over time. SpeechBrain can be installed via PyPI to rapidly use the standard library. Moreover, a local installation can be used by users who want to run experiments and modify/customize the toolkit. SpeechBrain supports both CPU and GPU computations. For most of the recipes, however, a GPU is necessary during training. Please note that CUDA must be properly installed to use GPUs.

Install via PyPI

Once you have created your Python environment (Python 3.8+) you can simply type:

pip install speechbrain

Then you can access SpeechBrain with:

import speechbrain as sb

Install with GitHub

Once you have created your Python environment (Python 3.8+) you can simply type:

git clone https://github.com/speechbrain/speechbrain.git
cd speechbrain
pip install -r requirements.txt
pip install --editable .

Then you can access SpeechBrain with:

import speechbrain as sb

Any modification made to the speechbrain package will be automatically reflected, since we installed it with the --editable flag.

Test Installation

Please run the following commands to make sure your installation is working:

pytest tests
pytest --doctest-modules speechbrain

Running an experiment

In SpeechBrain, you can run experiments in this way:

> cd recipes/<dataset>/<task>/
> python experiment.py params.yaml

The results will be saved in the output_folder specified in the yaml file. The folder is created by calling sb.core.create_experiment_directory() in experiment.py. Both detailed logs and experiment outputs are saved there. Furthermore, less verbose logs are output to stdout.

SpeechBrain Roadmap

As a community-based and open-source project, SpeechBrain needs the help of its community to grow in the right direction. Opening the roadmap to our users enables the toolkit to benefit from new ideas, new research axes, or even new technologies. The roadmap, available on our Discourse, lists all the changes and updates that need to be done in the current version of SpeechBrain. Users are more than welcome to propose new items via new Discourse topics!

Learning SpeechBrain

Instead of a long and boring README, we prefer to provide different resources that can be used to learn how to customize SpeechBrain and adapt it to your needs:

  • General information can be found on the website.
  • We offer many tutorials; you can start from the basic ones covering SpeechBrain's core functionalities and building blocks. We also provide more advanced tutorials (e.g., SpeechBrain advanced, signal processing, ...). You can browse them via the Tutorials drop-down menu on the SpeechBrain website (upper right).
  • Details on the SpeechBrain API, how to contribute, and the code are given in the documentation.

License

SpeechBrain is released under the Apache License, version 2.0. The Apache license is a popular BSD-like license. SpeechBrain can be redistributed for free, even for commercial purposes, although you cannot remove the license headers (and, under some circumstances, you may have to distribute a license document). Apache is not a viral license like the GPL, which forces you to release your modifications to the source code. Also note that this project has no connection to the Apache Foundation, other than that we use the same license terms.

Citing SpeechBrain

Please, cite SpeechBrain if you use it for your research or business.

@misc{speechbrain,
  title={{SpeechBrain}: A General-Purpose Speech Toolkit},
  author={Mirco Ravanelli and Titouan Parcollet and Peter Plantinga and Aku Rouhe and Samuele Cornell and Loren Lugosch and Cem Subakan and Nauman Dawalatabad and Abdelwahab Heba and Jianyuan Zhong and Ju-Chieh Chou and Sung-Lin Yeh and Szu-Wei Fu and Chien-Feng Liao and Elena Rastorgueva and François Grondin and William Aris and Hwidong Na and Yan Gao and Renato De Mori and Yoshua Bengio},
  year={2021},
  eprint={2106.04624},
  archivePrefix={arXiv},
  primaryClass={eess.AS},
  note={arXiv:2106.04624}
}
Comments
  • Add Transducer recipe


    Hello @mravanelli , @TParcollet , @jjery2243542 ,

    This is a work-in-progress transducer recipe; the following tasks are addressed:

    • [x] add transducer joint module
    • [x] REMOVED:add seq2seq bool in Brain class to handle the [x,y] input for the compute_forward function
    • [x] add embedding for the Prediction Network
    • [x] add greedy decoding
    • [x] Transducer minimal recipe
    • [x] add Transducer seq2seq recipe for TIMIT
    • [x] add comments to explain the greedy search over the transducer
    • [x] Add transducer recipe for Librispeech
    • [x] Find the good architecture with 14 % wer
    enhancement refactor ready to review 
    opened by aheba 73
  • use sentencepiece lib from google


    Add BPE tokenizer:

    • [x] add the BPE training
    • [x] use the BPE trained model for the token generation for Librispeech recipe
    • [x] Design the way of adding the BPE on the params (yaml file)
    enhancement ready to review 
    opened by aheba 52
  • Switchboard Recipe


    Hey everybody,

    I made a recipe for the Switchboard corpus. The data preparation steps mostly follow Kaldi's s5c recipe.

    The recipe includes the following models:

    ASR

    • CTC: Wav2Vec2 Encoder + CTC Decoder (adapted from the Commonvoice recipes)
    • seq2seq: CRDNN encoder + GRU Decoder + Attention (adapted from the LibriSpeech recipe)
      • Note: Unlike the Librispeech recipe, this system does not include any LM. In fact, every LM I tried (pretrained, finetuned or trained from scratch) seemed to make the performance much worse
    • transformer: Transformer model + LM (adapted from the LibriSpeech recipe)

    LM

    • There are two hparams files for finetuning existing LibriSpeech LMs on Switchboard and Fisher data, one for an RNNLM and the other for a Transformer LM

    Tokenizer

    • Basic Sentencepiece Tokenizer training on Switchboard and Fisher data

    Performance

    The model performance is as follows:

    | Model | Swbd WER | Callhome WER | Eval2000 WER |
    |:---:|:---:|:---:|:---:|
    | CTC | 21.35 | 28.32 | 24.91 |
    | seq2seq | 25.37 | 36.87 | 29.33 |
    | Transformer (LibriSpeech LM) | 22.00 | 30.12 | 26.14 |
    | Transformer (Finetuned LM) | 21.11 | 29.43 | 25.36 |

    As you can see, the performance is currently comparable to Kaldi's chain systems without i-vectors. However, they need some refinement to be on par with the best Kaldi systems available (WER should be around 18 on the full eval2000 testset).

    If you have any suggestions for improvements, I'd be happy to implement them.

    I can also provide the trained models in case you are interested (I might need some help with this whole Huggingface thing though).

    Best, Dominik

    ps Thanks for all the great work you've done here! :)

    enhancement 
    opened by dwgnr 50
  • handle the use of multigpu_{count,backend}


    Hey @pplantinga , @mravanelli , here is a PR fixing issue #395 . As discussed, multigpu_{count, backend} are not used in our ddp.py; currently, multigpu_{count, backend} is only used in the hyperparams file with data_parallel. This PR handles the use of multigpu_{count, backend} by DDP.py. If the user sets these params on the command line, the params in the yaml file are ignored.

    help wanted work in progress ready to review 
    opened by aheba 50
  • add noise and reverberance version for BinauralWSJ0Mix


    Hi there, I have created a noisy and reverberant version of the BinauralWSJ0Mix datasets and trained it with the convtasnet-parallel structure. Here are the recipes; they do not conflict with the clean version of the datasets. Also, I have trained convtasnet-parallel.yaml again and got better results, which I could share with you via Google Drive. Thanks.

    opened by huangzj421 43
  • Aishell1Mix


    This branch adds a new task named Aishell1Mix to the recipes, which is similar to LibriMix but applied to the Mandarin AISHELL-1 dataset. Hope to receive your reply. Much thanks.

    enhancement 
    opened by huangzj421 42
  • training on voxceleb1+2 is very slow?


    Dear all: I noticed that when training on voxceleb1+2, it takes up to 25 hours for a single epoch, and even with ddp on 4 gpu cards, the training speed does not improve at all. I guess the cpu is the bottleneck? Has anyone seen the same phenomenon? Thank you.

    7%|████████▎                                        | 16569/241547 [1:45:07<25:09:56,  2.48it/s, train_loss=13
    
    question 
    opened by dragen1860 35
  • Insertion problem when decoding with pre-trained ASR model.


    Thanks for the clear example in the folder templates/speech_recognition/ASR/ for training an ASR model on the mini-librispeech dataset. However, when I used the librispeech-pretrained model (ASR model, language model and tokenizer) to decode some waveforms from the librispeech test dataset, the decoding result repeats some of the words many times and causes severe insertion errors. Below are several examples:

    1221-135766-0014, %WER 2436.36 [ 268 / 11, 268 ins, 0 del, 0 sub ]
    PEARL ; SAW ; AND ; GAZED ; INTENTLY ; BUT ; NEVER ; SOUGHT ; TO ; MAKE ; ACQUAINTANCE ; <eps> ; <eps> ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ;    <eps>     ; <eps> ; <eps> ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ;    <eps>     ; <eps> ; <eps> ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ;    <eps>     ; <eps> ; <eps> ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ;    <eps>     ; <eps> ; <eps> ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ;    <eps>     ; <eps> ; <eps> ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ;    <eps>     ; <eps> ; <eps> ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ;    <eps>     ; <eps> ; <eps> ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ;    <eps>     ; <eps> ; <eps> ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ;    <eps>     ; <eps> ; <eps> ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ;    <eps>     ; <eps> ; <eps> ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ;    <eps>     ; <eps> ; <eps> ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ;    <eps>     ; <eps> ; <eps> ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ;    <eps>     ; <eps> ; <eps> ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ;    <eps>     ; <eps> ; <eps> ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ;    <eps>     ; <eps> ; <eps> ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ;    <eps>     ; <eps> ; <eps> ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ;    <eps>     ; <eps> ; <eps> ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ;    <eps>     ; <eps> ; <eps> ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ;    <eps>     ; <eps> ; <eps> ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ;    <eps>     ; <eps> ; <eps> ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ;    <eps>     ; <eps> ; <eps> ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ;    <eps>     ; <eps> ; <eps> ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ;    <eps>     ; <eps> ; <eps> ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ;    <eps>     ; <eps> ; <eps> ; <eps> ; <eps>
      =   ;  =  ;  =  ;   =   ;    =     ;  =  ;   =   ;   =    ; =  ;  =   ;      =       ;   I   ;   I   ;   I   ;   I   ;    I     ;   I   ;   I   ;   I    ;   I   ;   I   ;      I       ;   I   ;   I   ;   I   ;   I   ;    I     ;   I   ;   I   ;   I    ;   I   ;   I   ;      I       ;   I   ;   I   ;   I   ;   I   ;    I     ;   I   ;   I   ;   I    ;   I   ;   I   ;      I       ;   I   ;   I   ;   I   ;   I   ;    I     ;   I   ;   I   ;   I    ;   I   ;   I   ;      I       ;   I   ;   I   ;   I   ;   I   ;    I     ;   I   ;   I   ;   I    ;   I   ;   I   ;      I       ;   I   ;   I   ;   I   ;   I   ;    I     ;   I   ;   I   ;   I    ;   I   ;   I   ;      I       ;   I   ;   I   ;   I   ;   I   ;    I     ;   I   ;   I   ;   I    ;   I   ;   I   ;      I       ;   I   ;   I   ;   I   ;   I   ;    I     ;   I   ;   I   ;   I    ;   I   ;   I   ;      I       ;   I   ;   I   ;   I   ;   I   ;    I     ;   I   ;   I   ;   I    ;   I   ;   I   ;      I       ;   I   ;   I   ;   I   ;   I   ;    I     ;   I   ;   I   ;   I    ;   I   ;   I   ;      I       ;   I   ;   I   ;   I   ;   I   ;    I     ;   I   ;   I   ;   I    ;   I   ;   I   ;      I       ;   I   ;   I   ;   I   ;   I   ;    I     ;   I   ;   I   ;   I    ;   I   ;   I   ;      I       ;   I   ;   I   ;   I   ;   I   ;    I     ;   I   ;   I   ;   I    ;   I   ;   I   ;      I       ;   I   ;   I   ;   I   ;   I   ;    I     ;   I   ;   I   ;   I    ;   I   ;   I   ;      I       ;   I   ;   I   ;   I   ;   I   ;    I     ;   I   ;   I   ;   I    ;   I   ;   I   ;      I       ;   I   ;   I   ;   I   ;   I   ;    I     ;   I   ;   I   ;   I    ;   I   ;   I   ;      I       ;   I   ;   I   ;   I   ;   I   ;    I     ;   I   ;   I   ;   I    ;   I   ;   I   ;      I       ;   I   ;   I   ;   I   ;   I   ;    I     ;   I   ;   I   ;   I    ;   I   ;   I   ;      I       ;   I   ;   I   ;   I   ;   I   ;    I     ;   I   ;   I   ;   I    ;   I   ;   I   ;      I       ;   I   ;   I   ;   I   ;   I   ;    I     ;   I   ;   I   ;   I    ;   I   ;   I   ;      I       ;   I   ;   I   ;   I   ;   I   ;    I     ;   I   ;   I   ;   I    ;   I   ;   I   ;      I       ;   I   ;   I   ;   I   ;   I   ;    I     ;   I   ;   I   ;   I    ;   I   ;   I   ;      I       ;   I   ;   I   ;   I   ;   I   ;    I     ;   I   ;   I   ;   I    ;   I   ;   I   ;      I       ;   I   ;   I   ;   I   ;   I   ;    I     ;   I   ;   I   ;   I    ;   I   ;   I   ;      I       ;   I   ;   I   ;   I   ;   I  
    PEARL ; SAW ; AND ; GAZED ; INTENTLY ; BUT ; NEVER ; SOUGHT ; TO ; MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED ; INTENTLY ;  BUT  ; NEVER ; SOUGHT ;   TO  ;  MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED ; INTENTLY ;  BUT  ; NEVER ; SOUGHT ;   TO  ;  MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED ; INTENTLY ;  BUT  ; NEVER ; SOUGHT ;   TO  ;  MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED ; INTENTLY ;  BUT  ; NEVER ; SOUGHT ;   TO  ;  MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED ; INTENTLY ;  BUT  ; NEVER ; SOUGHT ;   TO  ;  MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED ; INTENTLY ;  BUT  ; NEVER ; SOUGHT ;   TO  ;  MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED ; INTENTLY ;  BUT  ; NEVER ; SOUGHT ;   TO  ;  MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED ; INTENTLY ;  BUT  ; NEVER ; SOUGHT ;   TO  ;  MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED ; INTENTLY ;  BUT  ; NEVER ; SOUGHT ;   TO  ;  MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED ; INTENTLY ;  BUT  ; NEVER ; SOUGHT ;   TO  ;  MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED ; INTENTLY ;  BUT  ; NEVER ; SOUGHT ;   TO  ;  MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED ; INTENTLY ;  BUT  ; NEVER ; SOUGHT ;   TO  ;  MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED ; INTENTLY ;  BUT  ; NEVER ; SOUGHT ;   TO  ;  MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED ; INTENTLY ;  BUT  ; NEVER ; SOUGHT ;   TO  ;  MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED ; INTENTLY ;  BUT  ; NEVER ; SOUGHT ;   TO  ;  MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED ; INTENTLY ;  BUT  ; NEVER ; SOUGHT ;   TO  ;  MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED ; INTENTLY ;  BUT  ; NEVER ; SOUGHT ;   TO  ;  MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED ; INTENTLY ;  BUT  ; NEVER ; SOUGHT ;   TO  ;  MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED ; INTENTLY ;  BUT  ; NEVER ; SOUGHT ;   TO  ;  MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED ; INTENTLY ;  BUT  ; NEVER ; SOUGHT ;   TO  ;  MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED ; INTENTLY ;  BUT  ; NEVER ; SOUGHT ;   TO  ;  MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED ; INTENTLY ;  BUT  ; NEVER ; SOUGHT ;   TO  ;  MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED ; INTENTLY ;  BUT  ; NEVER ; SOUGHT ;   TO  ;  MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED ; INTENTLY ;  BUT  ; NEVER ; SOUGHT ;   TO  ;  MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED
    
    121-123859-0001, %WER 869.81 [ 461 / 53, 454 ins, 0 del, 7 sub ]
    O  ; TIS ; THE ; FIRST  ; TIS ; FLATTERY ; IN ; MY ; SEEING ; AND ; MY ; GREAT ; MIND ; MOST ; KINGLY ; DRINKS ; IT ; UP ; MINE ; EYE ; WELL ; KNOWS ; WHAT ; WITH ; HIS ; GUST ; IS ; GREEING ; AND ; TO ; HIS ; PALATE ; DOTH ; PREPARE ; THE ; CUP ; IF ; IT ; BE ; POISON'D ; TIS ; THE ; LESSER ; SIN ; THAT ; MINE ; EYE ; LOVES ; IT ; AND ; DOTH ; <eps>  ; <eps>  ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; FIRST ; BEGIN ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; 
<eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps>
    S  ;  =  ;  =  ;   S    ;  =  ;    =     ; =  ; =  ;   =    ;  =  ; =  ;   =   ;  =   ;  =   ;   S    ;   =    ; =  ; =  ;  =   ;  =  ;  =   ;   =   ;  =   ;  =   ;  =  ;  =   ; =  ;    S    ;  =  ; =  ;  =  ;   =    ;  =   ;    =    ;  =  ;  =  ; =  ; =  ; =  ;    S     ;  =  ;  =  ;   =    ;  =  ;  =   ;  =   ;  S  ;   S   ; =  ;  =  ;  =   ;   I    ;   I    ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   =   ;   =   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I    ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I    ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ; 
  I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I  
    OH ; TIS ; THE ; THIRST ; TIS ; FLATTERY ; IN ; MY ; SEEING ; AND ; MY ; GREAT ; MIND ; MOST ; KEENLY ; DRINKS ; IT ; UP ; MINE ; EYE ; WELL ; KNOWS ; WHAT ; WITH ; HIS ; GUST ; IS ;  GREEN  ; AND ; TO ; HIS ; PALATE ; DOTH ; PREPARE ; THE ; CUP ; IF ; IT ; BE ; POISONED ; TIS ; THE ; LESSER ; SIN ; THAT ; MINE ;  I  ;  LOVE ; IT ; AND ; DOTH ; THIRST ; BEGINS ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGINS ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGINS ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ; 
  I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE
    
    1284-134647-0001, %WER 707.41 [ 191 / 27, 191 ins, 0 del, 0 sub ]
    THE ; EDICT ; OF ; MILAN ; THE ; GREAT ; CHARTER ; OF ; TOLERATION ; HAD ; CONFIRMED ; TO ; EACH ; INDIVIDUAL ; OF ; THE ; ROMAN ; WORLD ; THE ; PRIVILEGE ; OF ; CHOOSING ; AND ; PROFESSING ; HIS ; OWN ; RELIGION ; <eps> ; <eps> ;   <eps>   ; <eps> ; <eps> ;   <eps>    ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ;   <eps>   ; <eps> ;  <eps>   ; <eps> ;   <eps>    ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ;   <eps>   ; <eps> ; <eps> ;   <eps>    ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ;   <eps>   ; <eps> ;  <eps>   ; <eps> ;   <eps>    ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ;   <eps>   ; <eps> ; <eps> ;   <eps>    ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ;   <eps>   ; <eps> ;  <eps>   ; <eps> ;   <eps>    ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ;   <eps>   ; <eps> ; <eps> ;   <eps>    ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ;   <eps>   ; <eps> ;  <eps>   ; <eps> ;   <eps>    ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ;   <eps>   ; <eps> ; <eps> ;   <eps>    ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ;   <eps>   ; <eps> ;  <eps>   ; <eps> ;   <eps>    ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ;   <eps>   ; <eps> ; <eps> ;   <eps>    ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ;   <eps>   ; <eps> ;  <eps>   ; <eps> ;   <eps>    ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ;   <eps>   ; <eps> ; <eps> ;   <eps>    ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ;   <eps>   ; <eps> ;  <eps>   ; <eps> ;   <eps>    ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ;   <eps>   ; <eps> ; <eps> ;   <eps>    ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ;   <eps>   ; <eps> ;  <eps>   ; <eps> ;   <eps>    ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ;   <eps>   ; <eps> ; <eps> ;   <eps>    ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ;   <eps>   ; <eps> ;  <eps>   ; <eps> ;   <eps>    ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ;   <eps>   ; <eps> ; <eps> ;   <eps>    ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ;   <eps>   ; <eps> ;  <eps>   ; <eps> ;   <eps>    ; <eps> ; <eps> ;  <eps>   ; <eps>
     =  ;   =   ; =  ;   =   ;  =  ;   =   ;    =    ; =  ;     =      ;  =  ;     =     ; =  ;  =   ;     =      ; =  ;  =  ;   =   ;   =   ;  =  ;     =     ; =  ;    =     ;  =  ;     =      ;  =  ;  =  ;    =     ;   I   ;   I   ;     I     ;   I   ;   I   ;     I      ;   I   ;   I   ;   I   ;   I   ;   I   ;     I     ;   I   ;    I     ;   I   ;     I      ;   I   ;   I   ;    I     ;   I   ;   I   ;     I     ;   I   ;   I   ;     I      ;   I   ;   I   ;   I   ;   I   ;   I   ;     I     ;   I   ;    I     ;   I   ;     I      ;   I   ;   I   ;    I     ;   I   ;   I   ;     I     ;   I   ;   I   ;     I      ;   I   ;   I   ;   I   ;   I   ;   I   ;     I     ;   I   ;    I     ;   I   ;     I      ;   I   ;   I   ;    I     ;   I   ;   I   ;     I     ;   I   ;   I   ;     I      ;   I   ;   I   ;   I   ;   I   ;   I   ;     I     ;   I   ;    I     ;   I   ;     I      ;   I   ;   I   ;    I     ;   I   ;   I   ;     I     ;   I   ;   I   ;     I      ;   I   ;   I   ;   I   ;   I   ;   I   ;     I     ;   I   ;    I     ;   I   ;     I      ;   I   ;   I   ;    I     ;   I   ;   I   ;     I     ;   I   ;   I   ;     I      ;   I   ;   I   ;   I   ;   I   ;   I   ;     I     ;   I   ;    I     ;   I   ;     I      ;   I   ;   I   ;    I     ;   I   ;   I   ;     I     ;   I   ;   I   ;     I      ;   I   ;   I   ;   I   ;   I   ;   I   ;     I     ;   I   ;    I     ;   I   ;     I      ;   I   ;   I   ;    I     ;   I   ;   I   ;     I     ;   I   ;   I   ;     I      ;   I   ;   I   ;   I   ;   I   ;   I   ;     I     ;   I   ;    I     ;   I   ;     I      ;   I   ;   I   ;    I     ;   I   ;   I   ;     I     ;   I   ;   I   ;     I      ;   I   ;   I   ;   I   ;   I   ;   I   ;     I     ;   I   ;    I     ;   I   ;     I      ;   I   ;   I   ;    I     ;   I   ;   I   ;     I     ;   I   ;   I   ;     I      ;   I   ;   I   ;   I   ;   I   ;   I   ;     I     ;   I   ;    I     ;   I   ;     I      ;   I   ;   I   ;    I     ;   I  
    THE ; EDICT ; OF ; MILAN ; THE ; GREAT ; CHARTER ; OF ; TOLERATION ; HAD ; CONFIRMED ; TO ; EACH ; INDIVIDUAL ; OF ; THE ; ROMAN ; WORLD ; THE ; PRIVILEGE ; OF ; CHOOSING ; AND ; PROFESSING ; HIS ; OWN ; RELIGION ;   HE  ;  HAD  ; CONFIRMED ;   TO  ;  EACH ; INDIVIDUAL ;   OF  ;  THE  ; ROMAN ; WORLD ;  THE  ; PRIVILEGE ;   OF  ; CHOOSING ;  AND  ; PROFESSING ;  HIS  ;  OWN  ; RELIGION ;   HE  ;  HAD  ; CONFIRMED ;   TO  ;  EACH ; INDIVIDUAL ;   OF  ;  THE  ; ROMAN ; WORLD ;  THE  ; PRIVILEGE ;   OF  ; CHOOSING ;  AND  ; PROFESSING ;  HIS  ;  OWN  ; RELIGION ;   HE  ;  HAD  ; CONFIRMED ;   TO  ;  EACH ; INDIVIDUAL ;   OF  ;  THE  ; ROMAN ; WORLD ;  THE  ; PRIVILEGE ;   OF  ; CHOOSING ;  AND  ; PROFESSING ;  HIS  ;  OWN  ; RELIGION ;   HE  ;  HAD  ; CONFIRMED ;   TO  ;  EACH ; INDIVIDUAL ;   OF  ;  THE  ; ROMAN ; WORLD ;  THE  ; PRIVILEGE ;   OF  ; CHOOSING ;  AND  ; PROFESSING ;  HIS  ;  OWN  ; RELIGION ;   HE  ;  HAD  ; CONFIRMED ;   TO  ;  EACH ; INDIVIDUAL ;   OF  ;  THE  ; ROMAN ; WORLD ;  THE  ; PRIVILEGE ;   OF  ; CHOOSING ;  AND  ; PROFESSING ;  HIS  ;  OWN  ; RELIGION ;   HE  ;  HAD  ; CONFIRMED ;   TO  ;  EACH ; INDIVIDUAL ;   OF  ;  THE  ; ROMAN ; WORLD ;  THE  ; PRIVILEGE ;   OF  ; CHOOSING ;  AND  ; PROFESSING ;  HIS  ;  OWN  ; RELIGION ;   HE  ;  HAD  ; CONFIRMED ;   TO  ;  EACH ; INDIVIDUAL ;   OF  ;  THE  ; ROMAN ; WORLD ;  THE  ; PRIVILEGE ;   OF  ; CHOOSING ;  AND  ; PROFESSING ;  HIS  ;  OWN  ; RELIGION ;   HE  ;  HAD  ; CONFIRMED ;   TO  ;  EACH ; INDIVIDUAL ;   OF  ;  THE  ; ROMAN ; WORLD ;  THE  ; PRIVILEGE ;   OF  ; CHOOSING ;  AND  ; PROFESSING ;  HIS  ;  OWN  ; RELIGION ;   HE  ;  HAD  ; CONFIRMED ;   TO  ;  EACH ; INDIVIDUAL ;   OF  ;  THE  ; ROMAN ; WORLD ;  THE  ; PRIVILEGE ;   OF  ; CHOOSING ;  AND  ; PROFESSING ;  HIS  ;  OWN  ; RELIGION ;   HE  ;  HAD  ; CONFIRMED ;   TO  ;  EACH ; INDIVIDUAL ;   OF  ;  THE  ; ROMAN ; WORLD ;  THE  ; PRIVILEGE ;   OF  ; CHOOSING ;  AND  ; PROFESSING ;  HIS  ;  OWN  ; RELIGION ;  THE 
    

    The dataset I tested on is part of the librispeech test-clean dataset (reader ids beginning with 1, 2 and 3; 1074 files in total), and the average WER on this dataset is 20.3%. Below are the hparams I used for beam search:

     test_search: !new:speechbrain.decoders.S2SRNNBeamSearchLM
        embedding: !ref <embedding>
        decoder: !ref <decoder>
        linear: !ref <seq_lin>
        ctc_linear: !ref <ctc_lin>
        language_model: !ref <lm_model>
        bos_index: 0
        eos_index: 0
        blank_index: 0
        min_decode_ratio: 0.0
        max_decode_ratio: 1.0
        beam_size: 80
        eos_threshold: 1.5
        using_max_attn_shift: true
        max_attn_shift: 240
        coverage_penalty: 1.5
        lm_weight: 0.5
        ctc_weight: 0.0
        temperature: 1.25
        temperature_lm: 1.25
    

    I also found that if I change the testing batch_size from 8 to 1, the WER can be reduced from 20.3% to 2.8%, which I believe should be the normal result. I am thus wondering whether the padding might be the main reason for this problem.

    opened by Kuray107 31
  • LM decoder and training for TIMIT


    Modifications:

    1. Add length normalization for beam search.
    2. Rename length penalty to length rewarding (beam search).
    3. Integrate LM in the decoder.
    4. Add recipe for LM and ASR with LM decoding.
    work in progress ready to review 
    opened by jjery2243542 31
  • Can't train a model with multi NVIDIA RTX 3090 GPUs.


    OS: Ubuntu 20.04
    Python: I tested both 3.7 and 3.8
    SpeechBrain: I tested 0.5.8 and 0.5.9
    PyTorch: 1.7.0 for SpeechBrain 0.5.8 and 1.9.0 for SpeechBrain 0.5.9, both compiled with CUDA 11.1
    Recipe: speechbrain/recipes/LibriSpeech/ASR/transformer

    command: python train.py hparams/transformer.yaml --data_folder xxx --data_parallel_backend

    I have 8 3090 GPUs on my server. But when I watched nvidia-smi, there was only one GPU process running on one GPU; the other 7 GPUs were idle. So how can I fix this problem? Thank you.

    opened by Xinghui-Wu 28
  • MultiGPU + Librispeech


    Adding Multi-GPU training to the Librispeech recipe.

    1. Change the logging to info on the libri preparation. Without that, the user has NO feedback on what is happening, and it's actually weird.
    2. Add multi GPU with data parallel to experiment.py
    3. Add a multigpu param to the yaml file

    To do:

    • [x] Test the recipe on 1-2 GPUs
    • [x] Test that the checkpointing doesn't break due to DataParallel when going from one to two GPUs and from two to one

    enhancement ready to review 
    opened by TParcollet 27
  • [WIP] Streamable Voice Activity Detection


    Integrate streamable Voice Activity Detection with a script to run on a laptop via ffmpeg.

    Missing:

    • [ ] choose model, train and deploy on HF Hub;
    • [ ] test VAD_stream and perform last consistency checks;
    • [ ] update README.md
    opened by fpaissan 0
  • [Bug]: Training hifigan on ljspeech results in FileNotFoundError for train.json


    Describe the bug

    When I start the hifigan training on ljspeech I get the error FileNotFoundError: [Errno 2] No such file or directory: './results/hifi_gan/1234/save/train.json'

    I looked for the train.json and could not find it. I guess it should be created by the ljspeech_prepare.py script but it is not.

    Expected behaviour

    I expect the train.json to be created automatically when I start the training.

    To Reproduce

    No response

    Versions

    No response

    Relevant log output

    No response

    Additional context

    No response

    bug 
    opened by padmalcom 4
  • [Bug]: Exporting Tacotron2 into onnx file


    Describe the bug

    Hello,

    I am trying to export Tacotron2 to an onnx file. Following the PyTorch documentation, I have chosen to use the script() function. Unfortunately, this does not work and shows me an error.

    I am working with Python 3.9.15 in a conda environment.

    Please, can you tell me if I am doing something wrong or if some operations are not compatible with onnx export?

    Best regards, Mathias.

    Expected behaviour

    I am expecting the generation of an onnx file when I am using torch.onnx.export.

    To Reproduce

    import torch
    from speechbrain.pretrained import Tacotron2
    
    tacotron2 = Tacotron2.from_hparams(source="speechbrain/tts-tacotron2-ljspeech", savedir="tmpdir_tts")
    scriptModule = torch.jit.script(tacotron2)
    torch.onnx.export(scriptModule, ["hello"], "tacotron2.onnx", verbose=True)
    

    Versions

    huggingface-hub==0.11.1 numpy==1.24.0 nvidia-cublas-cu11==11.10.3.66 nvidia-cuda-nvrtc-cu11==11.7.99 nvidia-cuda-runtime-cu11==11.7.99 nvidia-cudnn-cu11==8.5.0.96 PyYAML==6.0 scipy==1.9.3 speechbrain==0.5.13 torch==1.13.1 torchaudio==0.13.1

    Relevant log output

    Traceback (most recent call last):
      File "/home/mquillot/TTS_experiment/sb_experiment.py", line 26, in <module>
        scriptModule = torch.jit.script(tacotron2)
      File "/home/mquillot/speechbrain/lib/python3.9/site-packages/torch/jit/_script.py", line 1286, in script
        return torch.jit._recursive.create_script_module(
      File "/home/mquillot/speechbrain/lib/python3.9/site-packages/torch/jit/_recursive.py", line 476, in create_script_module
        return create_script_module_impl(nn_module, concrete_type, stubs_fn)
      File "/home/mquillot/speechbrain/lib/python3.9/site-packages/torch/jit/_recursive.py", line 538, in create_script_module_impl
        script_module = torch.jit.RecursiveScriptModule._construct(cpp_module, init_fn)
      File "/home/mquillot/speechbrain/lib/python3.9/site-packages/torch/jit/_script.py", line 615, in _construct
        init_fn(script_module)
      File "/home/mquillot/speechbrain/lib/python3.9/site-packages/torch/jit/_recursive.py", line 516, in init_fn
        scripted = create_script_module_impl(orig_value, sub_concrete_type, stubs_fn)
      File "/home/mquillot/speechbrain/lib/python3.9/site-packages/torch/jit/_recursive.py", line 538, in create_script_module_impl
        script_module = torch.jit.RecursiveScriptModule._construct(cpp_module, init_fn)
      File "/home/mquillot/speechbrain/lib/python3.9/site-packages/torch/jit/_script.py", line 615, in _construct
        init_fn(script_module)
      File "/home/mquillot/speechbrain/lib/python3.9/site-packages/torch/jit/_recursive.py", line 516, in init_fn
        scripted = create_script_module_impl(orig_value, sub_concrete_type, stubs_fn)
      File "/home/mquillot/speechbrain/lib/python3.9/site-packages/torch/jit/_recursive.py", line 538, in create_script_module_impl
        script_module = torch.jit.RecursiveScriptModule._construct(cpp_module, init_fn)
      File "/home/mquillot/speechbrain/lib/python3.9/site-packages/torch/jit/_script.py", line 615, in _construct
        init_fn(script_module)
      File "/home/mquillot/speechbrain/lib/python3.9/site-packages/torch/jit/_recursive.py", line 516, in init_fn
        scripted = create_script_module_impl(orig_value, sub_concrete_type, stubs_fn)
      File "/home/mquillot/speechbrain/lib/python3.9/site-packages/torch/jit/_recursive.py", line 542, in create_script_module_impl
        create_methods_and_properties_from_stubs(concrete_type, method_stubs, property_stubs)
      File "/home/mquillot/speechbrain/lib/python3.9/site-packages/torch/jit/_recursive.py", line 393, in create_methods_and_properties_from_stubs
        concrete_type._create_methods_and_properties(property_defs, property_rcbs, method_defs, method_rcbs, method_defaults)
    RuntimeError: Unsupported value kind: Tensor
    

    Additional context

    No response

    bug 
    opened by mquillot 1
  • [Bug]: Implementation of CumulativeLayerNorm


    Describe the bug

    The implementation of CumulativeLayerNorm seems to be channel (or time-frame)-wise normalization instead of accumulating the information on past frames.

    Expected behaviour

    "ChannelwiseLayerNorm (cLN)" as in ESPnet might be more accurate name.

    To Reproduce

    No response

    Versions

    No response

    Relevant log output

    No response

    Additional context

    No response

    bug 
    opened by YoshikiMas 0
  • add whisper normalization on training


    Hi :D

    In the present Whisper finetuning implementation, we train with raw text (no normalisation) and then validate and test using Whisper normalisation. This is a small adjustment to fine-tune the model on Whisper-normalised text, since encode only performs tokenisation and not normalisation:

    
    from transformers.models.whisper.tokenization_whisper import WhisperTokenizer
    
    test = "hello i have fifty two dollars"
    
    tokenizer = WhisperTokenizer.from_pretrained("openai/whisper-base")
    print(tokenizer._normalize(test))
    print(tokenizer.decode(tokenizer.encode(test)))
    print(tokenizer.decode(tokenizer.encode(tokenizer._normalize(test))))
    
    #print outputs
    hello i have $52
    <|startoftranscript|><|notimestamps|>hello i have fifty two dollars<|endoftext|>
    <|startoftranscript|><|notimestamps|>hello i have $52<|endoftext|>
    
    opened by Moumeneb1 3
Releases(v0.5.13)
  • v0.5.13(Aug 29, 2022)

    This is a minor release with better dependency version specification. We note that SpeechBrain is compatible with PyTorch 1.12, and the updated package reflects this. See the issue linked next to each commit for more details about the corresponding changes.

    Commit summary

    • [edb7714]: Adding no_sync and on_fit_batch_end method to core (Rudolf Arseni Braun) #1449
    • [07155e9]: G2P fixes (flexthink) #1473
    • [6602dab]: fix for #1469, minimal testing for profiling (anautsch) #1476
    • [abbfab9]: test clean-ups: passes linters; doctests; unit & integration tests; load-yaml on cpu (anautsch) #1487
    • [1a16b41]: fix ddp incorrect command (=) #1498
    • [0b0ec9d]: using no_sync() in fit_batch() of core.py (Rudolf Arseni Braun) #1449
    • [5c9b833]: Remove torch maximum compatible version (Peter Plantinga) #1504
    • [d0f4352]: remove limit for HF hub as it does not work with colab (Titouan) #1508
    • [b78f6f8]: Add revision to hub (Titouan) #1510
    • [2c491a4]: fix transducer loss inputs devices (Adel Moumen) #1511
    • [4972f76]: missing space in install command (pehonnet) #1512
    • [6bc72af]: Fixing shuffle argument for distributed sampler in core.py (Rudolf Arseni Braun) #1518
    • [df7acd9]: Added the link for example results (cem) #1523
    • [5bae6df]: add LinearWarmupScheduler (Ge Li) #1537
    • [2edd7ee]: updating scipy version in requirements.txt. (Nauman Dawalatabad) #1546
    Source code(tar.gz)
    Source code(zip)
  • v0.5.12(Jun 26, 2022)

    Release Notes - SpeechBrain v0.5.12

    We worked very hard and we are very happy to announce the new version of SpeechBrain!

    SpeechBrain 0.5.12 significantly expands the toolkit without introducing any major interface changes. I would like to warmly thank the many contributors that made this possible.

    The main changes are the following:

    A) Text-to-Speech: We developed the first TTS system of SpeechBrain. You can find it here. The system relies on Tacotron2 + HiFiGAN (as vocoder). The models coupled with an easy-inference interface are available on HuggingFace.

    B) Grapheme-to-Phoneme (G2P): We developed an advanced G2P system. You can find the code here. The current version significantly outperforms our previous model.

    C) Speech Separation:

    1. We developed a novel version of the SepFormer called Resource-Efficient SepFormer (RE-Sepformer). The code is available here and the pre-trained model (with an easy inference interface) here.
    2. We released a recipe for Binaural speech separation with WSJMix. See the code here.
    3. We released a new recipe with the AIShell mix dataset. You can see the code here.

    D) Speech Enhancement:

    1. We released the SepFormer model for speech enhancement. The code is here, while the pre-trained model (with easy-inference interface) is here.
    2. We implemented a WideResNet for speech enhancement and used it for mimic-loss-based speech enhancement. The code is here and the pretrained model (with easy-inference interface) is here.

    E) Feature Front-ends:

    1. We now support LEAF filter banks. The code is here. You can find an example of a recipe using it here.
    2. We now support SincConv multichannel (see code here).

    F) Recipe Refactors:

    1. We refactored the Voxceleb recipe and fixed the normalization issues. See the new code here. We also made the EER computation less memory-demanding (see here).
    2. We refactored the IEMOCAP recipe for emotion recognition. See the new code here.

    G) Models for African Languages: We now have recipes for the DVoice dataset. We currently support Darija, Swahili, Wolof, Fongbe, and Amharic. The code is available here. The pretrained model (coupled with an easy-inference interface) can be found on SpeechBrain-HuggingFace.

    H) Profiler: We implemented a model profiler that helps users while developing new models with SpeechBrain. The profiler outputs a bunch of potentially useful information, such as the real-time factors and many other details. A tutorial is available here.

    I) Tests: We significantly improved the tests. In particular, we introduced the following tests: HF_repo tests, docstring checks, yaml-script consistency, recipe tests, and URL checks. This will help us scale up the project.

    L) Other improvements:

    1. We now support the torchaudio RNNT loss*.
    2. We improved the relative attention mechanism of the Conformer.
    3. We updated the transformer for LibriSpeech. This improves performance on test-clean from WER = 2.46% to 2.26%. See the code here.
    4. The Environmental corruption module can now support different sampling rates.
    5. Minor fixes.
    Source code(tar.gz)
    Source code(zip)
  • v0.5.11(Dec 20, 2021)

    Dear users, we worked very hard, and we are very happy to announce the new version of SpeechBrain. SpeechBrain 0.5.11 further expands the toolkit without introducing any major interface changes.

    The main changes are the following:

    1. We implemented several new recipes.

    2. Support for dynamic batching, with a tutorial to help users familiarize themselves with it.

    3. Support for wav2vec training within SpeechBrain.

    4. We developed an interface with Orion for hyperparameter tuning, with a tutorial to help users familiarize themselves with it.

    5. The torchaudio transducer loss is now supported. We also kept our Numba implementation to help users customize the transducer loss part if needed.

    6. Improved CTC segmentation.

    7. Fixed minor bugs and issues (e.g., fixed the MVDR beamformer).

    Let me thank all the amazing contributors for this achievement. Please add a star to our project if you appreciate our effort for the community. Together, we are growing very fast, and we have big plans for the future.

    Stay Tuned!

    Source code(tar.gz)
    Source code(zip)
  • 0.5.10(Sep 11, 2021)

    This version mainly expands the functionality of SpeechBrain without introducing any backward incompatibilities.

    New Recipes:

    • Language Identification with CommonLanguage
    • EEG signal processing with ERPCore
    • Speech translation with Fisher-Call Home
    • Emotion Recognition with IEMOCAP
    • Voice Activity Detection with LibriParty (a minimal inference sketch follows this list)
    • ASR with LibriSpeech wav2vec (WER=1.9 on test-clean)
    • SpeechEnhancement with CoopNet
    • SpeechEnhancement with SEGAN
    • Speech Separation with LibriMix, WHAM, and WHAMR
    • Support for guided attention
    • Spoken Language Understanding with SLURP
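    For example, the LibriParty VAD model ships with an easy-inference interface. Below is a minimal sketch; the model identifier is taken from the SpeechBrain HuggingFace organization and the audio path is a placeholder:

        # Minimal VAD sketch: detect speech segments in a long recording.
        from speechbrain.pretrained import VAD

        vad = VAD.from_hparams(
            source="speechbrain/vad-crdnn-libriparty",
            savedir="pretrained_models/vad-crdnn-libriparty",
        )

        # Returns start/end boundaries (in seconds) of the detected speech segments.
        boundaries = vad.get_speech_segments("long_recording.wav")
        vad.save_boundaries(boundaries)  # print the segments in a readable format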

    Beyond that, we fixed some minor bugs and issues.

    Source code(tar.gz)
    Source code(zip)
  • v0.5.9(Jun 17, 2021)

    The main differences from the previous version are the following:

    • Added Wham/whamr/librimix for speech separation
    • Compatibility with PyTorch 1.9
    • Fixed minor bugs
    • Added SpeechBrain paper
    Source code(tar.gz)
    Source code(zip)
  • v0.5.8(Jun 6, 2021)

    SpeechBrain 0.5.8 improves the previous version in the following way:

    • Added wav2vec support in TIMIT, CommonVoice, AISHELL-1
    • Improved Fluent Speech Command Recipe
    • Improved SLU recipes
    • Recipe for UrbanSound8k
    • Fixed small bugs
    • Fixed typos
    Source code(tar.gz)
    Source code(zip)
  • 0.5.7(Apr 29, 2021)

    SpeechBrain is an open-source and all-in-one speech toolkit. It is designed to be simple, extremely flexible, and user-friendly. Competitive or state-of-the-art performance is obtained in various domains. The current version (v0.5.7) supports:

    • E2E Speech Recognition
    • Speaker Recognition (Identification and Verification)
    • Spoken Language Understanding (e.g., Intent recognition)
    • Speaker Diarization
    • Speech Enhancement
    • Speech Separation
    • Multi-microphone signal processing (beamforming, localization)

    Many other tasks will be supported soon. Take a look at our roadmap on Discourse. Your contribution is welcome! Please star our project to help us grow.

    For more info and tutorials: https://speechbrain.github.io/
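    To give a flavor of the easy-inference interfaces for the tasks listed above, here is a minimal sketch for E2E speech recognition and speaker verification; the model identifiers are taken from the SpeechBrain HuggingFace organization and the audio paths are placeholders:

        # Minimal inference sketch: transcribe a file and verify whether two files share a speaker.
        from speechbrain.pretrained import EncoderDecoderASR, SpeakerRecognition

        asr_model = EncoderDecoderASR.from_hparams(
            source="speechbrain/asr-crdnn-rnnlm-librispeech",
            savedir="pretrained_models/asr-crdnn-rnnlm-librispeech",
        )
        print(asr_model.transcribe_file("my_audio.wav"))

        verification = SpeakerRecognition.from_hparams(
            source="speechbrain/spkrec-ecapa-voxceleb",
            savedir="pretrained_models/spkrec-ecapa-voxceleb",
        )
        score, prediction = verification.verify_files("speaker1.wav", "speaker2.wav")
        print(score, prediction)  # similarity score and a boolean same-speaker decision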

    Source code(tar.gz)
    Source code(zip)