Speech Recognition for Uyghur using Speech transformer

Last update: Nov 17, 2022

Overview

An end-to-end speech recognizer for Uyghur, built on the Speech Transformer architecture.

Training:

The model is trained with both CTC loss and cross-entropy loss.
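The README does not say how the two losses are combined. A common choice in joint CTC/attention training is a weighted sum, sketched below. The interpolation weight \lambda and the form of the sum itself are assumptions, not something confirmed by this project's source.

```latex
\mathcal{L} \;=\; \lambda \, \mathcal{L}_{\mathrm{CTC}} \;+\; (1 - \lambda) \, \mathcal{L}_{\mathrm{CE}},
\qquad 0 \le \lambda \le 1
```

Under this assumption, \mathcal{L}_{\mathrm{CTC}} would be computed on the encoder outputs and \mathcal{L}_{\mathrm{CE}} on the decoder's per-token predictions; check train.py for the actual combination.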

Unzip results.7z and thuyg20_data.7z into the folder that contains the Python source files, then run:

python train.py

Recognition:

For recognition, download only the pretrained model, then run:

python .\tonu.py .\test6.wav

The output will look like this:

        Model loaded: results/UFormer_last.pth
            Best CER: 4.16%
             Trained: 276 epochs
The model has 36,418,306 trainable parameters
 Feature  has 25,869,058 trainable parameters
  Encoder has 4,205,568 trainable parameters
  Decoder has 6,343,680 trainable parameters

======================
Recognizing file .\test6.wav
test6.wav -> u qizlarning resimi chiqip qalsa bilekchila sinchilap qaraytti
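The "Best CER" figure in the output above is presumably a character error rate. The project's own evaluation code is not shown here, so the following is only a minimal sketch under the usual definition: character-level Levenshtein distance divided by the reference length. The function names `edit_distance` and `cer` are hypothetical and not taken from the repository.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Character-level Levenshtein distance between ref and hyp."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of r
                curr[j - 1] + 1,     # insertion of h
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance normalised by reference length."""
    if not ref:
        raise ValueError("reference transcript must not be empty")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    reference = "u qizlarning resimi chiqip qalsa"
    hypothesis = "u qizlarning resimi chiqip kalsa"  # one substituted character
    print(f"CER: {cer(reference, hypothesis):.2%}")
```

A corpus-level CER is usually obtained by summing the edit distances over all utterances and dividing by the total number of reference characters, rather than by averaging per-utterance rates.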

This project uses

THUYG-20: a free Uyghur speech database released by CSLT@Tsinghua University & Xinjiang University

Reference

https://github.com/gentaiscool/end2end-asr-pytorch

Comments

W2Llayer

Dear Gheyret, Thanks for your work.

I spent some time today to try to figure out the source of this feature extraction layer, can you point me the paper/any reference on it?

I think it is a great design to extract speech features, so just want to understand it more deeply,

Thanks a lot,

Kelvin

opened by kelvinqin 2

Releases(premodel)

premodel(Jun 18, 2021)

Pretrained model.
Source code(tar.gz)
Source code(zip)
results.7z(131.19 MB)

Owner

Uyghur
