Persian Kaldi profile for Rhasspy built from open speech data

Overview

Persian Kaldi Profile

A Rhasspy profile for Persian (fa).

Installation

Get started by first installing Vosk:

# Create virtual environment
python3 -m venv .venv
source .venv/bin/activate
pip3 install --upgrade pip
pip3 install --upgrade wheel setuptools

# Install Vosk and librosa (transcribe.py imports librosa, which pulls in numpy)
pip3 install vosk librosa

Next, download the model and extract it:

wget 'https://github.com/rhasspy/fa_kaldi-rhasspy/releases/download/v1.0/vosk-model-small-fa-rhasspy-0.15.zip'
unzip vosk-model-small-fa-rhasspy-0.15.zip

Finally, run the transcribe.py Python program with the model and an audio file:

python3 transcribe.py vosk-model-small-fa-rhasspy-0.15 welcome.wav

{"result": [{"conf": 1.0, "end": 0.48, "start": 0.06, "word": "خوش"}, {"conf": 1.0, "end": 1.11, "start": 0.48, "word": "آمدید"}], "text": "خوش آمدید"}

For each audio file given to transcribe.py, one line of JSON is printed with the transcription details: the full text, plus the start time, end time, and confidence of each word.
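As a minimal sketch of consuming that output, the snippet below parses one JSON line and lists the words with their timings. The sample line is copied from the example above; the helper name `parse_transcription` is hypothetical and is not part of transcribe.py.

```python
import json

# One line of output, copied from the example above.
line = (
    '{"result": ['
    '{"conf": 1.0, "end": 0.48, "start": 0.06, "word": "خوش"}, '
    '{"conf": 1.0, "end": 1.11, "start": 0.48, "word": "آمدید"}], '
    '"text": "خوش آمدید"}'
)


def parse_transcription(json_line):
    """Hypothetical helper: return (text, words) from one JSON output line.

    words is a list of (word, start_seconds, end_seconds, confidence) tuples.
    """
    data = json.loads(json_line)
    words = [
        (w["word"], w["start"], w["end"], w["conf"])
        for w in data.get("result", [])
    ]
    return data.get("text", ""), words


text, words = parse_transcription(line)
print(text)
for word, start, end, conf in words:
    print(f"{start:.2f}-{end:.2f}s  {word}  (conf {conf:.1f})")
```

When several audio files are passed, the same helper could be applied to each output line in turn, assuming one JSON object per line as described above.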


Comments
  •  PySoundFile failed. Trying audioread instead.

    I just tried to run this command: python3 transcribe.py vosk-model-small-fa-rhasspy-0.15 MyFile.mp3

    and got this error:

    /your/path/.venv/lib/python3.9/site-packages/librosa/util/decorators.py:88: UserWarning: PySoundFile failed. Trying audioread instead.
      return f(*args, **kwargs)  
    

    Thank you so much

    opened by GameO7er 1
  • ModuleNotFoundError: No module named 'librosa'

    I got this error after following the instructions in the README.md line by line. I am reporting it in case it helps others run the script successfully.

    Traceback (most recent call last):
      File "/home/gameover/Projects/Python/Rhaspy/transcribe.py", line 8, in <module>
        import librosa
    ModuleNotFoundError: No module named 'librosa'
    

    Thank you so much.

    opened by GameO7er 1
  • ModuleNotFoundError: No module named 'numpy'

    I got this error after following the instructions in the README.md line by line. I am reporting it in case it helps others run the script successfully.

    Traceback (most recent call last):
      File "/home/gameover/Projects/Python/Rhaspy/transcribe.py", line 8, in <module>
        import librosa
    ModuleNotFoundError: No module named 'numpy'
    

    Thank you so much.

    opened by GameO7er 1
  • Error using recipes

    Hello, thanks for your great work and for sharing this useful repo. I tried to use your recipes to train on Persian data. In run.sh, an error occurred while adapting lm.arpa and creating G.fst:

    creating G.fst...
    arpa2fst -
    LOG (arpa2fst[5.5.0~1-2b62]:Read():arpa-file-parser.cc:94) Reading \data\ section.
    LOG (arpa2fst[5.5.0~1-2b62]:Read():arpa-file-parser.cc:149) Reading \1-grams: section.
    LOG (arpa2fst[5.5.0~1-2b62]:Read():arpa-file-parser.cc:149) Reading \2-grams: section.
    LOG (arpa2fst[5.5.0~1-2b62]:Read():arpa-file-parser.cc:149) Reading \3-grams: section.
    FATAL: FstCompiler: Bad number of columns, source = standard input, line = 28129
    ERROR: FstHeader::Read: Bad FST header: standard input
    

    The full run.sh output is:

    Runtime configuration is: nJobs 12, nDecodeJobs 12. If this is not what you want, edit cmd.sh
    Starting at stage 0, train_stage -10
    
    Prepare phoneme data for Kaldi
    
    utils/prepare_lang.sh data/local/dict <unk> data/local/lang data/lang
    Checking data/local/dict/silence_phones.txt ...
    --> reading data/local/dict/silence_phones.txt
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> data/local/dict/silence_phones.txt is OK
    
    Checking data/local/dict/optional_silence.txt ...
    --> reading data/local/dict/optional_silence.txt
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> data/local/dict/optional_silence.txt is OK
    
    Checking data/local/dict/nonsilence_phones.txt ...
    --> reading data/local/dict/nonsilence_phones.txt
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> data/local/dict/nonsilence_phones.txt is OK
    
    Checking disjoint: silence_phones.txt, nonsilence_phones.txt
    --> disjoint property is OK.
    
    Checking data/local/dict/lexicon.txt
    --> reading data/local/dict/lexicon.txt
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> data/local/dict/lexicon.txt is OK
    
    Checking data/local/dict/extra_questions.txt ...
    --> reading data/local/dict/extra_questions.txt
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> data/local/dict/extra_questions.txt is OK
    --> SUCCESS [validating dictionary directory data/local/dict]
    
    **Creating data/local/dict/lexiconp.txt from data/local/dict/lexicon.txt
    fstaddselfloops data/lang/phones/wdisambig_phones.int data/lang/phones/wdisambig_words.int
    prepare_lang.sh: validating output directory
    utils/validate_lang.pl data/lang
    Checking existence of separator file
    separator file data/lang/subword_separator.txt is empty or does not exist, deal in word case.
    Checking data/lang/phones.txt ...
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> data/lang/phones.txt is OK
    
    Checking words.txt: #0 ...
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> data/lang/words.txt is OK
    
    Checking disjoint: silence.txt, nonsilence.txt, disambig.txt ...
    --> silence.txt and nonsilence.txt are disjoint
    --> silence.txt and disambig.txt are disjoint
    --> disambig.txt and nonsilence.txt are disjoint
    --> disjoint property is OK
    
    Checking sumation: silence.txt, nonsilence.txt, disambig.txt ...
    --> found no unexplainable phones in phones.txt
    
    Checking data/lang/phones/context_indep.{txt, int, csl} ...
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> 15 entry/entries in data/lang/phones/context_indep.txt
    --> data/lang/phones/context_indep.int corresponds to data/lang/phones/context_indep.txt
    --> data/lang/phones/context_indep.csl corresponds to data/lang/phones/context_indep.txt
    --> data/lang/phones/context_indep.{txt, int, csl} are OK
    
    Checking data/lang/phones/nonsilence.{txt, int, csl} ...
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> 116 entry/entries in data/lang/phones/nonsilence.txt
    --> data/lang/phones/nonsilence.int corresponds to data/lang/phones/nonsilence.txt
    --> data/lang/phones/nonsilence.csl corresponds to data/lang/phones/nonsilence.txt
    --> data/lang/phones/nonsilence.{txt, int, csl} are OK
    
    Checking data/lang/phones/silence.{txt, int, csl} ...
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> 15 entry/entries in data/lang/phones/silence.txt
    --> data/lang/phones/silence.int corresponds to data/lang/phones/silence.txt
    --> data/lang/phones/silence.csl corresponds to data/lang/phones/silence.txt
    --> data/lang/phones/silence.{txt, int, csl} are OK
    
    Checking data/lang/phones/optional_silence.{txt, int, csl} ...
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> 1 entry/entries in data/lang/phones/optional_silence.txt
    --> data/lang/phones/optional_silence.int corresponds to data/lang/phones/optional_silence.txt
    --> data/lang/phones/optional_silence.csl corresponds to data/lang/phones/optional_silence.txt
    --> data/lang/phones/optional_silence.{txt, int, csl} are OK
    
    Checking data/lang/phones/disambig.{txt, int, csl} ...
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> 14 entry/entries in data/lang/phones/disambig.txt
    --> data/lang/phones/disambig.int corresponds to data/lang/phones/disambig.txt
    --> data/lang/phones/disambig.csl corresponds to data/lang/phones/disambig.txt
    --> data/lang/phones/disambig.{txt, int, csl} are OK
    
    Checking data/lang/phones/roots.{txt, int} ...
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> 32 entry/entries in data/lang/phones/roots.txt
    --> data/lang/phones/roots.int corresponds to data/lang/phones/roots.txt
    --> data/lang/phones/roots.{txt, int} are OK
    
    Checking data/lang/phones/sets.{txt, int} ...
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> 32 entry/entries in data/lang/phones/sets.txt
    --> data/lang/phones/sets.int corresponds to data/lang/phones/sets.txt
    --> data/lang/phones/sets.{txt, int} are OK
    
    Checking data/lang/phones/extra_questions.{txt, int} ...
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> 11 entry/entries in data/lang/phones/extra_questions.txt
    --> data/lang/phones/extra_questions.int corresponds to data/lang/phones/extra_questions.txt
    --> data/lang/phones/extra_questions.{txt, int} are OK
    
    Checking data/lang/phones/word_boundary.{txt, int} ...
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> 131 entry/entries in data/lang/phones/word_boundary.txt
    --> data/lang/phones/word_boundary.int corresponds to data/lang/phones/word_boundary.txt
    --> data/lang/phones/word_boundary.{txt, int} are OK
    
    Checking optional_silence.txt ...
    --> reading data/lang/phones/optional_silence.txt
    --> data/lang/phones/optional_silence.txt is OK
    
    Checking disambiguation symbols: #0 and #1
    --> data/lang/phones/disambig.txt has "#0" and "#1"
    --> data/lang/phones/disambig.txt is OK
    
    Checking topo ...
    
    Checking word_boundary.txt: silence.txt, nonsilence.txt, disambig.txt ...
    --> data/lang/phones/word_boundary.txt doesn't include disambiguation symbols
    --> data/lang/phones/word_boundary.txt is the union of nonsilence.txt and silence.txt
    --> data/lang/phones/word_boundary.txt is OK
    
    Checking word-level disambiguation symbols...
    --> data/lang/phones/wdisambig.txt exists (newer prepare_lang.sh)
    Checking word_boundary.int and disambig.int
    --> generating a 35 word/subword sequence
    --> resulting phone sequence from L.fst corresponds to the word sequence
    --> L.fst is OK
    --> generating a 45 word/subword sequence
    --> resulting phone sequence from L_disambig.fst corresponds to the word sequence
    --> L_disambig.fst is OK
    
    Checking data/lang/oov.{txt, int} ...
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> 1 entry/entries in data/lang/oov.txt
    --> data/lang/oov.int corresponds to data/lang/oov.txt
    --> data/lang/oov.{txt, int} are OK
    
    --> data/lang/L.fst is olabel sorted
    --> data/lang/L_disambig.fst is olabel sorted
    --> SUCCESS [validating lang directory data/lang]
    
    adapt our LM for kaldi...
    
    
    creating G.fst...
    arpa2fst -
    LOG (arpa2fst[5.5.0~1-2b62]:Read():arpa-file-parser.cc:94) Reading \data\ section.
    LOG (arpa2fst[5.5.0~1-2b62]:Read():arpa-file-parser.cc:149) Reading \1-grams: section.
    LOG (arpa2fst[5.5.0~1-2b62]:Read():arpa-file-parser.cc:149) Reading \2-grams: section.
    LOG (arpa2fst[5.5.0~1-2b62]:Read():arpa-file-parser.cc:149) Reading \3-grams: section.
    FATAL: FstCompiler: Bad number of columns, source = standard input, line = 28129
    ERROR: FstHeader::Read: Bad FST header: standard input
    
    make mfcc
    
    fix_data_dir.sh: kept all 12394 utterances.
    fix_data_dir.sh: old files are kept in data/train/.backup
    mkdir: cannot create directory 'data/train/wav.scp': File exists
    steps/make_mfcc.sh --cmd utils/run.pl --nj 12 data/train exp/make_mfcc_chain/train mfcc_chain
    utils/validate_data_dir.sh: Successfully validated data-directory data/train
    steps/make_mfcc.sh: [info]: no segments file exists: assuming wav.scp indexed by utterance.
    

    Can you please help me fix this issue? Thanks.

    opened by MahdiEsrafili 0
Owner
Rhasspy
Offline voice assistant