Python SDK for working with Voicegain Speech-to-Text

Last update: Dec 14, 2022

Overview

Voicegain Speech-to-Text Python SDK

Python SDK for the Voicegain Speech-to-Text API.

This API allows for large vocabulary speech-to-text transcription as well as grammar-based speech recognition. Both real-time and offline use cases are supported.

You can see the core Voicegain API documentation here.

The complete documentation for the API covered by this SDK is available here. This link requires an account on the Voicegain Portal; see below for how to sign up.

Requirements

To use this API you need an account with Voicegain. You can create one by signing up on the Voicegain Portal. No credit card is required to sign up.

You can see pricing here. In short, it is 1 cent per minute for offline transcription and 1.25 cents per minute for real-time. There is a Free Tier of 600 minutes that renews each month.
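The pricing above can be turned into a rough monthly estimate. The sketch below is only an illustration: the function name and constants are hypothetical, and it assumes all usage in the month is of a single type and that the 600 free minutes are simply subtracted from it. How the Free Tier is actually applied across offline and real-time usage is not stated here, so check the pricing page before relying on the numbers.

```python
from decimal import Decimal

# Rates quoted in this README (cents per minute).
OFFLINE_CENTS_PER_MIN = Decimal("1")
REALTIME_CENTS_PER_MIN = Decimal("1.25")
# Free Tier minutes that renew each month.
FREE_TIER_MINUTES = Decimal("600")


def estimate_monthly_cost_usd(minutes, realtime=False):
    """Rough monthly cost in USD for one usage type.

    Assumption: the free minutes are subtracted from the total first.
    """
    billable = max(Decimal(str(minutes)) - FREE_TIER_MINUTES, Decimal("0"))
    rate = REALTIME_CENTS_PER_MIN if realtime else OFFLINE_CENTS_PER_MIN
    return billable * rate / Decimal("100")


if __name__ == "__main__":
    print(estimate_monthly_cost_usd(1000))                 # offline
    print(estimate_monthly_cost_usd(1000, realtime=True))  # real-time
```

Under these assumptions, 1000 offline minutes leave 400 billable minutes, and usage at or below 600 minutes costs nothing.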

Installation

From PyPI directly:

pip install voicegain-speech
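After installing, it can be useful to confirm which version of the package the interpreter sees. This is a minimal sketch that uses only the standard library (Python 3.8+); the helper name `installed_version` is hypothetical and not part of the SDK.

```python
from importlib import metadata


def installed_version(distribution="voicegain-speech"):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("voicegain-speech is not installed in this environment")
    else:
        print("voicegain-speech version:", version)
```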

Examples

sync_transcribe example:

configuration:

from voicegain_speech import ApiClient
from voicegain_speech import Configuration
from voicegain_speech import TranscribeApi
import base64


# configure your JWT token
JWT = "Your JWT"

configuration = Configuration()
configuration.access_token = JWT

api_client = ApiClient(configuration=configuration)
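Hardcoding the token is fine for a quick test, but in shared code it is usually safer to read it from the environment. The sketch below is one way to do that; the helper `load_jwt` and the variable name `VOICEGAIN_JWT` are assumptions made for illustration, not names defined by the SDK.

```python
import os


def load_jwt(env_var="VOICEGAIN_JWT"):
    """Read the JWT from an environment variable (name is hypothetical)."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable to your JWT")
    return token


# Usage with the configuration shown above:
#   configuration = Configuration()
#   configuration.access_token = load_jwt()
```

Failing early with a clear message avoids sending requests with an empty token.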

transcribe local file:

transcribe_api = TranscribeApi(api_client)
file_path = "Your local file path"

with open(file_path, "rb") as f:
    audio_base64 = base64.b64encode(f.read()).decode()

response = transcribe_api.asr_transcribe_post(
    sync_transcription_request={
        "audio": {
            "source": {
                "inline": {
                    "data": audio_base64
                }
            }
        }
    }
)

alternatives = response.result.alternatives
if alternatives:
    local_result = alternatives[0].utterance
    print("result from file: ", local_result)
else:
    local_result = None
    print("no transcription")
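The request body and the result handling above can be factored into two small helpers, which makes them easy to test without calling the API. This is a sketch under the assumption that the response has the same shape as in the example (`response.result.alternatives[i].utterance`); the helper names `build_sync_request` and `first_utterance` are hypothetical, and the stub objects below only stand in for a real SDK response.

```python
import base64
from types import SimpleNamespace


def build_sync_request(audio_bytes):
    """Build the inline-audio request body used in the example above."""
    return {
        "audio": {
            "source": {
                "inline": {
                    "data": base64.b64encode(audio_bytes).decode()
                }
            }
        }
    }


def first_utterance(response):
    """Return the utterance of the first alternative, or None if there is none."""
    result = getattr(response, "result", None)
    alternatives = getattr(result, "alternatives", None)
    if not alternatives:
        return None
    return alternatives[0].utterance


if __name__ == "__main__":
    # Stub response standing in for what the SDK call returns.
    stub = SimpleNamespace(
        result=SimpleNamespace(
            alternatives=[SimpleNamespace(utterance="hello world")]
        )
    )
    print(first_utterance(stub))
    print(build_sync_request(b"abc"))
```

With a real client, the request would be passed as `sync_transcription_request=build_sync_request(f.read())` to `transcribe_api.asr_transcribe_post`, and the response handed to `first_utterance`.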

More examples can be found in the examples folder on our GitHub.

Learn more about the Voicegain Platform at www.voicegain.ai


Releases(1.73.0)

1.73.0(Jan 6, 2023)

1.72.0(Dec 15, 2022)

1.71.1(Dec 9, 2022)

1.71.0(Dec 8, 2022)

1.70.2(Nov 23, 2022)

1.70.1(Nov 22, 2022)

1.70.0(Nov 22, 2022)

1.69.0(Nov 17, 2022)

1.68.1(Nov 11, 2022)

1.68.0(Oct 28, 2022)

1.67.0(Oct 25, 2022)

1.66.1(Oct 21, 2022)

1.66.0(Oct 18, 2022)

1.65.0(Sep 27, 2022)

1.64.1(Sep 19, 2022)

1.64.0(Sep 15, 2022)

1.63.0(Sep 7, 2022)

1.62.1(Aug 30, 2022)

1.62.0(Aug 26, 2022)

1.61.0(Aug 18, 2022)

1.60.4(Aug 11, 2022)

1.60.3(Jul 6, 2022)

1.60.2(Jun 30, 2022)

1.60.1(Jun 22, 2022)

1.60.0(Jun 17, 2022)

1.59.2(Jun 15, 2022)

1.59.1(Jun 9, 2022)

1.59.0(Jun 1, 2022)

1.58.1(May 24, 2022)

1.58.0(May 24, 2022)

Owner

Voicegain
