wenet-kws

Production First and Production Ready End-to-End Keyword Spotting Toolkit.

The goal of this toolkit is to...

Small-footprint keyword spotting (KWS), or specifically wake-up word (WuW) detection, is a typical and important module in Internet of Things (IoT) devices. It gives users hands-free control of IoT devices. A WuW detection system usually runs locally and persistently on the device, so it requires low power consumption, few model parameters, and low computational complexity, and it must detect predefined keywords in a streaming fashion, i.e., with low latency.
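To make the streaming requirement concrete, here is a minimal sketch of a frame-by-frame trigger. It is not part of wenet-kws: the class StreamingDetector, the helper detect, and the parameters window, threshold and cooldown are all hypothetical, and the sketch assumes some model already emits one keyword posterior per audio frame. It smooths the most recent posteriors with a moving average and fires as soon as the smoothed score crosses the threshold, so a decision is available after every frame rather than at the end of the utterance.

```python
from collections import deque


class StreamingDetector:
    """Hypothetical streaming wake-word trigger; not the wenet-kws API."""

    def __init__(self, window=5, threshold=0.8, cooldown=10):
        self.scores = deque(maxlen=window)  # most recent frame posteriors
        self.threshold = threshold          # smoothed score needed to fire
        self.cooldown = cooldown            # frames to stay silent after firing
        self.wait = 0                       # remaining cooldown frames

    def push(self, posterior):
        """Consume one frame posterior; return True if the keyword fires."""
        self.scores.append(posterior)
        if self.wait > 0:
            self.wait -= 1
            return False
        window_full = len(self.scores) == self.scores.maxlen
        smoothed = sum(self.scores) / len(self.scores)
        if window_full and smoothed >= self.threshold:
            self.wait = self.cooldown
            return True
        return False


def detect(posteriors, **kwargs):
    """Return the frame indices at which the detector fires."""
    detector = StreamingDetector(**kwargs)
    return [i for i, p in enumerate(posteriors) if detector.push(p)]
```

Only the last few posteriors are kept in memory, which suits the small-footprint setting; the cooldown is one simple way to avoid reporting the same utterance several times.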

Typical Scenario

We are going to support the following typical wake-up word applications:

Single wake-up word
Multiple wake-up words
Customizable wake-up word
Personalized wake-up word, i.e., a combination of wake-up word detection and voiceprint

Dataset

We plan to support a variety of open source wake-up word datasets, including but not limited to:

All well-trained models on these datasets will be made publicly available.

Runtime

We plan to support a variety of hardware and platforms, including:

Web browser
x86
Android
Raspberry Pi
