OneShot Learning-based hotword detection.

Overview

EfficientWord-Net

Versions: 3.6, 3.7, 3.8, 3.9

Hotword detection based on one-shot learning

Home assistants require special phrases called hotwords to get activated (e.g. "ok google").

EfficientWord-Net is a hotword detection engine based on one-shot learning, inspired by FaceNet's Siamese network architecture. It works much like face recognition: it requires only a few samples of your own custom hotword to get going. No extra training or huge datasets required! This allows developers to add custom hotwords to their programs without breaking a sweat or incurring extra charges. Just like Google Assistant's hotword detector, the engine performs best when 3-4 hotword samples are collected directly from the user. This repository is the official implementation of EfficientWord-Net as a Python library from the authors.
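To make the face-recognition analogy concrete, here is a minimal sketch of one-shot matching: a query embedding is compared against a handful of stored reference embeddings and the best similarity is kept. The use of cosine similarity, the function names and the toy 3-dimensional vectors are assumptions for illustration only; the real embeddings come from the network and the library's actual metric may differ.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def best_match_score(query, references):
    """Highest similarity between the query and any reference embedding."""
    return max(cosine_similarity(query, ref) for ref in references)

# Toy "embeddings" (hypothetical values, not produced by the library).
references = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]]  # a few hotword samples
same_word = [0.95, 0.05, 0.0]                     # close to the references
other_word = [0.0, 1.0, 0.0]                      # far from the references

same_score = best_match_score(same_word, references)
other_score = best_match_score(other_word, references)
print(same_score > other_score)
```

Because only the reference embeddings change per hotword, adding a new hotword in this scheme means storing a few new vectors rather than retraining a model.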

The library is written purely in Python and uses Google's TFLite implementation for faster real-time inference.

Demo of EfficientWord-Net in Pi

EfficientWord-Net.mp4

Access preprint

The research paper is currently under review at IEEE. A preprint is available, and the training code will be made public once the paper is published.

Python Version Requirements

This library works with Python versions 3.6 to 3.9.
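A script that depends on the library could check the interpreter version up front. The following is only a sketch; the helper name is_supported and the warning text are made up for illustration and are not part of the library.

```python
import sys

# Supported range as stated above: Python 3.6 to 3.9 inclusive.
SUPPORTED_MIN = (3, 6)
SUPPORTED_MAX = (3, 9)

def is_supported(version_info=sys.version_info):
    """Return True if the (major, minor) version is within the supported range."""
    major_minor = tuple(version_info[:2])
    return SUPPORTED_MIN <= major_minor <= SUPPORTED_MAX

if not is_supported():
    print(
        "Warning: running Python %d.%d, but 3.6 to 3.9 is expected"
        % tuple(sys.version_info[:2])
    )
```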

Dependencies Installation

Before running the pip installation command for the library, a few dependencies need to be installed manually.

The tflite package cannot be listed in requirements.txt, so it is installed automatically when the package is first initialized on the system.

The librosa package is not required for inference-only use; it is installed automatically when generate_reference is called.


Package Installation

Run the following pip command

pip install EfficientWord-Net

Then import the library with:

import eff_word_net

Demo

After installing the packages, you can run the demo script built into the library (ensure you have a working mic).

Access the documentation at: https://ant-brain.github.io/EfficientWord-Net/

Command to run demo

python -m eff_word_net.engine

Generating Custom Wakewords

For any new hotword, the library needs information about the hotword. This information is read from a file called {wakeword}_ref.json. E.g., for the wakeword 'alexa', the library needs a file called alexa_ref.json.

These files can be generated with the following procedure:

Collect 4 to 10 distinct-sounding pronunciations of the wakeword and put them in a separate folder that contains nothing else.
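Before generating the reference file, it can help to sanity-check the samples folder. The sketch below is not part of the library: the helper name check_samples_folder and the accepted extensions (.wav, .mp3) are assumptions, so adjust them to the formats you actually recorded.

```python
import tempfile
from pathlib import Path

# Assumption: samples are stored as .wav or .mp3 files.
AUDIO_EXTENSIONS = {".wav", ".mp3"}

def check_samples_folder(folder, min_samples=4, max_samples=10):
    """Return a list of problems found in the folder (empty list means OK)."""
    entries = list(Path(folder).iterdir())
    audio = sorted(
        p for p in entries
        if p.is_file() and p.suffix.lower() in AUDIO_EXTENSIONS
    )
    extras = sorted(p for p in entries if p not in audio)

    problems = []
    if not (min_samples <= len(audio) <= max_samples):
        problems.append(
            "expected %d to %d audio files, found %d"
            % (min_samples, max_samples, len(audio))
        )
    if extras:
        problems.append(
            "folder contains other entries: " + ", ".join(p.name for p in extras)
        )
    return problems

# Small demonstration using a temporary folder with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for i in range(4):
        (Path(tmp) / ("sample_%d.wav" % i)).touch()
    ok_report = check_samples_folder(tmp)   # four samples, nothing else
    (Path(tmp) / "notes.txt").touch()
    bad_report = check_samples_folder(tmp)  # now has a stray file

print(ok_report)
print(bad_report)
```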

Finally, run this command. It will ask for the input folder's location (containing the audio files) and the output folder (where the _ref.json file will be stored).

python -m eff_word_net.generate_reference

The path of the generated reference file needs to be passed to the HotwordDetector instance.

HotwordDetector(
        hotword="hello",
        reference_file = "/full/path/name/of/hello_ref.json",
        activation_count = 3 #2 by default
)

For a few wakewords such as Mycroft, Google, Firefox, Alexa, Mobile and Siri, the library ships predefined embeddings in its installation directory. The path to that directory is available in the following variable:

from eff_word_net import samples_loc

Try your first single hotword detection script

import os
from eff_word_net.streams import SimpleMicStream
from eff_word_net.engine import HotwordDetector
from eff_word_net import samples_loc

mycroft_hw = HotwordDetector(
        hotword="Mycroft",
        reference_file = os.path.join(samples_loc,"mycroft_ref.json"),
        activation_count=3
    )

mic_stream = SimpleMicStream()
mic_stream.start_stream()

print("Say Mycroft ")
while True :
    frame = mic_stream.getFrame()
    result = mycroft_hw.checkFrame(frame)
    if(result):
        print("Wakeword uttered")

Detecting Multiple Hotwords from audio streams

The library provides a computation-friendly way to detect multiple hotwords from a given stream, instead of running checkFrame() for each wakeword individually.

import os
from eff_word_net.streams import SimpleMicStream
from eff_word_net.engine import HotwordDetector, MultiHotwordDetector
from eff_word_net import samples_loc
print(samples_loc)

alexa_hw = HotwordDetector(
        hotword="Alexa",
        reference_file = os.path.join(samples_loc,"alexa_ref.json"),
    )

siri_hw = HotwordDetector(
        hotword="Siri",
        reference_file = os.path.join(samples_loc,"siri_ref.json"),
    )

mycroft_hw = HotwordDetector(
        hotword="mycroft",
        reference_file = os.path.join(samples_loc,"mycroft_ref.json"),
        activation_count=3
    )

multi_hw_engine = MultiHotwordDetector(
        detector_collection = [
            alexa_hw,
            siri_hw,
            mycroft_hw,
        ],
    )

mic_stream = SimpleMicStream()
mic_stream.start_stream()

print("Say Mycroft / Alexa / Siri")

while True :
    frame = mic_stream.getFrame()
    result = multi_hw_engine.findBestMatch(frame)
    if(None not in result):
        print(result[0],f",Confidence {result[1]:0.4f}")
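The loop above treats the result as a pair that contains None when nothing matched. The selection step behind such a best-match call could look like the sketch below. This is an illustration of the idea only, not the library's code: the function find_best_match, the per-hotword score dictionary and the score values are all hypothetical.

```python
def find_best_match(scores, thresholds):
    """Pick the highest-scoring hotword that clears its own threshold.

    scores: dict mapping hotword name to its score for one frame.
    thresholds: dict mapping hotword name to its threshold.
    Returns (name, score), or (None, None) when no hotword passes.
    """
    passing = {
        name: score
        for name, score in scores.items()
        if score >= thresholds[name]
    }
    if not passing:
        return (None, None)
    best = max(passing, key=passing.get)
    return (best, passing[best])

# Hypothetical scores for a single audio frame.
thresholds = {"Alexa": 0.995, "Siri": 0.995, "mycroft": 0.995}
hit = find_best_match(
    {"Alexa": 0.970, "Siri": 0.999, "mycroft": 0.996}, thresholds
)
miss = find_best_match(
    {"Alexa": 0.970, "Siri": 0.900, "mycroft": 0.940}, thresholds
)

print(hit)
print(miss)
```

Returning a pair of None values on a miss keeps the caller's check simple, which matches the `None not in result` test used in the loop.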

Access the library documentation here: https://ant-brain.github.io/EfficientWord-Net/

About activation_count in HotwordDetector

Documentation with a detailed explanation of the activation_count parameter in HotwordDetector is in the making. For now, note that 3 is advisable for long hotwords and 2 for shorter ones. If the detector gives multiple triggers for a single utterance, try increasing activation_count. When experimenting, begin with smaller values. The default value is 2.
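Until that documentation exists, one plausible reading of activation_count is "the number of consecutive frames that must score above the threshold before a trigger is reported". The sketch below implements that reading only. It is an assumption, not the library's confirmed behaviour, and the class ActivationCounter and the score values are hypothetical.

```python
class ActivationCounter:
    """Report a trigger after `activation_count` consecutive frames pass."""

    def __init__(self, threshold=0.995, activation_count=2):
        self.threshold = threshold
        self.activation_count = activation_count
        self._streak = 0  # consecutive frames at or above the threshold

    def update(self, score):
        """Feed one frame score; return True when a trigger fires."""
        if score >= self.threshold:
            self._streak += 1
        else:
            self._streak = 0
        if self._streak >= self.activation_count:
            self._streak = 0  # start counting again after a trigger
            return True
        return False

def count_triggers(scores, activation_count):
    """Count how many triggers a sequence of frame scores produces."""
    counter = ActivationCounter(activation_count=activation_count)
    return sum(counter.update(score) for score in scores)

# One utterance spanning four high-scoring frames (hypothetical scores).
utterance = [0.50, 0.996, 0.997, 0.998, 0.999, 0.60]
triggers_with_2 = count_triggers(utterance, activation_count=2)
triggers_with_3 = count_triggers(utterance, activation_count=3)

print(triggers_with_2, triggers_with_3)
```

Under this reading, the same utterance fires twice with a count of 2 and once with a count of 3, which is consistent with the advice to raise activation_count when a single utterance produces multiple triggers.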

FAQ :

  • Hotword performance is bad: if you run into an issue like this, feel free to ask about it in discussions.

CONTRIBUTION:

  • If you have ideas to make the project better, feel free to ping us in discussions.
  • The current logmelcalc.tflite graph can convert only 1 audio frame to a log mel spectrogram at a time. It would be a great help if TensorFlow gurus out there could help us with this.

TODO :

  • Add audio file handler in streams. PRs are welcome.
  • Remove librosa requirement to encourage generating reference files directly on edge devices
  • Add more detailed documentation explaining the sliding window concept
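As a stopgap for the last item, the sliding window idea can be sketched in plain Python: a fixed-length window moves over the audio in small hops, so consecutive windows overlap and a hotword is unlikely to be cut in half. The 1-second window length is an assumption made for illustration; the 1/8-second hop and the 16000 Hz rate are taken from the user-contributed stream class quoted in the comments below. The function name is hypothetical.

```python
def sliding_windows(samples, rate=16000, window_secs=1.0, hop_secs=1 / 8):
    """Yield overlapping windows of `samples`.

    Each window is `window_secs` long and starts `hop_secs` after the
    previous one. Trailing samples that do not fill a whole window are
    dropped in this sketch.
    """
    window = int(window_secs * rate)
    hop = int(hop_secs * rate)
    for start in range(0, len(samples) - window + 1, hop):
        yield samples[start:start + window]

# Two seconds of placeholder audio: the sample values are just indices.
two_seconds = list(range(2 * 16000))
windows = list(sliding_windows(two_seconds))

print(len(windows))      # number of windows produced
print(windows[1][0])     # first sample index of the second window
```

With these values, each new window shares seven eighths of its samples with the previous one, which is why the same utterance can score highly on several consecutive frames.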

SUPPORT US:

Our hotword detector's performance is notably lower than Porcupine's. We have thought about better NN architectures for the engine and hope to outperform Porcupine. This has been our undergrad project, so your support and encouragement will motivate us to develop the engine further. If you loved this project, recommend it to your peers, give us a 🌟 on GitHub and a clap 👏 on Medium.

LICENSE: Apache License 2.0

Comments
  • Threshold value in engine.py not working?

    hello,

    first of all, thank you for this great library!

    I managed to make it work on my M1 MacBook Air, and trying out my personal hotword detection, but the threshold value does not seem to be working on my environment.

    In engine.py:

        def __init__(
                self,
                hotword:str,
                reference_file:str,
                threshold:float=0.995,
                activation_count=2,
                continuous=True,
                verbose = False):
    

    And this is my script:

    import os
    from eff_word_net.streams import SimpleMicStream
    from eff_word_net.engine import HotwordDetector
    from eff_word_net import samples_loc
    
    hotword_hw = HotwordDetector(
            hotword="hotword",
            reference_file = "hotword_ref.json",
            activation_count=3
        )
    
    mic_stream = SimpleMicStream()
    mic_stream.start_stream()
    
    print("Say hotword ")
    while True :
        frame = mic_stream.getFrame()
        result = hotword_hw.checkFrame(frame)
        if(result):
            print("Wakeword uttered")
            print(hotword_hw.getMatchScoreFrame(frame))
    

    and when I run this script, the checkFrame returns true even when the getMatchScoreFrame returns under the threshold, like:

    Wakeword uttered
    0.9371609374494279
    Wakeword uttered
    0.9164050520717595
    Wakeword uttered
    0.9082509350226378
    ...
    

    Could you please take a look at this?

    Thank you!

    opened by dominickchen 10
  • Hotword detection triggers the moment any sound is being played, even with the default models

    So I've been trying to make a custom hotword. But after seeing it trigger all the time, the moment any kind of sound is being recorded, I decided to use a default one, like "brightness", "mobile", "google", etc.

    They all trigger immediately. I'm using the default values for the HotwordDetector, by the way. Any clue why? It seemed to work great in your video presentation.

    Not using a cheap ass microphone by the way.

    opened by TrackLab 9
  • circuit diagram

    Good evening! Can you send me the circuit diagram for connecting the Raspberry Pi to the breadboard and lighting the LED? Your experiment is so interesting that I want to repeat it.

    documentation 
    opened by preachwebsite 5
  • Invalid sample rate

    ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.bcm2835_headpho.pcm.front.0:CARD=0' ALSA lib conf.c:4743:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory ALSA lib conf.c:5231:(snd_config_expand) Evaluate error: No such file or directory ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM front ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.rear ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.center_lfe ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.side ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.bcm2835_headpho.pcm.surround51.0:CARD=0' ALSA lib conf.c:4743:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory ALSA lib conf.c:5231:(snd_config_expand) Evaluate error: No such file or directory ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM surround21 ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.bcm2835_headpho.pcm.surround51.0:CARD=0' ALSA lib conf.c:4743:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory ALSA lib conf.c:5231:(snd_config_expand) Evaluate error: No such file or directory ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM surround21 ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.bcm2835_headpho.pcm.surround40.0:CARD=0' ALSA lib conf.c:4743:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory ALSA lib conf.c:5231:(snd_config_expand) Evaluate error: No such file or directory ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM surround40 ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.bcm2835_headpho.pcm.surround51.0:CARD=0' ALSA lib conf.c:4743:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory ALSA lib conf.c:5231:(snd_config_expand) Evaluate error: 
No such file or directory ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM surround41 ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.bcm2835_headpho.pcm.surround51.0:CARD=0' ALSA lib conf.c:4743:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory ALSA lib conf.c:5231:(snd_config_expand) Evaluate error: No such file or directory ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM surround50 ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.bcm2835_headpho.pcm.surround51.0:CARD=0' ALSA lib conf.c:4743:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory ALSA lib conf.c:5231:(snd_config_expand) Evaluate error: No such file or directory ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM surround51 ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.bcm2835_headpho.pcm.surround71.0:CARD=0' ALSA lib conf.c:4743:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory ALSA lib conf.c:5231:(snd_config_expand) Evaluate error: No such file or directory ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM surround71 ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.bcm2835_headpho.pcm.iec958.0:CARD=0,AES0=4,AES1=130,AES2=0,AES3=2' ALSA lib conf.c:4743:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory ALSA lib conf.c:5231:(snd_config_expand) Evaluate error: No such file or directory ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM iec958 ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.bcm2835_headpho.pcm.iec958.0:CARD=0,AES0=4,AES1=130,AES2=0,AES3=2' ALSA lib conf.c:4743:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory ALSA lib conf.c:5231:(snd_config_expand) Evaluate error: No such file or directory ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) 
Unknown PCM spdif ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.bcm2835_headpho.pcm.iec958.0:CARD=0,AES0=4,AES1=130,AES2=0,AES3=2' ALSA lib conf.c:4743:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory ALSA lib conf.c:5231:(snd_config_expand) Evaluate error: No such file or directory ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM spdif ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.hdmi ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.hdmi ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_rate_lavrate.so (libasound_module_rate_lavrate.so: libasound_module_rate_lavrate.so: cannot open shared object file: No such file or directory) ALSA lib pcm_rate.c:1468:(snd_pcm_rate_open) Cannot find rate converter ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_rate_samplerate.so (libasound_module_rate_samplerate.so: libasound_module_rate_samplerate.so: cannot open shared object file: No such file or directory) ALSA lib pcm_rate.c:1468:(snd_pcm_rate_open) Cannot find rate converter ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_rate_speexrate.so (libasound_module_rate_speexrate.so: libasound_module_rate_speexrate.so: cannot open shared object file: No such file or directory) ALSA lib pcm_rate.c:1468:(snd_pcm_rate_open) Cannot find rate converter ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_jack.so (libasound_module_pcm_jack.so: libasound_module_pcm_jack.so: cannot open shared object file: No such file or 
directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_jack.so (libasound_module_pcm_jack.so: libasound_module_pcm_jack.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_oss.so (libasound_module_pcm_oss.so: libasound_module_pcm_oss.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_oss.so (libasound_module_pcm_oss.so: libasound_module_pcm_oss.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_pulse.so (libasound_module_pcm_pulse.so: libasound_module_pcm_pulse.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_pulse.so (libasound_module_pcm_pulse.so: libasound_module_pcm_pulse.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_a52.so (libasound_module_pcm_a52.so: libasound_module_pcm_a52.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_a52.so (libasound_module_pcm_a52.so: libasound_module_pcm_a52.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_upmix.so (libasound_module_pcm_upmix.so: libasound_module_pcm_upmix.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_upmix.so (libasound_module_pcm_upmix.so: libasound_module_pcm_upmix.so: cannot open shared object file: No such file or directory) 
ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_vdownmix.so (libasound_module_pcm_vdownmix.so: libasound_module_pcm_vdownmix.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_vdownmix.so (libasound_module_pcm_vdownmix.so: libasound_module_pcm_vdownmix.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_usb_stream.so (libasound_module_pcm_usb_stream.so: libasound_module_pcm_usb_stream.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_usb_stream.so (libasound_module_pcm_usb_stream.so: libasound_module_pcm_usb_stream.so: cannot open shared object file: No such file or directory) Expression 'paInvalidSampleRate' failed in 'src/hostapi/alsa/pa_linux_alsa.c', line: 2043 Expression 'PaAlsaStreamComponent_InitialConfigure( &self->capture, inParams, self->primeBuffers, hwParamsCapture, &realSr )' failed in 'src/hostapi/alsa/pa_linux_alsa.c', line: 2713 Expression 'PaAlsaStream_Configure( stream, inputParameters, outputParameters, sampleRate, framesPerBuffer, &inputLatency, &outputLatency, &hostBufferSizeMode )' failed in 'src/hostapi/alsa/pa_linux_alsa.c', line: 2837 Traceback (most recent call last): File "/home/pi/Documents/test.py", line 11, in mic_stream = SimpleMicStream() File "/home/pi/Documents/eff_word_net/streams.py", line 71, in init mic_stream=p.open( File "/home/pi/Downloads/yes/envs/eff/lib/python3.8/site-packages/pyaudio.py", line 750, in open stream = Stream(self, *args, **kwargs) File "/home/pi/Downloads/yes/envs/eff/lib/python3.8/site-packages/pyaudio.py", line 441, in init self._stream = pa.open(**arguments) OSError: [Errno -9997] Invalid sample rate

    Good evening! I have encountered this problem. How can I solve it? Looking forward to your reply.

    raspberrypi 
    opened by preachwebsite 3
  • Could you help me?

    I won't deploy its running environment. Can you control it remotely? I have TeamViewer, a remote control software. The ID is 621 081 831. Or use other remote control. We look forward to your help.

    opened by preachwebsite 3
  • raising precision of custom wakeword

    I'm curious whether the precision of custom wakeword improves if you provide more sound files, e.g. 50 files from different people? or is that meaningless?

    We want to use a custom wakeword for a public interaction system, and want it to recognize voice input from a wide range of people (young&old, male&female, etc).

    Thank you for letting me know.

    opened by dominickchen 3
  • Invalid input device (no default output device)

    ALSA lib conf.c:3723:(snd_config_hooks_call) Cannot open shared library libasound_module_conf_pulse.so (libasound_module_conf_pulse.so: libasound_module_conf_pulse.so: cannot open shared object file: No such file or directory) ALSA lib control.c:1379:(snd_ctl_open_noupdate) Invalid CTL hw:0 ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.bcm2835_headpho.pcm.front.0:CARD=0' ALSA lib conf.c:4743:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory ALSA lib conf.c:5231:(snd_config_expand) Evaluate error: No such file or directory ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM front ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.rear ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.center_lfe ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.side ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround21 ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround21 ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround40 ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround41 ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround50 ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround51 ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.surround71 ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.bcm2835_headpho.pcm.iec958.0:CARD=0,AES0=4,AES1=130,AES2=0,AES3=2' ALSA lib conf.c:4743:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory ALSA lib conf.c:5231:(snd_config_expand) Evaluate error: No such file or directory ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM iec958 ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.bcm2835_headpho.pcm.iec958.0:CARD=0,AES0=4,AES1=130,AES2=0,AES3=2' ALSA lib 
conf.c:4743:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory ALSA lib conf.c:5231:(snd_config_expand) Evaluate error: No such file or directory ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM spdif ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.bcm2835_headpho.pcm.iec958.0:CARD=0,AES0=4,AES1=130,AES2=0,AES3=2' ALSA lib conf.c:4743:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory ALSA lib conf.c:5231:(snd_config_expand) Evaluate error: No such file or directory ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM spdif ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.hdmi ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.hdmi ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline ALSA lib pcm.c:2660:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_rate_lavrate.so (libasound_module_rate_lavrate.so: libasound_module_rate_lavrate.so: cannot open shared object file: No such file or directory) ALSA lib pcm_rate.c:1468:(snd_pcm_rate_open) Cannot find rate converter ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_rate_samplerate.so (libasound_module_rate_samplerate.so: libasound_module_rate_samplerate.so: cannot open shared object file: No such file or directory) ALSA lib pcm_rate.c:1468:(snd_pcm_rate_open) Cannot find rate converter ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_rate_speexrate.so (libasound_module_rate_speexrate.so: libasound_module_rate_speexrate.so: cannot open shared object file: No such file or directory) ALSA lib pcm_rate.c:1468:(snd_pcm_rate_open) Cannot find 
rate converter ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_jack.so (libasound_module_pcm_jack.so: libasound_module_pcm_jack.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_jack.so (libasound_module_pcm_jack.so: libasound_module_pcm_jack.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_oss.so (libasound_module_pcm_oss.so: libasound_module_pcm_oss.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_oss.so (libasound_module_pcm_oss.so: libasound_module_pcm_oss.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_pulse.so (libasound_module_pcm_pulse.so: libasound_module_pcm_pulse.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_pulse.so (libasound_module_pcm_pulse.so: libasound_module_pcm_pulse.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_a52.so (libasound_module_pcm_a52.so: libasound_module_pcm_a52.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_a52.so (libasound_module_pcm_a52.so: libasound_module_pcm_a52.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_upmix.so (libasound_module_pcm_upmix.so: libasound_module_pcm_upmix.so: cannot open shared object file: No such file or 
directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_upmix.so (libasound_module_pcm_upmix.so: libasound_module_pcm_upmix.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_vdownmix.so (libasound_module_pcm_vdownmix.so: libasound_module_pcm_vdownmix.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_vdownmix.so (libasound_module_pcm_vdownmix.so: libasound_module_pcm_vdownmix.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_usb_stream.so (libasound_module_pcm_usb_stream.so: libasound_module_pcm_usb_stream.so: cannot open shared object file: No such file or directory) ALSA lib dlmisc.c:340:(snd_dlobj_cache_get0) Cannot open shared library libasound_module_pcm_usb_stream.so (libasound_module_pcm_usb_stream.so: libasound_module_pcm_usb_stream.so: cannot open shared object file: No such file or directory) Traceback (most recent call last): File "/home/pi/Documents/test.py", line 11, in mic_stream = SimpleMicStream() File "/home/pi/Documents/eff_word_net/streams.py", line 71, in init mic_stream=p.open( File "/home/pi/Downloads/yes/envs/eff/lib/python3.8/site-packages/pyaudio.py", line 750, in open stream = Stream(self, *args, **kwargs) File "/home/pi/Downloads/yes/envs/eff/lib/python3.8/site-packages/pyaudio.py", line 441, in init self._stream = pa.open(**arguments) OSError: [Errno -9996] Invalid input device (no default output device)

    I have encountered this problem. How can I solve it?

    opened by preachwebsite 2
  • Works fine with the provided ref file but not with a custom recorded file.

    Hello, I am working on Google Colab, so I don't have access to a mic. The workaround is to use an mp3 or wav file. To do that I added this class:

    from streams import CustomAudioStream
    from pydub import AudioSegment
    
    import numpy as np
    import wave
    
    RATE = 16000
    index = 0
    
    class SimpleFileStream(CustomAudioStream) :
    
        def open_stream(self, src, mp3):
            if mp3:
              dst = "Data/sample.wav"
              # convert mp3 to wav              
              sound = AudioSegment.from_mp3(src).set_frame_rate(16000)
              sound.export(dst, format="wav")
              self.wf = wave.open(dst, 'rb')
            else:
              print("Not an mp3")
              self.wf = wave.open(src, 'rb')
              self.wf.rewind()
            print("Get params of wav file " + str(self.wf.getparams()))
    
        def close_stream(self):
            self.wf.close()
    
        def get_next_frame(self):
            global index
            print("Index ", index)
            index = index + self.CHUNK
            return np.frombuffer(self.wf.readframes(self.CHUNK),dtype=np.int16)
    
        """
        Implements stream with sliding window, 
        implemented by inheriting CustomAudioStream
        """
        def __init__(self,sliding_window_secs:float=1/8):
            self.CHUNK = int(sliding_window_secs*RATE)
    
            CustomAudioStream.__init__(
                self,
                open_stream = self.open_stream,
                close_stream = self.close_stream,
                get_next_frame = self.get_next_frame,
            )
    

    It seems to work if I use the ref file from GitHub. But if I record a custom file using Audacity, it does not detect the wakeword.

    If I change the threshold to 0.7 and the activation count to 2, it works better, but it will increase the chance of getting false positives.

    Is it mandatory to have a custom ref for each user?

    Best regards Sebastien

    opened by warichet 2
  • Bump numpy from 1.20.0 to 1.22.0

    Bumps numpy from 1.20.0 to 1.22.0.

    Release notes

    Sourced from numpy's releases.

    v1.22.0

    NumPy 1.22.0 Release Notes

    NumPy 1.22.0 is a big release featuring the work of 153 contributors spread over 609 pull requests. There have been many improvements, highlights are:

    • Annotations of the main namespace are essentially complete. Upstream is a moving target, so there will likely be further improvements, but the major work is done. This is probably the most user visible enhancement in this release.
    • A preliminary version of the proposed Array-API is provided. This is a step in creating a standard collection of functions that can be used across application such as CuPy and JAX.
    • NumPy now has a DLPack backend. DLPack provides a common interchange format for array (tensor) data.
    • New methods for quantile, percentile, and related functions. The new methods provide a complete set of the methods commonly found in the literature.
    • A new configurable allocator for use by downstream projects.

    These are in addition to the ongoing work to provide SIMD support for commonly used functions, improvements to F2PY, and better documentation.

    The Python versions supported in this release are 3.8-3.10, Python 3.7 has been dropped. Note that 32 bit wheels are only provided for Python 3.8 and 3.9 on Windows, all other wheels are 64 bits on account of Ubuntu, Fedora, and other Linux distributions dropping 32 bit support. All 64 bit wheels are also linked with 64 bit integer OpenBLAS, which should fix the occasional problems encountered by folks using truly huge arrays.

    Expired deprecations

    Deprecated numeric style dtype strings have been removed

    Using the strings "Bytes0", "Datetime64", "Str0", "Uint32", and "Uint64" as a dtype will now raise a TypeError.

    (gh-19539)

    Expired deprecations for loads, ndfromtxt, and mafromtxt in npyio

    numpy.loads was deprecated in v1.15, with the recommendation that users use pickle.loads instead. ndfromtxt and mafromtxt were both deprecated in v1.17 - users should use numpy.genfromtxt instead with the appropriate value for the usemask parameter.

    (gh-19615)

    ... (truncated)

    dependencies 
    opened by dependabot[bot] 0
  • Discussion : Hotword's accuracy too low

    If you are experimenting with your own custom hotwords and some of them do not work well, feel free to use the discussion thread: https://github.com/Ant-Brain/EfficientWord-Net/discussions/4

    opened by TheSeriousProgrammer 0
  • Hotword matches without any utterance

    Hi, first of all, thanks for making this library, it's fantastic! I understand that it is at a very early stage, so it will have some issues and will get better over time. I was trying the given hotword detector example and attached speech recognition to run after the hotword triggers, but the behavior is quite messy. To demonstrate this, I am including this GIF.

    Code_nyP9cJMBBg

    Problem 1: I call speech recognition right after a match. As soon as speech recognition ends, the detector again reports that the hotword was uttered, with a confidence score, and starts listening again, even though no hotword was spoken.

    Problem 2: In some situations it also matches on a small click or a desk sound.

    Is there a fix, at least for Problem 1? I suspect Problem 2 comes from weak training, depending on the hotword.
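One possible mitigation for Problem 1, offered as a minimal sketch rather than as the library's own behavior, is to ignore matches for a short window after a trigger and again after speech recognition returns, so that audio buffered in the meantime cannot re-trigger the detector. The class name `CooldownGate`, its methods, and the 2-second default are hypothetical and are not part of EfficientWord-Net.

```python
import time


class CooldownGate:
    """Suppress repeated triggers for `cooldown_s` seconds after an accepted one.

    Hypothetical helper for illustration; not part of EfficientWord-Net.
    """

    def __init__(self, cooldown_s=2.0, clock=time.monotonic):
        self.cooldown_s = cooldown_s
        self._clock = clock
        self._blocked_until = float("-inf")

    def accept(self, matched):
        """Return True only for a match that arrives outside the cooldown window."""
        now = self._clock()
        if not matched or now < self._blocked_until:
            return False
        self._blocked_until = now + self.cooldown_s
        return True

    def rearm(self):
        """Restart the cooldown, e.g. right after speech recognition returns."""
        self._blocked_until = self._clock() + self.cooldown_s
```

In a detection loop, the match flag produced by the detector would be passed through `gate.accept(...)` before starting speech recognition, and `gate.rearm()` would be called once recognition finishes. Discarding the frames that accumulated in the microphone buffer during recognition is a complementary option.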

    opened by OnlinePage 2
  • OSError: [Errno -9981] Input overflowed

    I've installed the Python library with

    pip install EfficientWord-Net

    onto a Raspberry Pi 2 with a recent Raspbian Lite.

    However, if I run the demo with python -m eff_word_net.engine, I get the following error and nothing works:

    Say Mycroft / Alexa / Siri
    Traceback (most recent call last):
      File "/usr/lib/python3.9/runpy.py", line 197, in _run_module_as_main
        return _run_code(code, main_globals, None,
      File "/usr/lib/python3.9/runpy.py", line 87, in _run_code
        exec(code, run_globals)
      File "/home/max/.local/lib/python3.9/site-packages/eff_word_net/engine.py", line 333, in <module>
        frame = mic_stream.getFrame()
      File "/home/max/.local/lib/python3.9/site-packages/eff_word_net/streams.py", line 49, in getFrame
        new_frame = self._get_next_frame()
      File "/home/max/.local/lib/python3.9/site-packages/eff_word_net/streams.py", line 85, in <lambda>
        np.frombuffer(mic_stream.read(CHUNK),dtype=np.int16)
      File "/usr/lib/python3/dist-packages/pyaudio.py", line 608, in read
        return pa.read_stream(self._stream, num_frames, exception_on_overflow)
    OSError: [Errno -9981] Input overflowed
    

    Any ideas?
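The last frames of the traceback show PyAudio's `read` forwarding an `exception_on_overflow` argument. A common workaround on slow boards, sketched here under the assumption that dropping a few samples is acceptable, is to read with `exception_on_overflow=False` so an overflow no longer raises `OSError`. The function name `read_int16_frame` is hypothetical, and this is not necessarily how the library's `streams.py` should be patched.

```python
from array import array


def read_int16_frame(stream, chunk):
    """Read `chunk` samples without raising on input overflow.

    `stream` is assumed to behave like a PyAudio input stream, whose
    read() accepts an `exception_on_overflow` keyword argument.
    """
    raw = stream.read(chunk, exception_on_overflow=False)
    samples = array("h")  # signed 16-bit integers, native byte order
    samples.frombytes(raw)
    return samples
```

Suppressing the exception hides the symptom only. If overflows are frequent, the processing loop is probably too slow for the device, and a larger buffer size may be worth trying as well.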

    opened by mKenfenheuer 1
  • complex hotwords support #Current Model Limitations Discussion

    Hi, thanks for your helpful research. I wonder whether the current model can handle multi-word hotwords like "Hey Siri", or only single words like "Siri"?

    My second question is about hotwords that take more than 1 s to pronounce, like "Hey XXXX". Does your model support changing the recording duration?

    Did you try using cosine_similarity instead of Euclidean distance at inference time?

    Thanks.
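For context on the last question, the two measures can be compared directly on a pair of embedding vectors. The sketch below uses plain Python lists as stand-ins for model embeddings; the function names are illustrative and are not the library's API. If the embeddings are L2-normalized (an assumption here, not something stated about the model), the two measures rank candidates identically, because the squared Euclidean distance equals 2 - 2 * cosine similarity.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def euclidean_distance(a, b):
    """Straight-line distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

With unnormalized embeddings the two measures can disagree, since Euclidean distance is also sensitive to vector length.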

    enhancement 
    opened by amoazeni75 7
  • Problem with Dependencies #Docker Support

    Hello, I left a comment on Reddit saying I would give it a go, and you said to log any problem here, so here I am, with a problem 😊

    I get stuck at pip3 install librosa with this error:

        Failed building wheel for llvmlite
        Running setup.py clean for llvmlite
        Failed to build llvmlite

    I can push on and get EfficientWord installed and working: if I say "Alexa" it says "Yup I hear ya".

    The problem comes when I try to create my own wake word. I run this command … python3 -m eff_word_net.generate_reference

        [email protected]:~ $ python3 -m eff_word_net.generate_reference
        Paste Path of folder Containing audio files:/home/pi/wakewords
        Paste Path of location to save *_ref.json :/home/pi/wakewords
        Enter Wakeword Name :bender
        Traceback (most recent call last):
          File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
            "__main__", mod_spec)
          File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
            exec(code, run_globals)
          File "/home/pi/.local/lib/python3.7/site-packages/eff_word_net/generate_reference.py", line 80, in <module>
            input("Enter Wakeword Name :")
          File "/home/pi/.local/lib/python3.7/site-packages/eff_word_net/generate_reference.py", line 47, in generate_reference_file
            x,_ = librosa.load(audio_file,sr=16000)
        AttributeError: module 'librosa' has no attribute 'load'

    My problem is with librosa: I am not able to install it. I tried everything I could find on Google, but it never installs.

    How did you get around this problem?
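An error such as `module 'librosa' has no attribute 'load'` after a failed install often means Python is importing something other than a complete package, for example a leftover directory. A small diagnostic along the following lines can show where a module is loaded from and whether it exposes the expected attribute. The helper name `describe_module` is hypothetical, and this is only a way to inspect the situation, not a fix for the llvmlite build failure.

```python
import importlib
import importlib.util


def describe_module(name, attr):
    """Report where `name` is imported from and whether it exposes `attr`."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return {"installed": False, "origin": None, "has_attr": False}
    module = importlib.import_module(name)
    return {
        "installed": True,
        # Typically None for a namespace package, such as a leftover
        # directory without an __init__.py.
        "origin": spec.origin,
        "has_attr": hasattr(module, attr),
    }
```

Calling `describe_module("librosa", "load")` on the affected machine would show whether the import resolves to a real `__init__.py` inside site-packages or to an incomplete remnant of the failed installation.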

    enhancement good first issue raspberrypi wake_word_generation 
    opened by Balro76 3
Releases(stable)