
Overview

Add_noise_and_rir_to_speech

The purpose of this code base is to add noise from the MUSAN dataset to clean speech at a specified signal-to-noise ratio, and to generate far-field speech data using room impulse response (RIR) data from the BUT Speech@FIT Reverb Database.

Noise and RIR dataset description:

  • BUT Speech@FIT Reverb Database:

    The database was built to collect a large number of various room impulse responses, room environmental noises (or "silences"), retransmitted speech (for ASR and SID testing), and meta-data (positions of microphones, speakers, etc.).

    The goal is to provide the speech community with a dataset for data enhancement and for distant-microphone or microphone-array experiments in ASR and SID.

    In this codebase we use only the RIR data, which is used to synthesize far-field speech (a minimal convolution sketch is given after the dataset descriptions below). The composition of the RIR dataset and citation details are as follows.

    | Room Name | Room Type       | Size (length x depth x height) (m) | Microphones x Loudspeakers |
    |-----------|-----------------|------------------------------------|----------------------------|
    | Q301      | Office          | 10.7 x 6.9 x 2.6                   | 31 x 3                     |
    | L207      | Office          | 4.6 x 6.9 x 3.1                    | 31 x 6                     |
    | L212      | Office          | 7.5 x 4.6 x 3.1                    | 31 x 5                     |
    | L227      | Stairs          | 6.2 x 2.6 x 14.2                   | 31 x 5                     |
    | R112      | Hotel room      | 4.4 x 2.8 x 2.6                    | 31 x 5                     |
    | CR2       | Conference room | 28.2 x 11.1 x 3.3                  | 31 x 4                     |
    | E112      | Lecture room    | 11.5 x 20.1 x 4.8                  | 31 x 2                     |
    | D105      | Lecture room    | 17.2 x 22.8 x 6.9                  | 31 x 6                     |
    | C236      | Meeting room    | 7.0 x 4.1 x 3.6                    | 31 x 10                    |
    @ARTICLE{8717722,
             author={Szöke, Igor and Skácel, Miroslav and Mošner, Ladislav and Paliesek, Jakub and Černocký, Jan},
             journal={IEEE Journal of Selected Topics in Signal Processing}, 
             title={Building and evaluation of a real room impulse response dataset}, 
             year={2019},
             volume={13},
             number={4},
             pages={863-876},
             doi={10.1109/JSTSP.2019.2917582}
     }
    
  • MUSAN database:

    The database consists of music from several genres, speech in twelve languages, and a wide assortment of technical and non-technical noises; we use only the noise portion of this database. Citation details are as follows.

    @misc{snyder2015musan,
          title={MUSAN: A Music, Speech, and Noise Corpus}, 
          author={David Snyder and Guoguo Chen and Daniel Povey},
          year={2015},
          eprint={1510.08484},
          archivePrefix={arXiv},
          primaryClass={cs.SD}
    }
    

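For intuition, far-field (reverberant) speech is produced by convolving a clean utterance with a room impulse response. The snippet below is a minimal sketch of that operation, assuming single-channel WAV files and using scipy.signal.fftconvolve; the file names are placeholders, and the repository's mix_cleanaudio_with_rir_offline.py may differ in details such as normalization and channel selection.

    import torch
    import torchaudio
    from scipy.signal import fftconvolve

    # Load a clean utterance and a room impulse response (placeholder paths).
    speech, sr = torchaudio.load('clean.wav')    # shape: (1, num_samples)
    rir, rir_sr = torchaudio.load('rir.wav')     # shape: (1, rir_length)
    assert sr == rir_sr, 'resample one of the signals if the sample rates differ'

    # Convolve the speech with the RIR and trim back to the original length.
    reverberant = fftconvolve(speech.numpy()[0], rir.numpy()[0], mode='full')[:speech.shape[1]]

    # Rescale to avoid clipping and save the far-field version.
    reverberant = reverberant / max(1e-8, abs(reverberant).max())
    torchaudio.save('clean_reverb.wav', torch.from_numpy(reverberant).unsqueeze(0).float(), sr)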
Before using the data-processing code:

  • If you do not want the original datasets to be overwritten, download a separate copy of them for these experiments.

  • You need to create three files, 'training_list.txt', 'validation_list.txt' and 'testing_list.txt', listing the file paths of your training, validation and test data respectively, and ensure the audio at those paths can be read and written (a small helper sketch for generating such lists is shown after this list).

  • The content of each of these '*_list.txt' files has the following form:

    *_list.txt
    	/../...../*.wav
    	/../...../*.wav
    	/../...../*.wav
    
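One possible way to generate these list files is to walk a directory of .wav files and write out their absolute paths, as in the hypothetical helper below (the directory layout and the train/validation/test split are assumptions; adapt them to your own data):

    import os

    def write_list(wav_root, output_file):
        """Write the absolute path of every .wav file under wav_root to output_file."""
        with open(output_file, 'w') as f:
            for dirpath, _, filenames in os.walk(wav_root):
                for name in sorted(filenames):
                    if name.endswith('.wav'):
                        f.write(os.path.abspath(os.path.join(dirpath, name)) + '\n')

    # Example: one list per split, assuming the splits live in separate folders.
    write_list('/data/speech/train', 'training_list.txt')
    write_list('/data/speech/valid', 'validation_list.txt')
    write_list('/data/speech/test', 'testing_list.txt')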

Instruction for using the following data-processing code:

  1. mix_cleanaudio_with_rir_offline.py: Generate far-field speech offline

    • Two parameters are needed:

      • --data_root: the path where the RIR dataset will be downloaded and stored.
      • --clean_data_list_path: the path of the folder in which 'training_list.txt', 'validation_list.txt' and 'testing_list.txt' are stored.
    • Two folders will be created in data_root: 'ReverDB_data' (removable if needed) and 'ReverDB_mix'. An example invocation is given after the code listing below.

  2. download_and_extract_noise_file.py: Generate the MUSAN noise files

    • One parameter is needed:
      • --data_root: the path where the noise dataset will be downloaded and stored.
    • Two folders will be created in data_root: 'musan' (removable if needed) and 'noise'. An example invocation is given after the code listing below.
  3. vad_torch.py: Voice activity detection when adding noise to the speech

    The noise data is usually added online according to the SNR requirements. Several pieces of code are provided below; please add them in the appropriate places according to your needs.

    import torchaudio
    import numpy as np
    import torch
    import random
    from vad_torch import VoiceActivityDetector
    
    
    def _add_noise(speech_sig, vad_duration, noise_sig, snr):
        """add noise to the audio.
        :param speech_sig: The input audio signal (Tensor).
        :param vad_duration: The length of the human voice (int).
        :param noise_sig: The input noise signal (Tensor).
        :param snr: the SNR you want to add (int).
        :returns: noisy speech sig with specific snr.
        """
        if vad_duration != 0:
            snr = 10**(snr/10.0)
            speech_power = torch.sum(speech_sig**2)/vad_duration
            noise_power = torch.sum(noise_sig**2)/noise_sig.shape[1]
            noise_update = noise_sig / torch.sqrt(snr * noise_power/speech_power)
    
            if speech_sig.shape[1] > noise_update.shape[1]:
                # padding
                temp_wav = torch.zeros(1, speech_sig.shape[1])
                temp_wav[0, 0:noise_update.shape[1]] = noise_update
                noise_update = temp_wav
            else:
                # cutting
                noise_update = noise_update[0, 0:speech_sig.shape[1]]
    
            return noise_update + speech_sig
        
        else:
            return speech_sig
        
    def main():
        # loading speech file
        speech_file = './speech.wav'
        waveform, sr = torchaudio.load(speech_file)
        waveform = waveform - waveform.mean()

        # setting the SNR and picking a random noise file index
        snr = 0
        noise_file = random.randint(1, 930)

        # Voice activity detection
        v = VoiceActivityDetector(waveform, sr)
        raw_detection = v.detect_speech()
        speech_labels = v.convert_windows_to_readible_labels(raw_detection)
        vad_duration = 0
        if not len(speech_labels) == 0:
            for i in range(len(speech_labels)):
                start = speech_labels[i]['speech_begin']
                end = speech_labels[i]['speech_end']
                vad_duration = vad_duration + end - start

        # adding noise
        noise, _ = torchaudio.load('/notebooks/noise/' + str(noise_file) + '.wav')
        waveform = _add_noise(waveform, vad_duration, noise, snr)

    if __name__ == '__main__':
        main()
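
The two offline data-generation scripts above are typically run from the command line; the invocations below are a sketch with placeholder paths (adjust them to your own directory layout):

    python download_and_extract_noise_file.py --data_root /path/to/data_root
    python mix_cleanaudio_with_rir_offline.py --data_root /path/to/data_root --clean_data_list_path /path/to/list_folder

After main() above has run, the noisy waveform is only held in memory; if you need it on disk, it can be written out with torchaudio.save (e.g. torchaudio.save('noisy.wav', waveform, sr)).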