Analysis of voices based on the Mel-frequency band

Overview

Speaker_partition_module

Goal: Identify who is speaking when (speaker diarization) and calculate each speaker's share of the total speaking time (in %).

Methodology:

  • Collect voice data
  • Sample audio from x speakers who each talk y times to simulate a round of people talking
  • Annotate the samples with speaker labels and merge them into a single audio file
  • Create a train & test split of the samples
  • Train an unsupervised clustering module to estimate the number of speakers
  • Train a supervised RNN classifier to determine who is speaking at time t
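
Once a classifier assigns a speaker label to every time frame, the speech partition from the goal above reduces to counting frames per speaker. The following is a minimal sketch of that step only, assuming frames of equal duration and `None` for silent frames; the function name `speech_partition` and the example labels are hypothetical and not taken from the repository.

```python
from collections import Counter


def speech_partition(frame_labels):
    """Return each speaker's share of the labelled frames, in percent.

    frame_labels: one speaker label per frame; None marks silence
    (an assumption of this sketch) and is ignored.
    """
    counts = Counter(label for label in frame_labels if label is not None)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {speaker: 100.0 * n / total for speaker, n in counts.items()}


# Hypothetical per-frame output of a diarization model.
labels = ["A", "A", "B", None, "A", "B"]
shares = speech_partition(labels)
for speaker, percent in sorted(shares.items()):
    print(f"{speaker}: {percent:.1f} %")
```

Silent frames are excluded from the total here, so the shares always sum to 100 %; counting silence as its own class would be an equally valid choice.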

Preprocessing

  • Convert files to .wav (convertFlac2Wav.py)
  • Collect audio files from the LibriSpeech corpus (audio_manipulation02.py)
  • Extract x random speakers with y audio samples per speaker. Result: generated audio samples 30 to 60 seconds long
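
The speaker sampling step can be sketched as follows. This is an illustration under stated assumptions, not the logic of audio_manipulation02.py: the helper `build_round`, the dictionary layout, and the file names are all hypothetical. It picks x random speakers, draws y utterances per speaker, and shuffles the turns; the returned list of (speaker, file) pairs doubles as the annotation for the merged audio.

```python
import random


def build_round(utterances_by_speaker, num_speakers, turns_per_speaker, seed=None):
    """Pick random speakers and utterances, then shuffle them into one round.

    utterances_by_speaker: dict mapping a speaker id to a list of audio paths.
    Returns a list of (speaker_id, path) tuples in playback order.
    """
    rng = random.Random(seed)
    # sorted() gives a stable sequence so that a fixed seed is reproducible.
    speakers = rng.sample(sorted(utterances_by_speaker), num_speakers)
    turns = []
    for speaker in speakers:
        paths = rng.sample(utterances_by_speaker[speaker], turns_per_speaker)
        turns.extend((speaker, path) for path in paths)
    rng.shuffle(turns)
    return turns


# Hypothetical corpus index; real paths would point at converted .wav files.
corpus = {
    "spk1": ["spk1_a.wav", "spk1_b.wav", "spk1_c.wav"],
    "spk2": ["spk2_a.wav", "spk2_b.wav", "spk2_c.wav"],
    "spk3": ["spk3_a.wav", "spk3_b.wav", "spk3_c.wav"],
}
conversation = build_round(corpus, num_speakers=2, turns_per_speaker=2, seed=0)
for speaker, path in conversation:
    print(speaker, path)
```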

Feature extraction:

  • Create a mel-frequency spectrum for each audio file (feature_extraction.py)
  • Define overlapping feature windows for training
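
Overlapping feature windows can be cut from the sequence of spectrum frames with a sliding window whose hop is smaller than its length. The sketch below assumes the features are already available as a list of per-frame vectors; the function name `sliding_windows` and the window and hop sizes are illustrative choices, not values from feature_extraction.py.

```python
def sliding_windows(frames, window_size, hop_size):
    """Split a frame sequence into windows that overlap when hop_size < window_size.

    Trailing frames that do not fill a complete window are dropped.
    """
    if window_size <= 0 or hop_size <= 0:
        raise ValueError("window_size and hop_size must be positive")
    last_start = len(frames) - window_size
    return [
        frames[start:start + window_size]
        for start in range(0, last_start + 1, hop_size)
    ]


# Ten dummy frames with three mel bands each stand in for a real spectrum.
frames = [[float(i)] * 3 for i in range(10)]
windows = sliding_windows(frames, window_size=4, hop_size=2)
print(len(windows), "windows of", len(windows[0]), "frames each")
```

With a window of 4 frames and a hop of 2, consecutive windows share half of their frames, which gives the classifier several views of every speaker change.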

Training:

  • Implementation of google-diarizer module
  • Training accuracy is currently only 40%

Further activity

  • Create our own unsupervised clustering module
  • Try out different libraries