Analysis of voices based on the Mel-frequency band

Last update: Feb 06, 2022

Related tags

Audio Speaker_partition_module

Overview

Speaker_partition_module

Analysis of voices based on the Mel-frequency band.
Goal: Identification of voices speaking (diarization) and calculation of speech partition (in %).

Methodology:

Collect voice data
Sample audio data of x speakers that talk y times to represent a round of people talking
Annotate samples with labels and merge audio file
Create train & test split of samples
Train unsupervised clustering module to detect number of people
Train supervised RNN classifier to determine who is speaking at time x

Preprocessing

Convert files to .wav convertFlac2Wav.py
Collect data via LibriSpeech voices library (audiofiles) audio_manipulation02.py
Extract x random speakers with y audio samples per speaker Result: Generated audio samples of length 30-60 seconds

Feature extraction:

Create mel-frequency spectrum for each audio file feature_extraction.py
Define overlapping feature window for training

Training:

Implementation of google-diarizer module
Training accuracy is only at 40 %

Further activity

Create own unsupervised clustering module
Try out different libraries

Owner

GitHub Repository

Free and Open Source Channel/Group Voice chat music player for telegram with button support saavn playback support.

A bot that can play music on Telegram Group and Channel Voice Chats

1 Oct 27, 2021

Extract the songs from your osu! libary into proper mp3 form, complete with metadata and album art!

osu-Extract Extract the songs from your osu! libary into proper mp3 form, complete with metadata and album art! Requirements python3 mutagen pillow Us

2 Mar 09, 2022

Code for "Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose"

Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose We provide PyTorch implementations for our arxiv paper "Audio-dr

497 Jan 09, 2023

Synchronize a local directory of songs' (MP3, MP4) metadata (genre, ratings) and playlists with a Plex server.

PlexMusicSync Synchronize a local directory of songs' (MP3, MP4) metadata (genre, ratings) and playlists (m3u, m3u8) with a Plex server. The song file

9 Jul 07, 2022

SomaFM Plugin for Kodi

SomaFM XBMC Plugin This description is a bit outdated. You can simply install this addon by browsing the official repositories from within Kodi. Insta

7 Jan 21, 2022

A collection of free MIDI chords and progressions ready to be used in your DAW, Akai MPC, or Roland MC-707/101

A collection of free MIDI chords and progressions ready to be used in your DAW, Akai MPC, or Roland MC-707/101

921 Jan 05, 2023

Simple discord bot by @merive 🤖

Parzibot Powerful and Useful Discord Bot on Python. The source code of the bot is available to everyone. Parzibot uses English language. This is free

3 Dec 28, 2022

An audio digital processing toolbox based on a workflow/pipeline principle

AudioTK Audio ToolKit is a set of audio filters. It helps assembling workflows for specific audio processing workloads. The audio workflow is split in

238 Oct 18, 2022

:sound: Play and Record Sound with Python :snake:

Play and Record Sound with Python This Python module provides bindings for the PortAudio library and a few convenience functions to play and record Nu

750 Dec 31, 2022

All-In-One Digital Audio Workstation and Plugin Suite

How to install Windows Mac OS X Fedora Ubuntu How to Build Debian and Ubuntu Fedora All Other Linux Distros Mac OS X Windows What is MusiKernel? MusiK

111 Sep 21, 2021

Implicit neural differentiable FM synthesizer

Implicit neural differentiable FM synthesizer The purpose of this project is to emulate arbitrary sounds with FM synthesis, where the parameters of th

34 Nov 06, 2022

Sequencer: Deep LSTM for Image Classification

Sequencer: Deep LSTM for Image Classification Created by Yuki Tatsunami Masato Taki This repository contains implementation for Sequencer. Abstract In

111 Dec 16, 2022

Audio2midi - Automatic Audio-to-symbolic Arrangement

Automatic Audio-to-symbolic Arrangement This is the repository of the project "A

24 Dec 05, 2022

Improved Python UI to convert Youtube URL to .mp3 file.

YT-MP3 Improved Python UI to convert Youtube URL to .mp3 file. How to use? Just run python3 main.py Enter the URL of the video Enter the PATH of where

8 Jun 19, 2022

Code for paper 'Audio-Driven Emotional Video Portraits'.

Audio-Driven Emotional Video Portraits [CVPR2021] Xinya Ji, Zhou Hang, Kaisiyuan Wang, Wayne Wu, Chen Change Loy, Xun Cao, Feng Xu [Project] [Paper] G

197 Dec 31, 2022

XA Music Player - Telegram Music Bot

XA Music Player Requirements 📝 FFmpeg (Latest) NodeJS nodesource.com (NodeJS 17+) Python (3.10+) PyTgCalls (Lastest) MongoDB (3.12.1) 2nd Telegram Ac

3 Jun 30, 2022

Small Python application that links a Digico console and Reaper, handling automatic marker insertion and tracking.

Digico-Reaper-Link This is a small GUI based helper application designed to help with using Digico's Copy Audio function with a Reaper DAW used for re

10 Oct 24, 2022

Codes for "Efficient Long-Range Attention Network for Image Super-resolution"

ELAN Codes for "Efficient Long-Range Attention Network for Image Super-resolution", arxiv link. Dependencies & Installation Please refer to the follow

124 Dec 22, 2022

Pyroomacoustics is a package for audio signal processing for indoor applications. It was developed as a fast prototyping platform for beamforming algorithms in indoor scenarios.

Summary Pyroomacoustics is a software package aimed at the rapid development and testing of audio array processing algorithms. The content of the pack

1k Jan 09, 2023

python script for getting mp3 files from yaoutube playlist

mp3-from-youtube-playlist python script for getting mp3 files from youtube playlist. Do your non-tech brown relatives ask you for downloading music fr

7 Oct 19, 2022