Gammatone-based spectrograms, using gammatone filterbanks or Fourier transform weightings.

Gammatone Filterbank Toolkit

Utilities for analysing sound using perceptual models of human hearing.

Jason Heeris, 2013

Summary

This is a port of Malcolm Slaney's and Dan Ellis' gammatone filterbank MATLAB code, detailed below, to Python 2 and 3 using NumPy and SciPy. It analyses signals by running them through banks of gammatone filters, in a way similar to Fourier-based spectrogram analysis.

Gammatone-based spectrogram of Für Elise

Installation

You can install directly from this git repository using:

pip install git+https://github.com/detly/gammatone.git

...or you can clone the git repository however you prefer, and do:

pip install .

...or:

python setup.py install

...from the cloned tree.

Dependencies

  • numpy
  • scipy
  • nose
  • mock
  • matplotlib

Using the Code

See the API documentation. For a demonstration, find a .wav file (for example, Für Elise) and run:

python -m gammatone FurElise.wav -d 10

...to see a gammatone-gram of the first ten seconds of the track. If you've installed via pip or setup.py install, you should also be able to just run:

gammatone FurElise.wav -d 10

Basis

This project is based on research into how humans perceive audio, originally published by Malcolm Slaney:

Malcolm Slaney (1998) "Auditory Toolbox Version 2", Technical Report #1998-010, Interval Research Corporation, 1998.

Slaney's report describes a way of modelling how the human ear perceives, emphasises and separates different frequencies of sound. A series of gammatone filters is constructed whose bandwidth increases with centre frequency, and this bank of filters is applied to a time-domain signal. The result is a spectrum that should represent the human experience of sound better than, say, a Fourier-domain spectrum would.

A gammatone filter has an impulse response that is a sine wave multiplied by a gamma distribution function. It is a common approach to modelling the auditory system.
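
As a rough illustration of that description, the impulse response is commonly written as g(t) = t^(n-1) exp(-2 pi b t) cos(2 pi f_c t), where the first two factors have the shape of a gamma distribution's density. The sketch below samples this expression using only the standard library. The function names, the filter order n = 4, the bandwidth factor 1.019 and the Glasberg and Moore bandwidth formula are conventional choices assumed here for illustration; they are not taken from this toolkit's code.

```python
import math

def erb_bandwidth(centre_hz):
    """Equivalent rectangular bandwidth in Hz (Glasberg and Moore approximation)."""
    return 24.7 * (4.37 * centre_hz / 1000.0 + 1.0)

def gammatone_impulse_response(centre_hz, sample_rate, duration=0.025, order=4):
    """Sample t^(n-1) * exp(-2*pi*b*t) * cos(2*pi*fc*t), scaled to a peak of 1."""
    b = 1.019 * erb_bandwidth(centre_hz)          # assumed bandwidth parameter
    num_samples = int(round(duration * sample_rate))
    response = []
    for k in range(num_samples):
        t = k / float(sample_rate)
        # Gamma-shaped envelope: rises as t^(n-1), then decays exponentially.
        envelope = t ** (order - 1) * math.exp(-2.0 * math.pi * b * t)
        # Sine-wave carrier at the filter's centre frequency.
        response.append(envelope * math.cos(2.0 * math.pi * centre_hz * t))
    peak = max(abs(v) for v in response)
    return [v / peak for v in response]

# 25 ms of a 1 kHz gammatone filter sampled at 16 kHz.
ir = gammatone_impulse_response(1000.0, 16000)
```

The response starts at zero, swells over the first few milliseconds and then dies away, which is why a short truncated copy is a reasonable stand-in for the full filter in small experiments.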

The gammatone filterbank approach can be considered analogous (but not equivalent) to a discrete Fourier transform where the frequency axis is logarithmic. For example, a series of notes spaced an octave apart would appear to be roughly linearly spaced, and a sound occupying a band of fixed width in hertz would appear more spread out if that band sat at lower frequencies.
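
To make the spacing concrete, the sketch below maps a run of octave-spaced notes onto the ERB-rate scale, a warped frequency axis often used to place gammatone centre frequencies. It uses the Glasberg and Moore formula as a stand-in; the toolkit's own spacing function may differ in detail, and all function names here are hypothetical.

```python
import math

def hz_to_erb_rate(freq_hz):
    """Map a frequency in Hz to the ERB-rate scale (Glasberg and Moore)."""
    return 21.4 * math.log10(1.0 + 0.00437 * freq_hz)

def erb_rate_to_hz(erb_rate):
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (erb_rate / 21.4) - 1.0) / 0.00437

def erb_spaced_centres(low_hz, high_hz, count):
    """Return `count` centre frequencies evenly spaced on the ERB-rate scale."""
    low, high = hz_to_erb_rate(low_hz), hz_to_erb_rate(high_hz)
    step = (high - low) / (count - 1)
    return [erb_rate_to_hz(low + i * step) for i in range(count)]

# Six notes an octave apart: A2 (110 Hz) up to A7 (3520 Hz).
octaves = [110.0 * 2 ** k for k in range(6)]
positions = [hz_to_erb_rate(f) for f in octaves]

# Gaps between neighbouring notes on each axis.
hz_steps = [b - a for a, b in zip(octaves, octaves[1:])]
erb_steps = [b - a for a, b in zip(positions, positions[1:])]

# Eight filter centre frequencies between 100 Hz and 8 kHz.
centres = erb_spaced_centres(100.0, 8000.0, 8)
```

On the linear axis the gaps between octaves double each time, while on the warped axis they stay within a small factor of one another. Likewise, the gaps between neighbouring centre frequencies grow with frequency when measured in hertz.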

The real goal of this toolkit is to allow easy computation of the gammatone equivalent of a spectrogram — a time-varying spectrum of energy over audible frequencies based on a gammatone filterbank.
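
The idea can be sketched in a few lines: filter the signal with each gammatone filter, then average the squared output over short frames. The version below uses truncated impulse responses and direct convolution for clarity. It is a minimal sketch under those assumptions, not the toolkit's implementation, and every name in it is hypothetical.

```python
import math

def gammatone_fir(centre_hz, sample_rate, duration=0.02, order=4):
    """Truncated gammatone impulse response with roughly unit gain at centre_hz."""
    b = 1.019 * 24.7 * (4.37 * centre_hz / 1000.0 + 1.0)   # bandwidth in Hz
    count = int(round(duration * sample_rate))
    times = [k / float(sample_rate) for k in range(count)]
    envelope = [t ** (order - 1) * math.exp(-2.0 * math.pi * b * t) for t in times]
    scale = 2.0 / sum(envelope)                              # approximate gain normalisation
    return [scale * e * math.cos(2.0 * math.pi * centre_hz * t)
            for e, t in zip(envelope, times)]

def fir_filter(signal, taps):
    """Causal convolution; the output has the same length as the input."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k in range(min(n + 1, len(taps))):
            acc += taps[k] * signal[n - k]
        out.append(acc)
    return out

def gammatone_gram(signal, sample_rate, centres, frame_len):
    """Rows are channels; columns are frames of mean squared filter output."""
    rows = []
    for centre_hz in centres:
        filtered = fir_filter(signal, gammatone_fir(centre_hz, sample_rate))
        frames = [filtered[i:i + frame_len]
                  for i in range(0, len(filtered) - frame_len + 1, frame_len)]
        rows.append([sum(v * v for v in frame) / frame_len for frame in frames])
    return rows

# A 0.1 s, 1 kHz test tone analysed with four channels and 25 ms frames.
sample_rate = 8000
tone = [math.sin(2.0 * math.pi * 1000.0 * n / sample_rate) for n in range(800)]
centres = [250.0, 500.0, 1000.0, 2000.0]
gram = gammatone_gram(tone, sample_rate, centres, frame_len=200)
```

Once the filters have settled, almost all of the energy sits in the 1 kHz row. Direct convolution is slow for real recordings, which is part of the motivation for the FFT-based approximation described next.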

Slaney demonstrated his research with an initial implementation in MATLAB. This implementation was later extended by Dan Ellis, who found a way to approximate a "gammatone-gram" by using the fast Fourier transform. Ellis' code calculates a matrix of weights that can be applied to the output of an FFT so that a Fourier-based spectrogram can easily be transformed into such an approximation.
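
The weighting idea can be illustrated with a simplified stand-in: build one row of weights per gammatone channel, each giving the filter's approximate squared magnitude response at every FFT bin, then multiply that matrix by a column of the power spectrogram. The weights below use the textbook approximation (1 + ((f - f_c)/b)^2)^(-n) and are not Ellis' actual weight calculation. The function names are hypothetical, and a naive DFT is used only to keep the example dependency-free.

```python
import cmath
import math

def power_spectrum(frame):
    """Naive DFT power for bins 0..N/2 (an FFT would be used in practice)."""
    n = len(frame)
    bins = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        bins.append(abs(acc) ** 2)
    return bins

def gammatone_weights(centres, sample_rate, n_fft, order=4):
    """One row per channel: approximate squared gammatone response at each bin."""
    weights = []
    for centre_hz in centres:
        b = 1.019 * 24.7 * (4.37 * centre_hz / 1000.0 + 1.0)   # bandwidth in Hz
        row = []
        for k in range(n_fft // 2 + 1):
            bin_hz = k * sample_rate / float(n_fft)
            offset = (bin_hz - centre_hz) / b
            row.append((1.0 + offset * offset) ** (-order))
        weights.append(row)
    return weights

def apply_weights(weights, spectrum):
    """Matrix-vector product: channel energies from one spectrum column."""
    return [sum(w * p for w, p in zip(row, spectrum)) for row in weights]

# One 256-sample frame of a 1 kHz tone, which falls exactly on bin 32.
sample_rate, n_fft = 8000, 256
frame = [math.sin(2.0 * math.pi * 1000.0 * i / sample_rate) for i in range(n_fft)]
centres = [250.0, 500.0, 1000.0, 2000.0]
weights = gammatone_weights(centres, sample_rate, n_fft)
energies = apply_weights(weights, power_spectrum(frame))
```

Because the weight matrix depends only on the sample rate, FFT size and channel layout, it can be computed once and reused for every frame of the spectrogram.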

Ellis' code and documentation are here: Gammatone-like spectrograms

Interest

I became interested in this because of my background in science communication and my general interest in the teaching of signal processing. I find that the spectrogram approach to visualising signals is adequate for illustrating abstract systems or the mathematical properties of transforms, but bears little correspondence to a person's own experience of sound. If someone wants to see what their favourite piece of music "looks like," a normal Fourier transform based spectrogram is actually quite a poor way to visualise it. Features of the audio seem to be oddly spaced or unnaturally emphasised or de-emphasised depending on where they are in the frequency domain.

The gammatone filterbank approach seems to be closer to what someone might intuitively expect a visualisation of sound to look like, and can help develop an intuition about alternative representations of signals.

Verifying the port

Since this is a port of existing MATLAB code, I've written tests to verify the Python implementation against the original code. These tests aren't unit tests, but they do generally test single functions. Running them is a two-step process:

  1. Run the scripts in the test_generation directory. This will create a .mat file containing test data in tests/data.

  2. Run nosetests3 in the top-level directory. This will find and run all the tests in the tests directory.

Although I'm usually loath to check generated files in to version control, I'm willing to make an exception for the .mat files containing the test data. My reasoning is that they represent the decoupling of my code from the MATLAB code, and if the two projects were separated, they would be considered a part of the Python code, not the original MATLAB code.
