C++ library for audio and music analysis, description and synthesis, including Python bindings

Last update: Jan 03, 2023

Overview

Essentia

Essentia is an open-source C++ library for audio analysis and audio-based music information retrieval released under the Affero GPL license. It contains an extensive collection of reusable algorithms which implement audio input/output functionality, standard digital signal processing blocks, statistical characterization of data, and a large set of spectral, temporal, tonal and high-level music descriptors. The library is also wrapped in Python and includes a number of predefined executable extractors for the available music descriptors, which makes it suitable for fast prototyping and for setting up research experiments quickly. Furthermore, it includes a Vamp plugin to be used with Sonic Visualiser for visualization purposes. Essentia is designed with a focus on the robustness of the provided music descriptors and is optimized in terms of the computational cost of the algorithms. The provided functionality, specifically the music descriptors and signal processing algorithms included out of the box, is easily expandable and allows for both research experiments and development of large-scale industrial applications.

Documentation online: http://essentia.upf.edu

Installation

The library is cross-platform and currently supports Linux, Mac OS X, Windows, iOS and Android systems. Installation instructions are available in the online documentation.

You can download and use prebuilt static binaries for a number of Essentia's command-line music extractors instead of installing the complete library; see doc/sphinxdoc/extractors_out_of_box.rst.

Quick start

Quick start using python:

Command-line tools to compute common music descriptors: doc/sphinxdoc/extractors_out_of_box.rst

Asking for help

Read frequently asked questions
Create an issue on GitHub if your question has not been answered before

Versions

Official releases:

https://github.com/MTG/essentia/releases

Github branches:

master: the most up-to-date version of Essentia (Ubuntu 14.10 or higher, OSX); if you run into a problem, try it first.

If you use example extractors (located in src/examples), or your own code employing Essentia algorithms to compute descriptors, you should be aware of possible incompatibilities when using different versions of Essentia.

How to contribute

We are more than happy to collaborate and receive your contributions to Essentia. The best way to submit your code is to create pull requests to our GitHub repository following our contribution policy. By submitting your code you certify that it complies with the Developer's Certificate of Origin. For more details see: http://essentia.upf.edu/documentation/contribute.html

You are also more than welcome to suggest any improvements, including proposals for new algorithms, etc.

Comments

Remove support for libswresample as we have libavresample
I've installed all of the dependencies that I can uncover, and when I do: $ ./waf configure --mode=release --with-python --with-examples --with-vamp --with-cpptest

I get:

Setting top to : /home/roger/AudioSignalProcessing/essentia-2.0.1
Setting out to : /home/roger/AudioSignalProcessing/essentia-2.0.1/build
→ configuring the project in /home/roger/AudioSignalProcessing/essentia-2.0.1
→ Building in release mode
Checking for 'g++' (c++ compiler) : /usr/bin/g++
Checking for 'gcc' (c compiler) : /usr/bin/gcc
Checking for program pkg-config : /usr/bin/pkg-config
Checking for 'libavcodec' : yes
Checking for 'libavformat' : yes
Checking for 'libavutil' : yes
Checking for 'libswresample' : yes
Checking for 'taglib' : yes
Checking for 'yaml-0.1' : yes
Checking for 'fftw3f' : yes
Checking for 'samplerate' : yes
Checking for 'gaia2' : yes
Checking for program python : /usr/bin/python
Checking for python version : (2, 7, 6, 'final', 0)
Checking for library python2.7 in LIBDIR : yes
Checking for program /usr/bin/python-config,python2.7-config,python-config-2.7,python2.7m-config : /usr/bin/python-config
Checking for header Python.h : yes
================================ CONFIGURATION SUMMARY

FFmpeg / libav detected! The following algorithms will be included: ['AudioLoader', 'MonoLoader', 'EqloudLoader', 'EasyLoader', 'MonoWriter', 'AudioWriter']

libsamplerate (SRC) detected! The following algorithms will be included: ['Resample']

TagLib detected! The following algorithms will be included: ['MetadataReader']

Gaia2 detected! The following algorithms will be included: ['GaiaTransform']
'configure' finished successfully (1.766s)

But when I do: $ ./waf

I get a bunch of errors. Some are below, and all seem to be related:

../src/essentia/utils/audiocontext.cpp: In member function ‘int essentia::AudioContext::create(const string&, const string&, int, int, int)’:
../src/essentia/utils/audiocontext.cpp:107:10: error: ‘CODEC_ID_PCM_S16LE’ was not declared in this scope
case CODEC_ID_PCM_S16LE:
^
../src/essentia/utils/audiocontext.cpp:108:10: error: ‘CODEC_ID_PCM_S16BE’ was not declared in this scope
case CODEC_ID_PCM_S16BE:
^
../src/essentia/utils/audiocontext.cpp:109:10: error: ‘CODEC_ID_PCM_U16LE’ was not declared in this scope
case CODEC_ID_PCM_U16LE:
^
../src/essentia/utils/audiocontext.cpp:110:10: error: ‘CODEC_ID_PCM_U16BE’ was not declared in this scope
case CODEC_ID_PCM_U16BE:
^

and I end up with:

Build failed -> task in 'essentia' failed (exit status 1): ...

Can anyone help? I am using Ubuntu 14.04.
bug
opened by rgonnering 30
configuration issue on mac (Getting pyembed flags from python-config: Could not build a python embedded interpreter)

After ./waf configure --mode=release --with-python --with-cpptests --with-examples --with-vamp

I got this

python executable ... differs from system... ... Checking for library python2.7 in LIBPATH_PYEMBED: not found Checking for library python2.7 in LIBDIR: not found Checking for library python2.7 in python_LIBPL: not found Checking for library python2.7 in $prefix/libs: not found ... Getting pyembed flags from python-config: Could not build a python embedded interpreter ...

The configuration failed

any pointer on how to resolve this? thanks.

opened by yyf 28
Probabilistic Yin and CREPE
As the monophonic pitch extraction algorithms in Essentia are out of date, it would be valuable to implement two state-of-the-art pitch extraction algorithms that offer better pitch extraction accuracy:

[x] Pyin: https://code.soundsoftware.ac.uk/projects/pyin

[ ] CREPE: https://github.com/marl/crepe

algorithms wishlist
opened by ronggong 21
GaiaTransform not found in registry

Hello,

I have compiled and installed first Gaia and then the Essentia library on my Ubuntu 16.04. I want to use the out-of-the-box streaming_extractor_music executable. When I run streaming_extractor_music without any profile, I have no problem and get a nice output.

However, when I create a profile file that includes:

highlevel:
    compute: 1
    svm_models: ['svm_models/genre_tzanetakis.history', 'svm_models/mood_sad.history']

I get a "GaiaTransform not found in the registry" error when it processes the high-level SVM models.

Any help will be appreciated.

opened by oak94 20
Allow filtering negative energy values

The PredominantPitchMelodia algorithm can return negative confidence values if guessUnvoiced=True. This adds a new option to PitchFilterMakam to automatically take the absolute value of any negative values. Also fix a problem where the octaveFilter parameter wasn't being loaded properly

opened by alastair 19
./waf build fail - TagLib
Hello, I'm trying to run the script ./waf and when I use flags --mode=release --build-static --with-python --with-cpptests --with-examples --with-vamp, I always get stuck at the file metadatareader.cpp. Stacktrace:

[338/374] Linking build/src/examples/essentia_standard_beatsmarker
[339/374] Linking build/src/examples/essentia_standard_onsetrate
src/libessentia.a(metadatareader.cpp.1.o): In function `formatString(TagLib::StringList const&)':
metadatareader.cpp:(.text+0x14f1): undefined reference to `TagLib::String::to8Bit(bool) const'
metadatareader.cpp:(.text+0x157d): undefined reference to `TagLib::String::to8Bit(bool) const'
metadatareader.cpp:(.text+0x15b4): undefined reference to `TagLib::String::to8Bit(bool) const'
src/libessentia.a(metadatareader.cpp.1.o): In function `essentia::standard::MetadataReader::compute()':
metadatareader.cpp:(.text+0x2a85): undefined reference to `TagLib::String::to8Bit(bool) const'
metadatareader.cpp:(.text+0x2c67): undefined reference to `TagLib::String::to8Bit(bool) const'
collect2: error: ld returned 1 exit status
Waf: Leaving directory `/home/kapi/essentia/build'
Build failed
-> task in 'essentia_standard_beatsmarker' failed with exit status 1 (run with -v to display more information)
-> task in 'essentia_standard_onsetrate' failed with exit status 1 (run with -v to display more information)

I tried installing both the newest (1.11.1) and one of the older (1.9) versions of TagLib. What can I do to make it work? My operating system is Ubuntu 16.04 LTS.
builds
opened by katpi 16
ConstantQ Transform?

My search in the algorithm reference documentation and a quick search of the repository proved fruitless. Is there an implementation of it (e.g. like this) available in Essentia?
algorithms wishlist

opened by constd 16
cannot import the essentia.standard nor essentia.streaming

Importing essentia works fine, but when I try to import essentia.standard or essentia.streaming I get "No module named '..........'". I don't know what the problem is.

opened by ahmed-jbeli 15
PitchYIN error on stationary signals

Hello

Lately we have used the YIN implementation in Essentia a lot. However, for many applications (speech, instruments) I found a constant error compared to other pitch estimators like RAPT.

I tried to produce some more systematic results by running a simple test script (https://gist.github.com/faroit/2ebcf956633f63d92ace) which generates a stationary sine wave of constant f0. The signal then is processed by the YIN algorithm and the mean of the estimate is compared to the (constant) ground truth.

This is what I get:

Obviously the estimation error is frequency dependent, which is expected. Above 1 kHz, however, the estimate looks unstable.

Has anyone tested the estimate in comparison to the original C Yin implementation?
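The test procedure described in this report (stationary sine of known f0, mean estimate compared to ground truth) can be sketched without Essentia. The block below is only an illustration under stated assumptions: yin_pitch is a bare-bones time-domain YIN written for this example, not Essentia's PitchYin, and the function names, the 0.1 threshold and the frame size are hypothetical choices.

```python
import math


def yin_pitch(frame, sample_rate, threshold=0.1):
    """Estimate the f0 of one frame with a minimal time-domain YIN (illustrative only)."""
    half = len(frame) // 2
    # Difference function d(tau), integrated over the first `half` samples.
    diff = [0.0] * half
    for tau in range(1, half):
        diff[tau] = sum((frame[j] - frame[j + tau]) ** 2 for j in range(half))
    # Cumulative mean normalized difference d'(tau).
    cmnd = [1.0] * half
    running = 0.0
    for tau in range(1, half):
        running += diff[tau]
        cmnd[tau] = diff[tau] * tau / running if running > 0 else 1.0
    # First lag that dips below the threshold, followed down to its local minimum.
    tau = 2
    while tau < half - 1:
        if cmnd[tau] < threshold:
            while tau + 1 < half - 1 and cmnd[tau + 1] < cmnd[tau]:
                tau += 1
            break
        tau += 1
    else:
        return 0.0  # no dip found: report the frame as unvoiced
    # Parabolic interpolation around the minimum for sub-sample lag resolution.
    a, b, c = cmnd[tau - 1], cmnd[tau], cmnd[tau + 1]
    denom = a - 2.0 * b + c
    shift = 0.5 * (a - c) / denom if denom != 0 else 0.0
    return sample_rate / (tau + shift)


def mean_estimate(f0, sample_rate=44100, frame_size=1024, n_frames=3):
    """Mean YIN estimate over consecutive frames of a stationary sine at f0."""
    n = frame_size * n_frames
    signal = [math.sin(2.0 * math.pi * f0 * i / sample_rate) for i in range(n)]
    estimates = [
        yin_pitch(signal[k * frame_size:(k + 1) * frame_size], sample_rate)
        for k in range(n_frames)
    ]
    return sum(estimates) / len(estimates)


if __name__ == "__main__":
    for f0 in (220.0, 440.0, 1000.0):
        est = mean_estimate(f0)
        print(f0, est, est - f0)
```

Running the same loop against another estimator gives a per-frequency error curve that can be compared directly.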
bug

opened by faroit 15
Experimental windows support
Hey,

Here are the modifications I made to get things building on Windows with MinGW. With the correct environment set up, it should be a case of just running these three commands:

python waf configure --prefix="C:\Program Files (x86)\CodeBlocks\MinGW"

python waf

python waf install

You need to install python and MinGW with pthreads (I used Codeblocks with built in TDM-GCC). During the configure stage it copies the dependencies into bin/include/lib in the MinGW root specified by the prefix option.

I took the built dependencies from the mingw_port and made a few changes:-

I removed the pthread headers from libav as TDM-GCC has them already.

Recompiled libsamplerate to fix def file / dll inconsistency

moved taglib headers down a level in /include/taglib and added missing "tnmap.tcc"
opened by carthach 15
not finding actual directory of libessentia.so

I am new to Linux, Python and Essentia. I am using Debian jessie.

When I call (in Python) import essentia:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/essentia/__init__.py", line 1, in <module>
    import _essentia
ImportError: libessentia.so: cannot open shared object file: No such file or directory

(I see the file in /usr/local/lib/)
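The error above comes from the dynamic loader, which is a separate question from whether the file exists on disk. A small sketch to tell the two apart, using only the standard library; the helper names and the candidate directories are hypothetical. On Linux this situation commonly arises when the directory is neither in the loader cache nor in LD_LIBRARY_PATH, but whether that is the cause here is an assumption.

```python
import ctypes
import os


def locate_shared_lib(filename, candidate_dirs):
    """Return the first existing path to `filename` in `candidate_dirs`, else None."""
    for directory in candidate_dirs:
        path = os.path.join(directory, filename)
        if os.path.isfile(path):
            return path
    return None


def loader_can_open(filename):
    """True if the dynamic loader resolves `filename` via its default search path."""
    try:
        ctypes.CDLL(filename)
        return True
    except OSError:
        return False


if __name__ == "__main__":
    on_disk = locate_shared_lib("libessentia.so", ["/usr/local/lib", "/usr/lib"])
    print("found on disk:", on_disk)
    print("loadable by name:", loader_can_open("libessentia.so"))
```

If the file is found on disk but is not loadable by name, the usual remedies are refreshing the loader cache (ldconfig, as root) or adding the directory to LD_LIBRARY_PATH.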

opened by ErnestoAcc 14
Please add FreeBSD install instructions
The Installing Essentia page could include FreeBSD installation instructions.

To install Essentia's C++ library: pkg install essentia

To install Essentia's Python bindings: pkg install py39-essentia

The FreeBSD ports are now available:

https://cgit.freebsd.org/ports/tree/audio/essentia/Makefile

https://cgit.freebsd.org/ports/tree/audio/py-essentia/Makefile
opened by yurivict 0
libessentia.so does not have a SONAME

When I build the Python binding in the FreeBSD ports framework it complains:

Error: /usr/local/lib/python3.9/site-packages/essentia/_essentia.cpython-39.so is linked to /usr/local/lib/libessentia.so which does not have a SONAME. audio/essentia needs to be fixed.

libessentia.so does not have the SONAME field set.

opened by yurivict 0
using ios_simulator results in an empty lib !
Hello all

I'm on macOs (Ventura 13.0)

I am having difficulties building Essentia for the iOS simulator. I have made all the necessary glue and call a simple essentia::init() to test the basics.

But Xcode tells me it cannot find any symbols, and indeed it appears that the resulting lib may be defective.

When running ranlib, I get this message:

/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: for architecture: i386 file: build_ios/src/libessentia.a(essentiautil.cpp.1.o) has no symbols
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: for architecture: x86_64 file: build_ios/src/libessentia.a(essentiautil.cpp.1.o) has no symbols

I attach the lib and the log in case they help.

Thank you :)
opened by simdax 3

Creating a example for EffnetDiscogs

Hello all

I have to admit I'm not very familiar with AI, so I'm struggling to create a simple example that would work as the other tensorflow examples, musicnn or vggish, in CPP.

In python I have a result with this code:


import numpy as np
from essentia.standard import MonoLoader, TensorflowPredictEffnetDiscogs

audio = MonoLoader(filename="../data/raw/blues/blues.00000.wav", sampleRate=16000)()
model = TensorflowPredictEffnetDiscogs(graphFilename="../models/discogs-effnet-bs64-1.pb")
activations = model(audio)

#    [   INFO   ] TensorflowPredict: Successfully loaded graph file: `../models/discogs-effnet-bs64-1.pb`

activations_mean = np.mean(activations, axis=0)
top_n_idx = np.argsort(activations_mean)[::-1][0]

I've just copied these files which are all similar, but it does not seem to work for me sadly.

Could anyone help? :) Thank you


#include <algorithm>  // std::find
#include <iostream>
#include <string>
#include <vector>
#include <essentia/algorithmfactory.h>
#include <essentia/streaming/algorithms/poolstorage.h>
#include <essentia/scheduler/network.h>
#include "credit_libav.h"

using namespace std;
using namespace essentia;
using namespace essentia::streaming;
using namespace essentia::scheduler;


bool hasFlag(char** begin, char** end, const string& option) {
  return find(begin, end, option) != end;
}

string getArgument(char** begin, char** end, const string& option) {
  char** iter = find(begin, end, option);
  if (iter != end && ++iter != end) return *iter;

  return string();
}

void printHelp(string fileName) {
    cout << "Usage: " << fileName << " pb_graph audio_input output_json [--help|-h] [--list-nodes|-l] [--patchwise|-p] [[--output-node|-o] node_name]" << endl;
    cout << "  -h, --help: print this help" << endl;
    cout << "  -l, --list-nodes: list the nodes in the input graph (model)" << endl;
    cout << "  -p, --patchwise: write out patch-wise predictions (one per patch) instead of averaging them" << endl;
    cout << "  -o, --output-node: node (layer) name to retrieve from the graph (default: model/Sigmoid)" << endl;
    creditLibAV();
}

vector<string> flags({"-h", "--help",
                      "-l", "--list-nodes",
                      "-p", "--patchwise",
                      "-o", "--output-node"});


int main(int argc, char* argv[]) {
  // Sanity check for the command line options.
  for (char** iter = argv; iter < argv + argc; ++iter) {
    if (**iter == '-') {
      string flag(*iter);
      if (find(flags.begin(), flags.end(), flag) == flags.end()){
        cout << argv[0] << ": invalid option '" << flag << "'" << endl;
        printHelp(argv[0]);
        exit(1);
      }
    }
  }

  string outputLayer = "PartitionedCall";

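  // The three positional arguments are required; bail out early if any is missing.
  if (argc < 4) {
    printHelp(argv[0]);
    return 1;
  }

  string graphName = argv[1];
  string audioFilename = argv[2];
  string outputFilename = argv[3];

  // Whether to output the patch-wise predictions or to average them.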
  const bool average = (hasFlag(argv, argv + argc, "--patchwise") ||
                        hasFlag(argv, argv + argc, "-p")) ? false : true;

  // register the algorithms in the factory(ies)
  essentia::init();

  Pool pool;
  Pool aggrPool;  // a pool for the aggregated predictions
  Pool* poolPtr = &pool;

  /////// PARAMS //////////////
  Real sampleRate = 16000.0;

  AlgorithmFactory& factory = streaming::AlgorithmFactory::instance();

  Algorithm* audio = factory.create("MonoLoader",
                                    "filename", audioFilename,
                                    "sampleRate", sampleRate);

  Algorithm* tfp   = factory.create("TensorflowPredictEffnetDiscogs",
                                    "graphFilename", graphName,
                                    "output", outputLayer);
  // If the output layer is empty, we have already printed the list of nodes.
  // Exit now.
  if (outputLayer.empty()){
    essentia::shutdown();

    return 0;
  }

  /////////// CONNECTING THE ALGORITHMS ////////////////
  cout << "-------- connecting algos --------" << endl;

  audio->output("audio")     >>  tfp->input("signal");
  tfp->output("predictions") >>  PC(pool, "predictions");


  /////////// STARTING THE ALGORITHMS //////////////////
  cout << "-------- start processing " << audioFilename << " --------" << endl;

  // create a network with our algorithms...
  Network n(audio);
  // ...and run it, easy as that!
  n.run();

  if (average) {
    // aggregate the results
    cout << "-------- averaging the predictions --------" << endl;

    const char* stats[] = {"mean"};

    standard::Algorithm* aggr = standard::AlgorithmFactory::create("PoolAggregator",
                                                                  "defaultStats", arrayToVector<string>(stats));

    aggr->input("input").set(pool);
    aggr->output("output").set(aggrPool);
    aggr->compute();

    poolPtr = &aggrPool;

    delete aggr;
  }

  // write results to file
  cout << "-------- writing results to json file " << outputFilename << " --------" << endl;

  standard::Algorithm* output = standard::AlgorithmFactory::create("YamlOutput",
                                                                   "format", "json",
                                                                   "filename", outputFilename);
  output->input("pool").set(*poolPtr);
  output->compute();
  n.clear();

  delete output;
  essentia::shutdown();

  return 0;
}

I compile it and run it like this:

./build/src/examples/essentia_streaming_discogs test/models/effnetdiscogs/effnetdiscogs-bs64-1.pb test/audio/recorded/mozart_c_major_30sec.wav outpout.json

the result is empty :(

{
"metadata": {
    "version": {
        "essentia": "2.1-beta6-dev"
    }
}
}

Thank you very much :)

opened by simdax 1

Update static builds for Qt 5.15.6
Wishlist of TODOs before merge:

[ ] merge (https://github.com/MTG/gaia/pull/121) and update gaia version in build_config.sh accordingly.

[ ] check if full build for static examples works

builds
opened by dbogdanov 0

Releases(v2.1_beta5)

v2.1_beta5(Sep 5, 2019)
Essentia 2.1 beta5 is our current preliminary version of the forthcoming 2.1 release. This pre-release includes the following changes:

Algorithms updates and bug-fixes

Fix the slaneyMel scale implementation in MelBands and MFCC (#849). Introduced in 2.1-beta4, it was erroneously computing the HTK Mel scale. Set htkMel as the default scale to ensure backward compatibility with all previous versions of MelBands/MFCC.

New option unit_tri for triangle area normalization in MelBands, MFCC, and TriangularBands.

New parameter silenceThreshold in MFCC and GFCC. Set default threshold to 1e-10 (#543).

TriangularBands: faster unit-sum normalization and an improved check for insufficient spectrum resolution (#142).

ConstantQ and the related Chromagram and SpectrumCQ are reimplemented from scratch and now function correctly. The maxFrequency parameter is replaced by numberBins.

New negativeFrequencies parameter in FFTC to include negative frequencies in the output.

New normalize parameter for IFFT size normalization.

FFTC now supports KissFFT and Accelerate.

PoolAggregator: new aggregation method last to get the last value. Fix possible nan/inf values in kurtosis and skewness (#689). Apply aggregation for pool values that contain only one vector too.

New checkRange parameter in Trimmer and StereoTrimmer.

PitchFilter: improve consistency between input and output stream types (#674).

PitchMelodia: fix missing output pitchConfidence in streaming mode.

MultiPitchMelodia: peakFrameThreshold and peakDistributionThreshold parameters now work correctly (they were overridden by hardcoded values).

New tolerance parameter in PitchYinFFT. When the pitch confidence is lower than the tolerance value the output pitch is set to 0. A tolerance of 1 disables this feature.

Fix occasional negative values output by Danceability (#483).

LoudnessEBUR128:

Fix memory leaks and warnings on empty input. Set a larger internal buffer size to avoid buffer resizes.

New parameter startFromZero to zero-center the first window for loudness estimation.

Fix a memory leak in AudioLoader.

BeatTrackerDegara output is now deterministic (#860).

ChordsDetectionBeats: add new parameter chromaPick and fix a beat segment indexing bug in the case of very close consecutive beats.

New minPeakDistance parameter in PeakDetection.

Fix invalid memory access in PCA (#727).

Update Key and KeyExtractor algorithms with new pitch class profiles and new parameters for detuning correction and low-energy HPCP bin thresholding. Use the new bgate profile by default. Add spectral whitening step to KeyExtractor. Change output key naming. Add a new function equivalentKey to match between equivalent names.

Proper mutex implementation for all FFT* algorithms.
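The slaneyMel fix in the list above concerns two different Mel scale definitions. As a rough illustration only (this is not Essentia's code), the two conversions are commonly written as below. The constants are those usually quoted for the HTK formula and for Slaney's Auditory Toolbox; the function names are hypothetical.

```python
import math


def hz_to_mel_htk(hz):
    """HTK-style Mel scale: logarithmic over the whole frequency range."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def hz_to_mel_slaney(hz):
    """Slaney-style Mel scale: linear below 1 kHz, logarithmic above."""
    f_sp = 200.0 / 3.0                 # Hz per Mel in the linear region
    min_log_hz = 1000.0                # start of the logarithmic region
    min_log_mel = min_log_hz / f_sp    # Mel value at 1 kHz (15.0)
    logstep = math.log(6.4) / 27.0     # log step: 27 Mel steps span 1 kHz to 6.4 kHz
    if hz < min_log_hz:
        return hz / f_sp
    return min_log_mel + math.log(hz / min_log_hz) / logstep


if __name__ == "__main__":
    for hz in (100.0, 500.0, 1000.0, 4000.0):
        print(hz, hz_to_mel_htk(hz), hz_to_mel_slaney(hz))
```

The two scales use different units and different shapes below 1 kHz, so filterbanks built from them place their band centers differently. That is why a mix-up between them changes MelBands and MFCC output.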

New algorithms

Invertible Constant-Q based on Non-Stationary Gabor frames: NSGConstantQ, NSGIConstantQ, NSGConstantQStreaming.

Chromaprinter (fingerprinting) wrapper for the Chromaprint library.

NNLSChroma and LogSpectrum (derived from the original NNLS Chroma code).

TriangularBarkBands (more configurable than BarkBands) and BFCC (bark-frequency cepstrum coefficients).

New algorithms for audio problems detection: ClickDetector, DiscontinuityDetector, FalseStereoDetector, GapsDetector, HumDetector, NoiseBurstDetector, SNR, SaturationDetector, StartStopCut, TruePeakDetector.

New algorithms for probabilistic Yin (pYIN) pitch estimation: PitchYinProbabilistic, PitchYinProbabilities, PitchYinProbabilitiesHMM.

StereoTrimmer and StereoMuxer.

Welch (power spectral density estimation).

New algorithm IFFTC for inverse complex STFT.

Histogram.

Updated music and sound feature extractors streaming_extractor_music and streaming_extractor_freesound. Both extractors are now also available as algorithms: MusicExtractor and FreesoundExtractor. New MusicExtractorSVM algorithm allows applying SVM models to the output of MusicExtractor.

Fix possible memory leaks in MusicExtractor

Proper logging for "out of memory" errors

Skip aggregation for some descriptors

Add audio length to metadata and remove end_time

Add number of audio channels to metadata (number_channels)

Better grouping of metadata related to audio analysis

Updated key/chords estimation parameters

Estimate key using three different key profiles (temperley, krumhansl, edma)

Updated descriptors in MusicExtractor:

New LoudnessEBU128 loudness descriptors

Add melbands128 high-resolution melbands

Compute hpcp_crest

Compute bpm_histogram

New stdev aggregate statistics in addition to var

Updated descriptors in FreesoundExtractor

Add melbands96 high-resolution melbands

Add stdev statistic

Remove frequency_bands

Do not output bpm_confidence when configured to use 'degara' for beat tracking

spectral_contrast and scvalleys are now called spectral_contrast_coeffs and spectral_contrast_valleys for consistency with MusicExtractor

startFrame and stopFrame are now called sound_start_frame and sound_stop_frame

New extractors

Add a new extractor for spectrograms and log-energy Mel-spectrograms (streaming_spectrogram).

Python bindings updates

Add support for Python 3.

Update all tutorials and code examples to Python 3.

New essentia.pyutils submodule provides useful functions for a number of use-cases (spectrograms, CQ-grams, batch processing with extractors, etc.)

Fix a memory bug in Pool on a isSingleValue check in Python.

Faster VECTOR_VECTOR_REAL conversion from Python types.

Build scripts updates

Add script for Python packaging (python.py) and wheels.

Travis CI and build scripts for manylinux wheels.

Update Waf to 2.0.10.

The code is now partly C++11.

Build flags for MSVC.

Fixes for cross-compilation with Mingw-w64.

Default --prefix=$VIRTUAL_ENV when inside a virtualenv.

Read PKG_CONFIG_PATH and add new flag --pkg-config-path for custom lib paths.

New flag --only-python to build Python extension separately from libessentia.

Link only to libessentia when building examples.

Generate a proper essentia.pc pkg-config file.

Static builds updates.

Replace LibAv with FFmpeg, build with muxers.

Update Taglib version to 1.11.1, build with zlib.

Update Gaia to 2.4.5.

Miscellaneous

Fix segfault in the Vamp plugin (#635, #371).

Add support for SingleVectorString to Pool.

Added support for Cephes Bessel functions via a 3rdparty library Cephes.

Updated documentation, tutorials, and examples including a significant web redesign.

Improve build scripts for documentation.

Every algorithm page now has links to related algorithms.

An updated list of research works using Essentia.

New python examples.

New QA scripts for audio problems detection and HPCPs.

A usual assortment of code cleanup, updated and expanded unit tests, and better logging (more informative log and exception messages).

v2.1_beta4(May 23, 2018)
This pre-release includes the following changes:

Improved algorithms

AudioLoader now supports audio sources with multiple audio streams (new parameter 'audioStream')

PoolAggregator now outputs stdev in addition to var (#342)

SpectralContrast: Improve precision for computation of subband bin intervals

Danceability now also outputs a DFA exponent vector

HPCP can now optionally apply unit sum normalization (#348)

HPCP: 'splitFrequency' parameter is now called 'bandSplitFrequency'

LoudnessEBUR128: Warn on empty input in the streaming mode
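To make the PoolAggregator item above concrete, here is a minimal sketch of how stdev relates to var for a single descriptor. It is not PoolAggregator itself: the aggregate function is hypothetical, and dividing by N (population variance) is an assumption about the convention.

```python
import math


def aggregate(values, stats=("mean", "var", "stdev")):
    """Compute the requested summary statistics for a list of frame values."""
    n = len(values)
    if n == 0:
        raise ValueError("cannot aggregate an empty list")
    mean = sum(values) / n
    # Population variance (divide by N); this convention is an assumption.
    var = sum((v - mean) ** 2 for v in values) / n
    available = {
        "mean": mean,
        "var": var,
        "stdev": math.sqrt(var),  # stdev is simply the square root of var
        "min": min(values),
        "max": max(values),
    }
    return {name: available[name] for name in stats}


if __name__ == "__main__":
    print(aggregate([1.0, 2.0, 3.0, 4.0]))
```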

Updates to Mel and ERB energy band algorithms

Add support for extracting MelBands and MFCCs 'the htk way'

Add support for DCT type III in DCT algorithm

New parameter 'dctType' in DCT, MFCC and GFCC

New 'liftering' parameter in DCT and MFCC

New parameters 'normalize', 'type', 'scale' and 'weighting' in MelBands and MFCC

New 'type' parameter in GFCC

New 'logType' parameter in MFCC, GFCC

New 'log' parameter in TriangularBands and MelBands

ERBBands: 'type' parameter value "energy" is now called "power"

TriangularBands is now faster
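The DCT type III mentioned in the list above is the inverse of the more common type II when both use orthonormal scaling. The sketch below shows that relationship in plain Python. The scaling is an assumption, and Essentia's DCT may normalize differently.

```python
import math


def _scale(k, n):
    """Orthonormal scaling factor for coefficient k of an n-point transform."""
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)


def dct_ii(x):
    """Orthonormal DCT type II of a list of floats."""
    n = len(x)
    return [
        _scale(k, n) * sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        for k in range(n)
    ]


def dct_iii(coeffs):
    """Orthonormal DCT type III, the transpose (and inverse) of dct_ii."""
    n = len(coeffs)
    return [
        sum(_scale(k, n) * coeffs[k] * math.cos(math.pi * (i + 0.5) * k / n) for k in range(n))
        for i in range(n)
    ]


if __name__ == "__main__":
    signal = [1.0, 2.0, 0.5, -3.0]
    print(dct_iii(dct_ii(signal)))  # reproduces `signal` up to rounding error
```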

New algorithms

SpectrumToCent for computing cent scale from frequency bins

New algorithm IDCT for inverse DCT

New algorithm SpectrumCQ
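For context on the cent scale mentioned for SpectrumToCent above: a frequency is mapped to cents relative to a reference, with 1200 cents per octave. The sketch below shows only that mapping, not the algorithm itself. The function names and the 55 Hz reference used in the example are hypothetical.

```python
import math


def hz_to_cents(hz, reference_hz):
    """Distance from reference_hz to hz in cents (1200 cents per octave)."""
    if hz <= 0 or reference_hz <= 0:
        raise ValueError("frequencies must be positive")
    return 1200.0 * math.log2(hz / reference_hz)


def cents_to_hz(cents, reference_hz):
    """Inverse mapping: frequency that lies `cents` above reference_hz."""
    return reference_hz * 2.0 ** (cents / 1200.0)


def bin_to_hz(bin_index, fft_size, sample_rate):
    """Center frequency of an FFT bin."""
    return bin_index * sample_rate / fft_size


if __name__ == "__main__":
    reference = 55.0  # hypothetical reference frequency
    for k in (8, 16, 32):
        hz = bin_to_hz(k, 2048, 44100)
        print(k, hz, hz_to_cents(hz, reference))
```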

Bug-fixes in algorithms:

MelBands and TriangularBands: Add checks for insufficient spectrum resolution (#142)

Fix PitchYin out of range error (#376)

Fix Inf values in OddToEvenHarmonicEnergyRatio

Fix reset() in LowLevelSpectralExtractor and LowLevelSpectralEqloudExtractor

Fix occasional exception in BeatsLoudness (#199)

Danceability: Fix NaN danceability value occurring on very short input signals

Fix memory leak in MelBands

Fix memory bug in Vibrato

SpectralContrast: Force non-zero 'lowFrequencyBound' parameter to avoid division by zero (#568)

AudioLoader: Fix memory bug on exceptions while opening an audio file in AudioLoader

Updates to Python wrapper:

FrameGenerator now inherits the default parameters from FrameCutter

FrameGenerator now has a new method frame_times() to compute frame positions in time

Fix array memory corruption when passing NumPy array views to Essentia algorithms (#240)

Fix memory deallocation for streaming algorithms to avoid a memory leak
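As an illustration of what "frame positions in time" can mean for the FrameGenerator item above, the sketch below computes frame center times from the hop size and sample rate. It is not the frame_times() implementation: the function name is hypothetical, and the choice of reporting frame centers, with the first frame centered on sample 0 unless start_from_zero is set, is an assumption modeled on FrameCutter's startFromZero parameter.

```python
def frame_center_times(n_frames, sample_rate, frame_size=1024, hop_size=512,
                       start_from_zero=False):
    """Center time in seconds of each of n_frames consecutive frames."""
    # With start_from_zero the first frame starts at sample 0, so its center
    # sits half a frame later; otherwise the first frame is centered on sample 0.
    offset = frame_size / 2.0 if start_from_zero else 0.0
    return [(i * hop_size + offset) / sample_rate for i in range(n_frames)]


if __name__ == "__main__":
    print(frame_center_times(4, 44100))
    print(frame_center_times(4, 44100, start_from_zero=True))
```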

Extractors:

Freesound extractor now stores all results in json

Logging:

Remove colors in log messages when piped to file; do not print colors on Windows

Build scripts updates:

Update waf to 1.9.5

Update script for computing algorithm dependencies

Code cleanup and unit tests updates

Re-designed and expanded documentation:

Updated installation instructions

Reorganized and improved Python tutorials. Notebook tutorials are now also rendered as html

Updated algorithm descriptions

Added examples of industrial applications and academic studies using Essentia

v2.1_beta3(Sep 29, 2016)
This pre-release includes the following changes:

Build script updates:

Cross-compilation for iOS and Android

Support for javascript using Emscripten

Updated dependencies in static extractors (LibAv 11.2, Taglib 1.10)

Fixed cross-compilation for Windows

Homebrew formula for easy installation on OSX

Updated Debian packaging

All dependencies are now optional. Algorithms and examples relying on missing dependencies will be ignored.

New flags for building lightweight versions of Essentia

--lightweight=LIBS to specify dependencies to be included

--include-algos=ALGOS and --ignore-algos=ALGOS to specify algorithms to be included

New algorithms:

SuperFlux algorithm for real-time onset detection (SuperFluxExtractor, SuperFluxNovelty)

Algorithms for sound modeling

Overlap-add (OverlapAdd)

Sine model analysis/synthesis (SineModelAnal, SineModelSynth)

Sine subtraction (SineSubtraction)

Sinusoidal plus Residual model analysis/synthesis (SprModelAnal, SprModelSynth)

Melody Analysis (monophonic/predominant)

HarmonicMask

Signal resampling (ResampleFFT)

New pitch-related algorithms

Multi-pitch estimation in polyphonic music (MultiPitchKlapuri, MultiPitchMelodia)

Adaptation of Melodia algorithm for monophonic signals (PitchMelodia)

Yin pitch detection algorithm (PitchYin)

Pitch contour segmentation into notes (PitchContourSegmentation)

Vibrato detection (Vibrato)

BPM estimation on loops (PercivalEnhanceHarmonics, PercivalEvaluatePulseTrains, LoopBpmConfidence, LoopBpmEstimator, PercivalBpmEstimator)

STFT on complex inputs (FFTC)

ConstantQ and Chromagram (still in experimental stage)

TriangularBands

Lightweight spectral centroid implementation (SpectralCentroidTime)

Chords detection on beat segments (ChordsDetectionBeats)

VectorRealAccumulator

Improved algorithms:

LoudnessEBUR128 algorithms are now finalized (includes bug-fixes)

FFT now supports KissFFT and Accelerate FFT libraries as an alternative to FFTW

New profiles for Key estimation (including profiles for electronic music)

New 'generalized' parameter in Autocorrelation algorithm

New 'scale' and 'shift' parameters in UnaryOperator algorithm

New 'normalized' parameter in Windowing algorithm

New 'inputSize' parameter in GFCC algorithm

Added support for 8kHz for EqualLoudness algorithm

LogAttackTime now outputs attack times

BpmHistogramDescriptors now outputs a complete histogram

ChordsDescriptors now throws an exception on incorrect chords

Refactored AudioLoader and AudioWriter algorithms. They now use libavresample; support for libswresample was removed

Renamed PitchFilterMakam to PitchFilter. Allowed filtering of negative energy values. Removed the optional 'octaveFilter' parameter

Renamed PredominantMelody algorithm to PredominantPitchMelodia

Bug-fixes:

Fixed incorrect behavior of HarmonicPeaks that was indirectly affecting results in HPCP, Key, Tristimulus and OddToEvenHarmonicEnergy

Fixed filter coefficients in BandReject and BandPass

Fixed weightings in NoveltyCurve

Different key profiles in Key streaming algorithm now work correctly

Bug fixes in Envelope, TonicIndianArtMusic, RhythmExtractor2013, PitchYinFFT, BpmHistogramDescriptors, ReplayGain streaming

Updated extractors (including Freesound extractor)

Improved documentation

Fresh new design

Algorithms are now organized by categories.

Improved and rewritten algorithm descriptions

New Python examples and tutorials

More minor fixes, improvements and code cleanup

Updated unit tests. Audio files for tests are now hosted in a separate repository

Known issues:

Some unit tests fail (#316)

v2.1_beta2 (Mar 26, 2015)
Changes:

Build scripts updates:

New scripts for static builds on Linux, OSX and (cross-compilation) Windows

New flag --with-example to build only specific examples

The git commit SHA hash is now accessible via the Essentia library API for better versioning

Algorithm updates:

AudioLoader now outputs codec and bitrate, and computes md5 hash values over undecoded audio

MetadataReader now uses new TagLib 1.9 API and is able to read any tags

YamlInput now supports JSON

New Entropy algorithm

EffectiveDuration now accepts a threshold parameter

Fixed incorrect computation of onset rate in OnsetRate

New algorithm LoudnessEBUR128 for measuring loudness according to the EBU R128 standard (still in experimental stage)

New BinaryOperator algorithm

PitchYinFFT algorithm now includes peak interpolation

Revised and updated extractors:

Revised, refactored and expanded music extractor (streaming_extractor_music) including new functionality and descriptors

Updated Freesound extractor, including new descriptors

Some updates in core Essentia code

Updated documentation and examples

Bug fixes and unit test updates

Dependencies: Libav 9, TagLib 1.9

Ubuntu/Debian Libav/TagLib compatibility:

Debian Jessie - the required package versions are already in the repository

Debian Wheezy - install libav/libtag1-dev packages from wheezy-backports repository

libav 6:10.1

libtag1-dev 1.9.1

Ubuntu Trusty (14.04 LTS), Utopic (14.10) and Vivid (15.04) - the required package versions are already in the repository

v2.0.1 (Feb 11, 2014)
Essentia 2.0.1:

Added pre-trained high-level classifier models for genres, moods, rhythm and instrumentation (to be used with streaming_extractor_archivemusic extractor, see accuracies here)

Fixed scheduler in streaming mode

Fixed compilation with clang/libc++/c++11

PitchYinFFT now supports parabolic interpolation

Updated Vamp plugin

Updated documentation and tutorials

Minor bug fixes, more unit tests, etc.
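The parabolic interpolation mentioned for PitchYinFFT above can be sketched with the textbook three-point formula. This is not taken from Essentia's source and may differ from what PitchYinFFT does internally; the function name `parabolic_peak` is hypothetical. It fits a parabola through a local extremum and its two neighbours and returns the position and value of the vertex.

```python
def parabolic_peak(values, index):
    """Refine a local extremum at values[index] using its two neighbours.

    Returns (refined_position, refined_value). Falls back to the raw sample
    at the edges of the sequence or when the three points are collinear.
    """
    if index <= 0 or index >= len(values) - 1:
        return float(index), values[index]
    left, centre, right = values[index - 1], values[index], values[index + 1]
    denominator = left - 2.0 * centre + right
    if denominator == 0.0:
        return float(index), centre
    # Offset of the parabola's vertex from the centre sample, in samples.
    shift = 0.5 * (left - right) / denominator
    return index + shift, centre - 0.25 * (left - right) * shift


# Usage: samples of y = 5 - (x - 2.3)^2, whose true peak lies between samples.
samples = [5.0 - (x - 2.3) ** 2 for x in range(5)]
coarse = max(range(len(samples)), key=samples.__getitem__)
position, value = parabolic_peak(samples, coarse)
```

For data that is exactly parabolic, as in the usage example, the refinement recovers the true peak. For real spectra or difference functions it is only an approximation.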

For post-release bugfixes (including Ubuntu 14.04 compatibility) use the 2.0.1 branch.

Ubuntu/Debian Libav compatibility:

Debian Wheezy - libav 6:0.8.17

Ubuntu Precise (12.04 LTS) - libav 4:0.8.17

Ubuntu Trusty (14.04 LTS) - libav 6:9.18

v2.0 (Mar 31, 2015)
First release to be publicly available as free software released under AGPLv3

Refactoring of the core API

Fixed small API annoyances in the standard mode

Refactored the streaming mode. It is now much better defined and built on sound computer science techniques: the visible network is a directed acyclic graph, the composites have better-defined semantics, and the order of execution of the algorithms is the topological sort of the transitive reduction of the visible network after the composites have been expanded. In particular, the scheduler that runs the algorithms in the streaming mode is now much more correct, which made it possible to remove all the small hacks that had accumulated in the algorithms themselves during the 1.x releases to compensate for the deficiencies of the initial scheduler.

New algorithms for onset detection, beat tracking and melody extraction

New and updated feature extractors

Updated Vamp plugin

Much better documentation, more Python examples

Bug fixes, more unit tests, etc.
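The scheduling idea described for the streaming mode above, running algorithms in a topological order of a directed acyclic network, can be illustrated with Python's standard `graphlib` module. This is a sketch of the general technique, not Essentia's scheduler: the network below is a hypothetical example, the function name `execution_order` is an assumption, and composite expansion and transitive reduction are skipped. Skipping the reduction does not change which orders are valid, because a graph and its transitive reduction impose the same precedence constraints.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical network: each algorithm maps to the set of algorithms feeding it.
network = {
    "FrameCutter": {"AudioLoader"},
    "Windowing": {"FrameCutter"},
    "Spectrum": {"Windowing"},
    "SpectralPeaks": {"Spectrum"},
    "Centroid": {"Spectrum"},
}


def execution_order(predecessors):
    """Return one order in which every algorithm comes after all of its producers.

    Raises graphlib.CycleError if the network is not acyclic.
    """
    return list(TopologicalSorter(predecessors).static_order())


order = execution_order(network)
position = {name: rank for rank, name in enumerate(order)}
```

Several orders can be valid for the same network (here, SpectralPeaks and Centroid may run in either order), so a check should compare the relative positions of producers and consumers instead of expecting one fixed sequence.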

For post-release bugfixes use the 2.0 branch.

Ubuntu/Debian Libav compatibility:

Debian Wheezy - libav 6:0.8.17

Ubuntu Precise (12.04 LTS) - libav 4:0.8.17

Ubuntu Trusty (14.04 LTS) - libav 6:9.18


Owner

Music Technology Group - Universitat Pompeu Fabra

Software tools developed by the MTG

GitHub Repository http://essentia.upf.edu
