kapre: Keras Audio Preprocessors

Last update: Dec 29, 2022

Overview

Kapre

Keras Audio Preprocessors - compute STFT, ISTFT, Melspectrogram, and others on GPU in real time.

Tested on Python 3.6 and 3.7

Why Kapre?

vs. Pre-computation

You can optimize DSP parameters.
Your model deployment becomes much simpler and more consistent.
Your code and model have fewer dependencies.

vs. Your own implementation

Quick and easy!
Consistent with 1D/2D tensorflow batch shapes
Data format agnostic (channels_first and channels_last)
Less error prone - Kapre layers are tested against Librosa (stft, decibel, etc.), which is (trust me) trickier than you think.
Kapre layers have some extended APIs beyond the default tf.signal implementation, such as:
- A perfectly invertible STFT and InverseSTFT pair
- Mel-spectrogram with more options
Reproducibility - Kapre is available on pip with versioning

Workflow with Kapre

Preprocess your audio dataset. Resample the audio to the right sampling rate and store the audio signals (waveforms).
In your ML model, add a Kapre layer, e.g. kapre.time_frequency.STFT(), as the first layer of the model.
The data loader simply loads audio signals and feeds them into the model.
In your hyperparameter search, include DSP parameters like n_fft to boost the performance.
When deploying the final model, all you need to remember is the sampling rate of the signal. No dependency or preprocessing!
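
The hyperparameter-search step above can be sketched as a plain grid over DSP parameters. This is a minimal illustration in pure Python; the helper name `dsp_grid` and the candidate values are assumptions made for this example and are not part of Kapre.

```python
from itertools import product


def dsp_grid(n_ffts, hop_lengths, n_mels_options):
    """Yield DSP settings to try, skipping hops that are longer than the FFT size."""
    for n_fft, hop_length, n_mels in product(n_ffts, hop_lengths, n_mels_options):
        if hop_length > n_fft:
            continue  # such frames would skip over samples entirely
        yield {"n_fft": n_fft, "hop_length": hop_length, "n_mels": n_mels}


# Hypothetical candidate values; choose ranges that suit your sampling rate.
configs = list(dsp_grid(n_ffts=[512, 1024, 2048],
                        hop_lengths=[256, 1024],
                        n_mels_options=[64, 128]))
for cfg in configs:
    print(cfg)
```

Each dictionary could then be passed as keyword arguments to a layer factory such as get_melspectrogram_layer() when building one candidate model per setting.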

Installation

pip install kapre

API Documentation

Please refer to Kapre API Documentation at https://kapre.readthedocs.io

One-shot example

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, BatchNormalization, ReLU, GlobalAveragePooling2D, Dense, Softmax
from kapre import STFT, Magnitude, MagnitudeToDecibel
from kapre.composed import get_melspectrogram_layer, get_log_frequency_spectrogram_layer

# 6 channels (!), maybe 1-sec audio signal, for an example.
input_shape = (44100, 6)
sr = 44100
model = Sequential()
# A STFT layer
model.add(STFT(n_fft=2048, win_length=2018, hop_length=1024,
               window_name=None, pad_end=False,
               input_data_format='channels_last', output_data_format='channels_last',
               input_shape=input_shape))
model.add(Magnitude())
model.add(MagnitudeToDecibel())  # these three layers can be replaced with get_stft_magnitude_layer()
# Alternatively, you may want to use a melspectrogram layer
# melgram_layer = get_melspectrogram_layer()
# or log-frequency layer
# log_stft_layer = get_log_frequency_spectrogram_layer() 

# add more layers as you want
model.add(Conv2D(32, (3, 3), strides=(2, 2)))
model.add(BatchNormalization())
model.add(ReLU())
model.add(GlobalAveragePooling2D())
model.add(Dense(10))
model.add(Softmax())

# Compile the model
model.compile('adam', 'categorical_crossentropy') # if single-label classification

# train it with raw audio sample inputs
# for example, you may have functions that load your data as below.
x = load_x() # e.g., x.shape = (10000, 44100, 6), matching the channels_last input_shape above
y = load_y() # e.g., y.shape = (10000, 10) if it's 10-class classification
# then..
model.fit(x, y)
# Done!
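
To check what the convolutional layers will receive, the STFT output shape for the example above can be estimated by hand. The sketch below is pure Python and assumes the usual no-padding framing rule (frames of win_length samples, one frame every hop_length samples, n_fft // 2 + 1 frequency bins); the helper name `stft_output_shape` is hypothetical and not a Kapre function.

```python
def stft_output_shape(n_samples, n_channels, n_fft, hop_length, win_length=None):
    """Return (n_frames, n_freq_bins, n_channels) for a channels_last STFT.

    Assumes pad_end=False, i.e. only frames that fit entirely are kept.
    """
    if win_length is None:
        win_length = n_fft  # assumed default: window as long as the FFT
    if n_samples < win_length:
        raise ValueError("input is shorter than a single frame")
    n_frames = 1 + (n_samples - win_length) // hop_length
    n_freq_bins = n_fft // 2 + 1  # one-sided spectrum of a real signal
    return (n_frames, n_freq_bins, n_channels)


# Settings from the one-shot example: 44100 samples, 6 channels.
print(stft_output_shape(44100, 6, n_fft=2048, hop_length=1024, win_length=2018))
# → (42, 1025, 6)
```

The same rule gives (3, 513, 1) for a 2048-sample mono input with n_fft=1024 and hop_length=512.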

See the Jupyter notebook in the example folder.

Citation

Please cite this paper if you use Kapre for your work.

@inproceedings{choi2017kapre,
  title={Kapre: On-GPU Audio Preprocessing Layers for a Quick Implementation of Deep Neural Network Models with Keras},
  author={Choi, Keunwoo and Joo, Deokjin and Kim, Juho},
  booktitle={Machine Learning for Music Discovery Workshop at 34th International Conference on Machine Learning},
  year={2017},
  organization={ICML}
}

Comments

Migrated functions to tf.keras

This PR addresses #52 by removing the dependency on keras and switching to tensorflow.keras

Proposed version is 0.1.6 because of pull request #56

In particular, #56 keeps the dependency on keras with from keras.utils.conv_utils import conv_output_length

opened by douglas125 27
Spectrogram?

I have an older version of Kapre that has time_frequency.Spectrogram, which is trainable.

However, the new version of Kapre doesn't have Spectrogram anymore. Why?

opened by turian 8
Melspectrogram can't be set to 'trainable_fb=False'

Melspectrogram can't be set to 'trainable_fb=False'. After I set trainable_fb=False and trainable_kernel=False, it does not seem to work: it is still trainable.

opened by zhh6 8
htk=true for mel frequencies

We noticed the current implementation of the mel_frequencies function (based on Librosa) doesn't include the htk=True option, which is handy when training CNNs because then the frequency scale is fully logarithmic, which, in principle, makes more sense for frequency-invariant convolutional filters.

What was the motivation for removing this? Any chance it can be added?
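
For reference, the HTK-style mel scale mentioned here is commonly written as mel = 2595 * log10(1 + f / 700). The following is a small pure-Python sketch of that formula and its inverse; the function names are chosen for this example and are not Kapre or Librosa APIs.

```python
import math


def hz_to_mel_htk(freq_hz):
    """Convert frequency in Hz to mels using the HTK formula."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz_htk(mel):
    """Invert the HTK formula: mels back to Hz."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_frequencies_htk(n_mels, fmin, fmax):
    """Return n_mels frequencies in Hz, evenly spaced on the HTK mel scale."""
    mel_min, mel_max = hz_to_mel_htk(fmin), hz_to_mel_htk(fmax)
    step = (mel_max - mel_min) / (n_mels - 1)
    return [mel_to_hz_htk(mel_min + i * step) for i in range(n_mels)]


freqs = mel_frequencies_htk(n_mels=5, fmin=0.0, fmax=8000.0)
print([round(f, 1) for f in freqs])
```

Because the mapping is a single logarithmic expression, the spacing between neighbouring centre frequencies grows steadily with frequency, rather than switching between a linear and a logarithmic region.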

opened by justinsalamon 8
Added parallel STFT implementation

Hi!

As I commented in #98, I added a parallel STFT implementation based on the map_fn function, following the indications of @zaccharieramzi here.

I've added a use_parallel_stft param (disabled by default) that enables this function. I've put this param in as many functions as I can. I also added test cases for every function I can, including a specific test that checks that the result of tf.signal.stft equals the result of the parallel_stft function.

I hope this serves us well until TensorFlow resolves its issues with the FFT implementation.

opened by JPery 7
Amplitude-to-decibel conversion produces different results on different batches

Related to #16, I found another issue that contributes to different prediction results depending on batch size (and the batches themselves). In particular, it occurs when converting spectrograms to decibels.

https://github.com/keunwoochoi/kapre/blob/master/kapre/backend_keras.py#L17

The maximum is taken over the entire tensor, instead of per example in the batch. This results in different normalization when the examples in a batch are different.
bug
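
The effect described in this report can be reproduced with a toy example. The sketch below is pure Python and is a simplified stand-in for a decibel conversion that normalizes by a maximum; it is not Kapre's code, and the function names and the 80 dB dynamic range are assumptions for illustration.

```python
import math


def to_decibel(values, ref, dynamic_range=80.0, amin=1e-10):
    """Convert power values to dB relative to ref, clipped to -dynamic_range."""
    log_ref = 10.0 * math.log10(max(ref, amin))
    return [max(10.0 * math.log10(max(v, amin)) - log_ref, -dynamic_range)
            for v in values]


def batch_to_decibel_global(batch):
    """Normalize every example by the maximum of the whole batch."""
    ref = max(max(example) for example in batch)
    return [to_decibel(example, ref) for example in batch]


def batch_to_decibel_per_example(batch):
    """Normalize each example by its own maximum."""
    return [to_decibel(example, max(example)) for example in batch]


quiet = [1e-4, 1e-6]
loud = [1.0, 1e-2]

alone = batch_to_decibel_global([quiet])[0]            # about [0, -20]
with_loud = batch_to_decibel_global([quiet, loud])[0]  # about [-40, -60]
per_example = batch_to_decibel_per_example([quiet, loud])[0]  # about [0, -20]
print(alone, with_loud, per_example)
```

With a batch-wide maximum, the same quiet example maps to different decibel values depending on what else is in the batch; with a per-example maximum, its output does not depend on its neighbours.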

opened by auroracramer 7
Inverse Spectrogram and Mel-Spectrogram Layer?

Namaste!

kapre has become an integral part of all my audio Deep Learning experiments. Powerful! Thanks for providing such a great software!

I was thinking... I guess it would make sense to have layers for inverse spectrogram and inverse mel-spectrogram. Thinking about Autoencoders, this would be even more powerful. I know that reconstructing samples from spectrograms is not the best, but it is possible to a certain degree.

What do you think about that feature request?

Best, Tristan

opened by AI-Guru 7
Hey! The input is too short!

Hi,

I'm encountering an assertion problem when calling your code with a Tensorflow backend.

input_shape = (44100,1)

Could this be a problem with "channels_first" / "channels_last"?

Best, Alex

opened by slychief 6
Pip?

It seems you were on pip, but are no longer. Is there anything I could do to help get kapre back on there? We want to use this library in a commercial application, and for our process pip packages are easier to support than a git repository.

opened by ff-rfeather 6

trainable_stft error

I am following your example, but a layer definition such as trainable_stft seems to be missing. Can you provide an example that resolves the error?

`# 6 channels (!), maybe 1-sec audio signal
input_shape = (6, 44100) 
sr = 44100
model = Sequential()
model.add(Melspectrogram(n_dft=512, n_hop=256, input_shape=src_shape,
                         border_mode='same', sr=sr, n_mels=128,
                         fmin=0.0, fmax=sr/2, power=1.0,
                         return_decibel=False, trainable_fb=False,
                         trainable_kernel=False
                         name='trainable_stft'))`

  File "<ipython-input-24-cea5588ddf1e>", line 13
    name='trainable_stft'))
       ^
SyntaxError: invalid syntax

opened by sildeag 6

`STFT` layer output shape deviates from `STFTTflite` layer in batch dimension
Use Case

I want to convert a STFT layer in my model to a STFTTflite to deploy it to my mobile device. In the documentation I found that another dimension is added to account for complex numbers. But I also encountered a behaviour that is not documented.

Expected Behaviour

input_shape = (2048, 1)  # mono signal
model = keras.models.Sequential()  # TFLite incompatible model
model.add(kapre.STFT(n_fft=1024, hop_length=512, input_shape=input_shape))
tflite_model = keras.models.Sequential()  # TFLite compatible model
tflite_model.add(kapre.STFTTflite(n_fft=1024, hop_length=512, input_shape=input_shape))

model has the output shape (None, 3, 513, 1). Therefore, tflite_model should have the output shape (None, 3, 513, 1, 2).

Observed Behaviour

The output shape of tflite_model is (1, 3, 513, 1, 2) instead of (None, 3, 513, 1, 2).

Problem Solution

If this behaviour is unwanted:

Change the model output format so that the batch dimension is correctly shaped.

Otherwise:

Explain in the documentation why the batch dimension is shaped to 1.

Explain in the documentation how to include this layer into models which expect the batch dimension to be shaped None.
opened by PhilippMatthes 5

Problem incorporating SpecAugment in the training process

Hi,

I'm trying to add a SpecAug layer in the training process of a CNN using the code below:


CLIP_DURATION = 5 
SAMPLING_RATE = 41000
NUM_CHANNELS = 1

INPUT_SHAPE = ((CLIP_DURATION * SAMPLING_RATE), NUM_CHANNELS)

melgram = get_melspectrogram_layer(input_shape = INPUT_SHAPE,
                          n_fft = 2048,
                          hop_length = 512,
                          return_decibel=True,
                          n_mels = 40,
                          mel_f_min = 500,
                          mel_f_max = 15000,
                          input_data_format='channels_last',
                          output_data_format='channels_last')

spec_augment = SpecAugment(freq_mask_param=5,
                          time_mask_param=10,
                          n_freq_masks=2,
                          n_time_masks=3,
                          mask_value=-100)   

model = Sequential()
model.add(melgram)
model.add(spec_augment)

The CNN summary looks like this:

Model: "sequential_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 melspectrogram (Sequential)  (None, 397, 40, 1)       0         
                                                                 
 spec_augment_1 (SpecAugment  (None, 397, 40, 1)       0         
 )                                                               
                                                                 
=================================================================
Total params: 0
Trainable params: 0
Non-trainable params: 0
_________________________________________________________________

Compiling and fitting the model

model.compile(loss = 'sparse_categorical_crossentropy', optimizer='adam', metrics = 'accuracy')

early_stop = EarlyStopping(monitor='loss', patience=5)

reduce_LR = ReduceLROnPlateau(monitor="val_loss",factor=0.1,patience=4)

checkpointer = ModelCheckpoint(filepath = 'saved_models/bird_song_classification.hdf5')

model.fit(X_train, y_train, validation_data = (X_val, y_val), epochs = 50, batch_size = 32, callbacks = [early_stop, checkpointer, reduce_LR])

Then I get the following error:

Epoch 1/50
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
[<ipython-input-35-e58a056ab523>](https://localhost:8080/#) in <module>
      7 checkpointer = ModelCheckpoint(filepath = 'saved_models/bird_song_classification.hdf5')
      8 
----> 9 model.fit(X_train, y_train, validation_data = (X_val, y_val), epochs = 50, batch_size = 32, callbacks = [early_stop, checkpointer, reduce_LR])

6 frames
[/usr/local/lib/python3.7/dist-packages/kapre/augmentation.py](https://localhost:8080/#) in tf___apply_masks_to_axis(self, x, axis, mask_param, n_masks)
     78                 try:
     79                     do_return = True
---> 80                     retval_ = ag__.converted_call(ag__.ld(tf).where, (ag__.ld(mask), ag__.ld(self).mask_value, ag__.ld(x)), None, fscope)
     81                 except:
     82                     do_return = False

TypeError: in user code:

    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 1051, in train_function  *
        return step_function(self, iterator)
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 1040, in step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 1030, in run_step  **
        outputs = model.train_step(data)
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 889, in train_step
        y_pred = self(x, training=True)
    File "/usr/local/lib/python3.7/dist-packages/keras/utils/traceback_utils.py", line 67, in error_handler
        raise e.with_traceback(filtered_tb) from None
    File "/tmp/__autograph_generated_filepzvfxhgz.py", line 63, in tf__call
        ag__.if_stmt((ag__.ld(training) in (None, False)), if_body_2, else_body_2, get_state_2, set_state_2, ('do_return', 'retval_'), 2)
    File "/tmp/__autograph_generated_filepzvfxhgz.py", line 58, in else_body_2
        retval_ = ag__.converted_call(ag__.ld(tf).map_fn, (), dict(elems=ag__.ld(x), fn=ag__.ld(self)._apply_spec_augment, dtype=ag__.ld(tf).float32, fn_output_signature=ag__.ld(tf).float32), fscope)
    File "/tmp/__autograph_generated_filef27o6c1f.py", line 44, in tf___apply_spec_augment
        ag__.if_stmt((ag__.ld(self).n_time_masks >= 1), if_body_1, else_body_1, get_state_1, set_state_1, ('x',), 1)
    File "/tmp/__autograph_generated_filef27o6c1f.py", line 39, in if_body_1
        x = ag__.converted_call(ag__.ld(self)._apply_masks_to_axis, (ag__.ld(x),), dict(axis=ag__.ld(time_axis), mask_param=ag__.ld(self).time_mask_param, n_masks=ag__.ld(self).n_time_masks), fscope)
    File "/tmp/__autograph_generated_file3vip8w4x.py", line 80, in tf___apply_masks_to_axis
        retval_ = ag__.converted_call(ag__.ld(tf).where, (ag__.ld(mask), ag__.ld(self).mask_value, ag__.ld(x)), None, fscope)

    TypeError: Exception encountered when calling layer "spec_augment_1" (type SpecAugment).
    
    in user code:
    
        File "/usr/local/lib/python3.7/dist-packages/kapre/augmentation.py", line 299, in call  *
            elems=x, fn=self._apply_spec_augment, dtype=tf.float32, fn_output_signature=tf.float32
        File "/usr/local/lib/python3.7/dist-packages/kapre/augmentation.py", line 273, in _apply_spec_augment  *
            x = self._apply_masks_to_axis(
        File "/usr/local/lib/python3.7/dist-packages/kapre/augmentation.py", line 254, in _apply_masks_to_axis  *
            return tf.where(mask, self.mask_value, x)
    
        TypeError: Input 'e' of 'SelectV2' Op has type float32 that does not match type int32 of argument 't'.
    
    
    Call arguments received by layer "spec_augment_1" (type SpecAugment):
      • x=tf.Tensor(shape=(None, 397, 40, 1), dtype=float32)
      • training=True
      • kwargs=<class 'inspect._empty'>

The shape of X_train is

(2182, 205000, 1)

I'm using Tensorflow 2.9.2, and Python 3.7.15

When I remove the SpecAug layer everything runs fine. I've tested using only the melspec + a mobile net at the end and it runs smooth. The problem is apparently related to SpecAug layer.

Do you have any idea what could be going wrong here? I appreciate any guidance related to the problem. Best regards.

opened by nnbuainain 2

Full-integer quantization and kapre layers
I am training a model which includes the mel-spectrogram block from get_melspectrogram_layer() right after the input layer. Training goes well, and I am able to change the specific mel-spec-layers to their TFLite-counterparts (STFTTflite, MagnitudeTflite) afterwards. I have checked also that the model performs as well as before.

The model also performs as expected when converting the model to .tflite using dynamic range quantization. However, when using full-integer quantization, the model loses its accuracy (see https://www.tensorflow.org/lite/performance/post_training_quantization#integer_only).

I suppose the mel-spec starts to differ significantly because, in full-integer quantization, the input values are projected to a new range (int8). Is there any way to make it work with full-integer quantization?

I guess I need to separate the mel-spec-layer from the model as a preprocessing step in order to succeed with full-integer quantization, i.e., apply the input quantization to the output values of mel-spec layer. But then I would have to deploy two models to the edge device, where the input goes first to the mel-spec-block and then to the rest of the model (?).

I am using TensorFlow 2.7.0 and kapre 0.3.7.

Here is my code for testing the tflite-model:

preds = []
# Test and evaluate the TFLite-converted model on unseen test data
for i, sample in enumerate(X_test_full_scaled):
    X = sample
    if input_details['dtype'] == np.int8:
        input_scale, input_zero_point = input_details["quantization"]
        X = sample / input_scale + input_zero_point
    X = X.reshape((1, 8000, 1)).astype(input_details["dtype"])
    interpreter.set_tensor(input_index, X)
    interpreter.invoke()
    pred = interpreter.get_tensor(output_index)
    output_scale, output_zero_point = output_details['quantization']
    if output_details['dtype'] == np.int8:
        pred = pred.astype(np.float32)
        pred = (pred - output_zero_point) * output_scale
    pred = np.argmax(pred, axis=1)[0]
    preds.append(pred)
preds = np.array(preds)
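
The scale and zero-point arithmetic used in the loop above is the usual affine int8 mapping. Here is a pure-Python sketch of that mapping with explicit rounding and clipping; the helper names and the example scale and zero point are hypothetical, and this is not the TFLite implementation.

```python
def quantize_int8(x, scale, zero_point):
    """Map a float to int8: round to the nearest step, then clip to [-128, 127]."""
    q = round(x / scale) + zero_point
    return max(-128, min(127, q))


def dequantize_int8(q, scale, zero_point):
    """Map an int8 value back to the float it represents."""
    return (q - zero_point) * scale


# Hypothetical quantization parameters, for illustration only.
scale, zero_point = 0.25, -10

q = quantize_int8(1.3, scale, zero_point)
print(q, dequantize_int8(q, scale, zero_point))  # → -5 1.25

# Values outside the representable range saturate instead of wrapping around.
print(quantize_int8(100.0, scale, zero_point))  # → 127
```

Note that the loop above casts to the input dtype without rounding or clipping first, so fractional values are truncated toward zero; whether that contributes to the accuracy loss reported here is not something this sketch can establish.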
opened by eppane 3
Calling Magnitude() and Phase() simultaneously

Hi,

I am looking to call Magnitude() and Phase() simultaneously for the same STFT input and concatenate the magnitude and phase before feeding into the convolution layers in my CNN sequential Keras model.

Is this possible?

Best,

Yang

opened by HsuanYang-Wang 1
About kapre.utils

Hi, when I used "from kapre.utils import Normalization2D", I got an error saying No module named 'kapre.utils'. I looked at your package and found that there is indeed no utils.py. I am wondering how to solve this.

Best wishes, Daisy

opened by YiningWang2 1
Function missing in updated version

I noticed there is a function "kapre.utils.Normalization2D" in the old version, but I cannot find it in the updated version. Why? Are there any alternative functions?

opened by v3551G 1
trainable DSP parameters

Hello contributors and community.

I love your repo! It makes things so much easier for me! Although having the precomputation in the model is already great, I'd like to know how you can optimize DSP parameters. It looks like this was a feature of old versions (e.g. 0.2), and by default I don't see any trainable params in this layer.

Could you please state if this is still available and how to use it?

happy hacking Paul

opened by bytosaur 2

Releases(Kapre-0.3.7)

Kapre-0.3.7(Jan 21, 2022)
Add SpecAugment layer

Kapre-0.3.6(Nov 14, 2021)
bugfix (tflite)

Kapre-0.3.5(Mar 18, 2021)
Add tflite-compatible stft layer

Kapre-0.3.4(Sep 29, 2020)

Bugfix for get_window_fn()
0.3.3(Sep 15, 2020)
kapre.augmentation is added

kapre.time_frequency.ConcatenateFrequencyMap is added

kapre.composed.get_frequency_aware_conv2d is added

In STFT and InverseSTFT, keyword arg window_fn is renamed to window_name and it expects string value, not function.

With this update, models with Kapre layers can be loaded with h5 file format.

kapre.backend.get_window_fn is added


0.3.2(Aug 30, 2020)

- `kapre.signal.Frame` and `kapre.signal.Energy` are added
- `kapre.signal.LogmelToMFCC` is added
- `kapre.signal.MuLawEncoder` and `kapre.signal.MuLawDecoder` are added
- `kapre.composed.get_stft_magnitude_layer()` is added 
- doc is hosted at https://kapre.readthedocs.io/


0.3.1(Aug 21, 2020)
InverseSTFT and etc.

0.3.0(Aug 16, 2020)

Breaking and simplifying changes with Tensorflow 2.0 and more tests. Some features are removed. New layer - STFT(). New approach for more complicated representations - see kapre.composed.
v0.1.8(May 18, 2020)

Added Delta layer
v0.1.7(Feb 20, 2020)


Owner

Keunwoo Choi

MIR, machine learning, music recommendation.
