MIDI-DDSP: Detailed Control of Musical Performance via Hierarchical Modeling

Related tags

Audiomidi-ddsp
Overview
logo

MIDI-DDSP: Detailed Control of Musical Performance via Hierarchical Modeling

Demos | Blog Post | Colab Notebook | Paper | Hugging Face Spaces

MIDI-DDSP is a hierarchical audio generation model for synthesizing MIDI expanded from DDSP.

Links

Install MIDI-DDSP

You could install MIDI-DDSP via pip, which allows you to use the cool Command-line MIDI synthesis to synthesize your MIDI.

To install MIDI-DDSP via pip, simply run:

pip install midi-ddsp

Train MIDI-DDSP

To train MIDI-DDSP, please first install midi-ddsp and clone the MIDI-DDSP repository:

git clone https://github.com/magenta/midi-ddsp.git

For dataset, please download the tfrecord files for the URMP dataset in here to the data folder in your cloned repository using the following commands:

cd midi-ddsp # enter the project directory
mkdir ./data # create a data folder
gsutil cp gs://magentadata/datasets/urmp/urmp_20210324/* ./data # download tfrecords to directory

Please check here for how to install and use gsutil.

Finally, you can run the script train_midi_ddsp.sh to train the exact same model we used in the paper:

sh ./train_midi_ddsp.sh

The current codebase does not support training with arbitrary dataset, but we will hopefully update that in the near future.

Side note:

If one download the dataset to a different location, please change the data_dir parameter in train_midi_ddsp.sh.

The training of MIDI-DDSP takes approximately 18 hours on a single RTX 8000. The training code for now does not support multi-GPU training. We recommend using a GPU with more than 24G of memory when training Synthesis Generator in batch size of 16. For a GPU with less memory, please consider using a smaller batch size and change the batch size in train_midi_ddsp.sh.

Try to play with MIDI-DDSP yourself!

Please try out MIDI-DDSP in Colab notebooks!

In this notebook, you will try to use MIDI-DDSP to synthesis a monophonic MIDI file, adjust note expressions, make pitch bend by adjusting synthesis parameters, and synthesize quartet from Bach chorales.

We have trained MIDI-DDSP on the URMP dataset which support synthesizing 13 instruments: violin, viola, cello, double bass, flute, oboe, clarinet, saxophone, bassoon, trumpet, horn, trombone, tuba. You could find how to download and use our pre-trained model below:

Command-line MIDI synthesis

On can use the MIDI-DDSP as a command-line MIDI synthesizer just like FluidSynth.

To use command-line synthesis to synthesize a midi file, please first download the model weights by running:

midi_ddsp_download_model_weights

To synthesize a midi file simply run the following command:

midi_ddsp_synthesize --midi_path <path-to-midi>

For a starter, you can try to synthesize the example midi file in this repository:

midi_ddsp_synthesize --midi_path ./midi_example/ode_to_joy.mid

The command line also enables synthesize a folder of midi files. For more advance use (synthesize a folder, using FluidSynth for instruments not supported, etc.), please see synthesize_midi.py --help.

If you have a trouble downloading the model weights, please manually download from here, and specify the synthesis_generator_weight_path and expression_generator_weight_path by yourself when using the command line. You can also specify your other model weights if you want to use your own trained model.

Python Usage

After installing midi-ddsp, you could import midi-ddsp in python and synthesize MIDI in your code.

Minimal Example

Here is a simple example to use MIDI-DDSP to synthesize a midi file:

from midi_ddsp import synthesize_midi, load_pretrained_model

midi_file = 'ode_to_joy.mid'
# Load pre-trained model
synthesis_generator, expression_generator = load_pretrained_model()
# Synthesize MIDI
output = synthesize_midi(synthesis_generator, expression_generator, midi_file)
# The synthesized audio
synthesized_audio = output['mix_audio']

Advance Usage

Here is an advance example to synthesize the ode_to_joy.mid, change the note expression controls, and adjust the synthesis parameters:

import numpy as np
import tensorflow as tf
from midi_ddsp.utils.midi_synthesis_utils import synthesize_mono_midi, conditioning_df_to_audio
from midi_ddsp.utils.inference_utils import get_process_group
from midi_ddsp.midi_ddsp_synthesize import load_pretrained_model
from midi_ddsp.data_handling.instrument_name_utils import INST_NAME_TO_ID_DICT

# -----MIDI Synthesis-----
midi_file = 'ode_to_joy.mid'
# Load pre-trained model
synthesis_generator, expression_generator = load_pretrained_model()
# Synthesize with violin:
instrument_name = 'violin'
instrument_id = INST_NAME_TO_ID_DICT[instrument_name]
# Run model prediction
midi_audio, midi_control_params, midi_synth_params, conditioning_df = synthesize_mono_midi(synthesis_generator,
                                                                                           expression_generator,
                                                                                           midi_file, instrument_id,
                                                                                           output_dir=None)

synthesized_audio = midi_audio  # The synthesized audio

# -----Adjust note expression controls and re-synthesize-----

# Make all notes weak vibrato:
conditioning_df_changed = conditioning_df.copy()
note_vibrato = conditioning_df_changed['vibrato_extend'].value
conditioning_df_changed['vibrato_extend'] = np.ones_like(conditioning_df['vibrato_extend'].values) * 0.1
# Re-synthesize
midi_audio_changed, midi_control_params_changed, midi_synth_params_changed = conditioning_df_to_audio(
  synthesis_generator, conditioning_df_changed, tf.constant([instrument_id]))

synthesized_audio_changed = midi_audio_changed  # The synthesized audio

# There are 6 note expression controls in conditioning_df that you could change:
# 'amplitude_mean', 'amplitude_std', 'vibrato_extend', 'brightness', 'attack_level', 'amplitudes_max_pos'.
# Please refer to https://colab.research.google.com/github/magenta/midi-ddsp/blob/main/midi_ddsp/colab/MIDI_DDSP_Demo.ipynb#scrollTo=XfPPrdPu5sSy for the effect of each control. 

# -----Adjust synthesis parameters and re-synthesize-----

# The original synthesis parameters:
f0_ori = midi_synth_params['f0_hz']
amps_ori = midi_synth_params['amplitudes']
noise_ori = midi_synth_params['noise_magnitudes']
hd_ori = midi_synth_params['harmonic_distribution']

# TODO: make your change of the synthesis parameters here:
f0_changed = f0_ori
amps_changed = amps_ori
noise_changed = noise_ori
hd_changed = hd_ori

# Resynthesis the audio using DDSP
processor_group = get_process_group(midi_synth_params['amplitudes'].shape[1], use_angular_cumsum=True)
midi_audio_changed = processor_group({'amplitudes': amps_changed,
                                      'harmonic_distribution': hd_changed,
                                      'noise_magnitudes': noise_changed,
                                      'f0_hz': f0_changed, },
                                     verbose=False)
midi_audio_changed = synthesis_generator.reverb_module(midi_audio_changed, reverb_number=instrument_id, training=False)

synthesized_audio_changed = midi_audio_changed  # The synthesized audio
Comments
  • ImportError and AttributeError

    ImportError and AttributeError

    Hi! Very interesting work! I’m trying to run MIDI_DDSP_Demo.ipynb, and I encountered some errors.

    1. ImportError: cannot import name 'LD_RANGE' occurred in from ddsp.spectral_ops import F0_RANGE, LD_RANGE(see here). -> According to DDSP, I think 'DB_RANGE' is correct, not 'LD_RANGE' .

    2. AttributeError: module 'ddsp.spectral_ops' has no attribute 'amplitude_to_db' occurred where ddsp.spectral_ops.amplitude_to_db is used (see here). -> According to DDSP, I suppose it is 'ddsp.core', not 'ddsp.spectral_ops'.

    opened by MasayaKawamura 3
  • Question on the installation

    Question on the installation

    Hi,

    When I opened up a new colab notebook and tried to pip install the midi-ddsp package, it took over 40 minutes and the installation can not be completed. I didn't experience this until this week.

    I've tried pip install midi-ddsp and pip install git+https://github.com/magenta/midi-ddsp and both gave me the same results.

    It seemed like pip would spend a lot of time trying to find which version of etils was compatible.

    opened by tiianhk 2
  • Error when using

    Error when using "pip install midi-ddsp"

    Hello everyone,

    I'm trying to use midi-ddsp to synthesize a few .midi files. In order to achieve it, I create a virtual environment with:

    python3 -m venv .venv source .venv/bin/activate

    note: python version == 3.10.6

    After creating it I run: "pip install midi-ddsp" . Getting this error message:

    Collecting ddsp Using cached ddsp-1.9.0-py2.py3-none-any.whl (200 kB) Using cached ddsp-1.7.1-py2.py3-none-any.whl (199 kB) Using cached ddsp-1.7.0-py2.py3-none-any.whl (197 kB) Using cached ddsp-1.6.5-py2.py3-none-any.whl (194 kB) Using cached ddsp-1.6.3-py2.py3-none-any.whl (194 kB) Using cached ddsp-1.6.2-py2.py3-none-any.whl (194 kB) Using cached ddsp-1.6.0-py2.py3-none-any.whl (194 kB) Using cached ddsp-1.4.0-py2.py3-none-any.whl (192 kB) Using cached ddsp-1.3.1-py2.py3-none-any.whl (192 kB) Using cached ddsp-1.3.0-py2.py3-none-any.whl (183 kB) Using cached ddsp-1.2.0-py2.py3-none-any.whl (179 kB) Using cached ddsp-1.1.0-py2.py3-none-any.whl (175 kB) Using cached ddsp-1.0.1-py2.py3-none-any.whl (170 kB) Using cached ddsp-1.0.0-py2.py3-none-any.whl (168 kB) Using cached ddsp-0.14.0-py2.py3-none-any.whl (143 kB) Using cached ddsp-0.13.1-py2.py3-none-any.whl (129 kB) Using cached ddsp-0.13.0-py2.py3-none-any.whl (129 kB) Using cached ddsp-0.12.0-py2.py3-none-any.whl (127 kB) Using cached ddsp-0.10.0-py2.py3-none-any.whl (109 kB) Using cached ddsp-0.9.0-py2.py3-none-any.whl (109 kB) Using cached ddsp-0.8.0-py2.py3-none-any.whl (108 kB) Using cached ddsp-0.7.0-py2.py3-none-any.whl (107 kB) Using cached ddsp-0.5.1-py2.py3-none-any.whl (101 kB) Using cached ddsp-0.5.0-py2.py3-none-any.whl (101 kB) Using cached ddsp-0.4.0-py2.py3-none-any.whl (97 kB) Using cached ddsp-0.2.4-py2.py3-none-any.whl (89 kB) Using cached ddsp-0.2.3-py2.py3-none-any.whl (89 kB) Using cached ddsp-0.2.2-py2.py3-none-any.whl (89 kB) Using cached ddsp-0.2.0-py2.py3-none-any.whl (88 kB) Using cached ddsp-0.1.0-py3-none-any.whl (88 kB) Using cached ddsp-0.0.10-py3-none-any.whl (88 kB) Using cached ddsp-0.0.9-py3-none-any.whl (86 kB) Using cached ddsp-0.0.8-py3-none-any.whl (86 kB) Using cached ddsp-0.0.7-py3-none-any.whl (85 kB) Using cached ddsp-0.0.6-py2.py3-none-any.whl (91 kB) Using cached ddsp-0.0.5-py2.py3-none-any.whl (91 kB) Using cached ddsp-0.0.4-py2.py3-none-any.whl (83 kB) Using cached ddsp-0.0.3-py2.py3-none-any.whl (81 kB) Using cached ddsp-0.0.1-py2.py3-none-any.whl (75 kB) Using cached ddsp-0.0.0-py2.py3-none-any.whl (75 kB) INFO: pip is looking at multiple versions of midi-ddsp to determine which version is compatible with other requirements. This could take a while. Collecting midi-ddsp Using cached midi_ddsp-0.1.3-py3-none-any.whl (56 kB) Using cached midi_ddsp-0.1.1-py3-none-any.whl (56 kB) Using cached midi_ddsp-0.1.0-py3-none-any.whl (53 kB) ERROR: Cannot install midi-ddsp because these package versions have conflicting dependencies.

    The conflict is caused by: ddsp 3.4.4 depends on tensorflow ddsp 3.4.3 depends on tensorflow ddsp 3.4.1 depends on tensorflow ddsp 3.4.0 depends on tensorflow ddsp 3.3.6 depends on tensorflow ddsp 3.3.4 depends on tensorflow ddsp 3.3.2 depends on tensorflow ddsp 3.3.0 depends on tensorflow ddsp 3.2.1 depends on tensorflow ddsp 3.2.0 depends on tensorflow ddsp 3.1.0 depends on tensorflow ddsp 1.9.0 depends on tensorflow ddsp 1.7.1 depends on tensorflow ddsp 1.7.0 depends on tensorflow ddsp 1.6.5 depends on tensorflow ddsp 1.6.3 depends on tensorflow ddsp 1.6.2 depends on tensorflow ddsp 1.6.0 depends on tensorflow ddsp 1.4.0 depends on tensorflow ddsp 1.3.1 depends on tensorflow ddsp 1.3.0 depends on tensorflow ddsp 1.2.0 depends on tensorflow ddsp 1.1.0 depends on tensorflow ddsp 1.0.1 depends on tensorflow ddsp 1.0.0 depends on tensorflow ddsp 0.14.0 depends on tensorflow ddsp 0.13.1 depends on tensorflow ddsp 0.13.0 depends on tensorflow ddsp 0.12.0 depends on tensorflow ddsp 0.10.0 depends on tensorflow ddsp 0.9.0 depends on tensorflow ddsp 0.8.0 depends on tensorflow ddsp 0.7.0 depends on tensorflow ddsp 0.5.1 depends on tensorflow ddsp 0.5.0 depends on tensorflow ddsp 0.4.0 depends on tensorflow ddsp 0.2.4 depends on tensorflow ddsp 0.2.3 depends on tensorflow ddsp 0.2.2 depends on tensorflow ddsp 0.2.0 depends on tensorflow ddsp 0.1.0 depends on tensorflow ddsp 0.0.10 depends on tensorflow ddsp 0.0.9 depends on tensorflow ddsp 0.0.8 depends on tensorflow ddsp 0.0.7 depends on tensorflow ddsp 0.0.6 depends on tensorflow ddsp 0.0.5 depends on tensorflow ddsp 0.0.4 depends on tensorflow ddsp 0.0.3 depends on tensorflow ddsp 0.0.1 depends on tensorflow ddsp 0.0.0 depends on tensorflow>=2.1.0

    To fix this you could try to:

    1. loosen the range of package versions you've specified
    2. remove package versions to allow pip attempt to solve the dependency conflict

    ERROR: Resolution Impossible: for help visit https://pip.pypa.io/en/latest/topics/dependency-resolution/#dealing-with-dependency-conflicts

    ======== I am currently working from an M1 mac so I would be pleased if you could help me reaching a solution for this problem.

    Thank you in advance, Juan Carlos

    opened by JuanCarlosMartinezSevilla 2
  • Why is vibrato_rate not used?

    Why is vibrato_rate not used?

    In my opinion, vibrato_rate (peak frequency) is more plausible than vibrato_extend (peak amplitude) to represent pitch pulsating. Why is vibrato_rate not used?

    opened by bfs18 2
  • Docker image with all the necessary packages

    Docker image with all the necessary packages

    Hi again!

    I'm still trying to execute your code with the "pip install midi-ddsp".

    I'm using docker from a tensorflow/tensorflow:latest image and running the pip command. I get this error:

    "E tensorflow/stream_executor/cuda/cuda_blas.cc:2981] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered"

    I ask you guys in order to know if you work with docker... if I could be able to have access to the image you use so all versions are the same and to avoid these type of errors.

    Thank you, Best regards, Juan Carlos

    opened by JuanCarlosMartinezSevilla 0
  • How to distinguish whether it is cresc or decresc in fluctuation?

    How to distinguish whether it is cresc or decresc in fluctuation?

    Hi! I am interested in your project. Currently, I am doing similar stuff, but i have some questions. (1) I know that Fluctuation stands for how much does that note cresc or decresc. Peak means which position got the maximum energy. However, i am wondering how to know if that note is cresc or decresc. Let's assume our peak position is 0.4, and fluc is 0.3. Then, is it decrease 0.3 or increase 0.3? How to distinguish it? (2) I know that all the expressive control values are normalized between 0 and 1, but what's the unit measure in these expressive controls?

    Thanks for answering

    opened by pillow8781 2
  • What is ``input`` in the def call()?

    What is ``input`` in the def call()?

    Hi, I am looking inside the code. I've seen a lot of methods about def call(self, inputs) in your code, especially looking at this one.

      def call(self, inputs):
        synth_params = self.get_synth_params(inputs)
    

    However, I couldn't find out what's the calculation of inputs, there are some clues I've found. In those codes, inputs is respond to the data in get_fake_data_synthesis_generator, then what are the data and units you input to get_fake_data_synthesis_generator? Frames? Amplitude or anything else?

    Thanks!

    opened by Megan8821 4
  • How to solve the error when installing midi-ddsp?

    How to solve the error when installing midi-ddsp?

    Hi, I am new in training model, and I got some problems in here. Does anyone how to solve the error of error: subprocess-exited-with -error and error: metadata-generation-failed in the terminal?

    opened by Megan8821 4
  • tfrecord features

    tfrecord features

    Hello, I'm very interested in your amazing work. Took a deep look at the urmp tfrecord datasets, I found something a bit confusing for me. Could you be so kind to help?

    1. I checked some urmp_tfrecords data, I found that some of them contain the following features: {"audio", "f0_confidence", "f0_hz", "f0_time", "id", "instrument_id", "loudness_db", "note_active_frame_indices", "note_active_velocities", "note_offsets", "note_onsets", "orig_f0_hz", "orig_f0_time", "power_db", "recording_id", "sequence"}. However, some of them don't include {"orig_f0_hz", "orig_f0_time"} in their tfrecord data. Why is this so and does such an inconsistency influence the model training?
    2. I want to include piano music when I train my own model. To this end, I think I need to generate tfrecords that have the same content as the urmp ones you used in your model. I plan to use maestro dataset. Could you be so kind to indicate if there's a tfrecord data generation code that we can take as a reference? Like the one you used to generate tfrecords for the midi-ddsp model?
    3. What is the difference between "batched" and "unbatched" dataset?

    Thank you very much for your help in advance.

    opened by gladys0313 1
  • Can i test with custom data?

    Can i test with custom data?

    Hi, i am interested in this exciting project and i am trying to test this with our custom dataset and reproduce the format of original data. But there are some difficulties and questions below.

    1. Is there no way to use custom datasets at all?
    1. Is there any code to calculate elements of dataset below?
    • I want to know how to get "note_active_velocities", "note_active_frame_indices", "power_db", "note_onsets", "note_offsets" but there is no any code on repository.

    Thank you for reading!

    opened by 589hero 6
  • Question on Figure

    Question on Figure

    image

    quick question on this figure in the blog post: i know coconet is its own model that will generate subsequent melodies given the input midi file. however, should i decide to train midi ddsp, will the training of coconet also be a part of this? or should i expect a monophonic midi melody as input and the generated audio as output.

    thanks for all the help and this awesome project

    opened by theadamsabra 7
Releases(v0.2.5)
Owner
Magenta
An open source research project exploring the role of machine learning as a tool in the creative process.
Magenta
:notes: Cross-platform music player

Exaile Exaile is a music player with a simple interface and powerful music management capabilities. Features include automatic fetching of album art,

Exaile 327 Dec 19, 2022
A python program to cut longer MP3 files (i.e. recordings of several songs) into the individual tracks.

I'm writing a python script to cut longer MP3 files (i.e. recordings of several songs) into the individual tracks called ReCut. So far there are two

Dönerspiess 1 Oct 27, 2021
Datamoshing with FFmpeg

ffmosher Datamoshing with FFmpeg Drag and drop video onto mosh.bat to create a datamoshed video. To datamosh an image, please ensure the file is in a

18 Sep 11, 2022
Sound-Equalizer- This is a Sound Equalizer GUI App Using Python's PyQt5

Sound-Equalizer- This is a Sound Equalizer GUI App Using Python's PyQt5. It gives you the ability to play, pause, and Equalize any one-channel wav audio file and play 3 different instruments.

Mustafa Megahed 1 Jan 10, 2022
Python implementation of the Short Term Objective Intelligibility measure

Python implementation of STOI Implementation of the classical and extended Short Term Objective Intelligibility measures Intelligibility measure which

Pariente Manuel 250 Dec 21, 2022
This is a python package that turns any images into MIDI files that views the same as them

image_to_midi This is a python package that turns any images into MIDI files that views the same as them. This package firstly convert the image to AS

Rainbow Dreamer 4 Mar 10, 2022
A Python 3 script for capturing and recording a SDR stream to a WAV file (or serving it to a HTTP audio stream).

rfsoapyfile A Python 3 script for capturing and recording a SDR stream to a WAV file (or serving it to a HTTP audio stream). The script is threaded fo

4 Dec 19, 2022
Open-Source Tools & Data for Music Source Separation: A Pragmatic Guide for the MIR Practitioner

Open-Source Tools & Data for Music Source Separation: A Pragmatic Guide for the MIR Practitioner

IELab@ Korea University 0 Nov 12, 2021
Python tools for the corpus analysis of popular music.

CATCHY Corpus Analysis Tools for Computational Hook discovery Python tools for the corpus analysis of popular music recordings. The tools can be used

Jan VB 20 Aug 20, 2022
A voice assistant which can handle your everyday task and allows you to book items from your favourite store!

Voicely Table of Contents About The Project Built With Getting Started Prerequisites Installation Usage Roadmap Contributing License Contact Acknowled

Awantika Nigam 2 Nov 17, 2021
The official repository for Audio ALBERT

AALBERT Here is also the official repository of AALBERT, which is Pytorch lightning reimplementation of the paper, Audio ALBERT: A Lite Bert for Self-

pohan 55 Dec 11, 2022
Some utils for auto speech recognition

About Some utils for auto speech recognition. Utils Util Description Script Reset audio Reset sample rate, sample width, etc of audios.

1 Jan 24, 2022
Python module for handling audio metadata

Mutagen is a Python module to handle audio metadata. It supports ASF, FLAC, MP4, Monkey's Audio, MP3, Musepack, Ogg Opus, Ogg FLAC, Ogg Speex, Ogg The

Quod Libet 1.1k Dec 31, 2022
digital audio workstation, instrument and effect plugins, wave editor

digital audio workstation, instrument and effect plugins, wave editor

306 Jan 05, 2023
A python program for visualizing MIDI files, and displaying them in a spiral layout

SpiralMusic_python A python program for visualizing MIDI files, and displaying them in a spiral layout For a hardware version using Teensy & LED displ

Gavin 6 Nov 23, 2022
Multi-Track Music Generation with the Transfomer and the Johann Sebastian Bach Chorales dataset

MMM: Exploring Conditional Multi-Track Music Generation with the Transformer and the Johann Sebastian Bach Chorales Dataset. Implementation of the pap

102 Dec 08, 2022
A collection of python scripts for extracting and analyzing acoustics from audio files.

pyAcoustics A collection of python scripts for extracting and analyzing acoustics from audio files. Contents 1 Common Use Cases 2 Major revisions 3 Fe

Tim 74 Dec 26, 2022
Audio2midi - Automatic Audio-to-symbolic Arrangement

Automatic Audio-to-symbolic Arrangement This is the repository of the project "Audio-to-symbolic Arrangement via Cross-modal Music Representation Lear

Ziyu Wang 24 Dec 05, 2022
Code for paper 'Audio-Driven Emotional Video Portraits'.

Audio-Driven Emotional Video Portraits [CVPR2021] Xinya Ji, Zhou Hang, Kaisiyuan Wang, Wayne Wu, Chen Change Loy, Xun Cao, Feng Xu [Project] [Paper] G

197 Dec 31, 2022
Inner ear models for Python

cochlea cochlea is a collection of inner ear models. All models are easily accessible as Python functions. They take sound signal as input and return

98 Jan 05, 2023