Symbolic Music Generation with Diffusion Models

Last update: Jan 07, 2023

Supplementary code release for our work Symbolic Music Generation with Diffusion Models.

Installation

All code is written in Python 3 (Anaconda recommended). To install the dependencies:

pip install -r requirements.txt

A copy of the Magenta codebase is required for access to MusicVAE and related components. Installation instructions can be found on the Magenta public repository. You will also need to download pretrained MusicVAE checkpoints. For our experiments, we use the 2-bar melody model.

Datasets

We use the Lakh MIDI Dataset to train our models. Follow these instructions to download and build the Lakh MIDI Dataset.

To encode the Lakh dataset with MusicVAE, use scripts/generate_song_data_beam.py:

python scripts/generate_song_data_beam.py \
  --checkpoint=/path/to/musicvae-ckpt \
  --input=/path/to/lakh_tfrecords \
  --output=/path/to/encoded_tfrecords

To preprocess and generate fixed-length latent sequences for training diffusion and autoregressive models, refer to scripts/transform_encoded_data.py:

python scripts/transform_encoded_data.py \
  --encoded_data=/path/to/encoded_tfrecords \
  --output_path=/path/to/preprocess_tfrecords \
  --mode=sequences \
  --context_length=32
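The `--mode=sequences` and `--context_length=32` flags suggest that each song's stream of MusicVAE latents is cut into training examples of exactly 32 latents. The sketch below illustrates that idea in plain Python. It is not the code in scripts/transform_encoded_data.py: the function name `fixed_length_windows`, the `stride` parameter, and the choice to drop a trailing partial window are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence

Latent = Sequence[float]  # one MusicVAE embedding (one 2-bar segment)


def fixed_length_windows(
    latents: Sequence[Latent],
    context_length: int = 32,
    stride: Optional[int] = None,
) -> List[List[Latent]]:
    """Split one song's latent sequence into windows of exactly context_length.

    Assumption: a trailing partial window is dropped rather than padded.
    With stride=None the windows do not overlap.
    """
    if context_length <= 0:
        raise ValueError("context_length must be positive")
    step = context_length if stride is None else stride
    if step <= 0:
        raise ValueError("stride must be positive")

    windows = []
    last_start = len(latents) - context_length
    for start in range(0, last_start + 1, step):
        windows.append(list(latents[start:start + context_length]))
    return windows


if __name__ == "__main__":
    # A toy "song" of 70 latents, each with 4 dimensions.
    song = [[float(i)] * 4 for i in range(70)]
    print(len(fixed_length_windows(song)))            # non-overlapping windows
    print(len(fixed_length_windows(song, stride=1)))  # sliding windows
```

Songs shorter than the context length yield no windows under this scheme, which is one plausible way to guarantee that every training example has the same shape.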

Training

Diffusion

python train_ncsn.py --flagfile=configs/ddpm-mel-32seq-512.cfg

TransformerMDN

python train_mdn.py --flagfile=configs/mdn-mel-32seq-512.cfg
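Both training commands take all of their settings from a `--flagfile`. Assuming the scripts parse flags with absl, a flagfile is a text file with one `--name=value` entry per line, where blank lines and lines starting with `#` or `//` are ignored. The sketch below is a small stand-alone reader for inspecting such a file before launching a run. The helper `parse_flagfile_text` and the flag names in `EXAMPLE_CFG` are hypothetical; the shipped .cfg files define their own flags.

```python
from typing import Dict


def parse_flagfile_text(text: str) -> Dict[str, str]:
    """Return {flag_name: value} for absl-style flagfile text.

    Simplification: a bare flag with no '=' is recorded as "true";
    nested --flagfile directives are not followed.
    """
    flags: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#") or line.startswith("//"):
            continue
        name, has_value, value = line.lstrip("-").partition("=")
        flags[name] = value if has_value else "true"
    return flags


# Hypothetical contents, for illustration only.
EXAMPLE_CFG = """\
# Hypothetical contents; the real .cfg files define their own flags.
--batch_size=64
--learning_rate=1e-3

// A bare flag with no value is treated as a boolean switch.
--verbose
"""

if __name__ == "__main__":
    for name, value in sorted(parse_flagfile_text(EXAMPLE_CFG).items()):
        print(f"{name} = {value}")
```

To inspect a real config, read the file and pass its text to the function, for example `parse_flagfile_text(open("configs/ddpm-mel-32seq-512.cfg").read())`.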

Sampling and Generation

Diffusion

python sample_ncsn.py \
  --flagfile=configs/ddpm-mel-32seq-512.cfg \
  --sample_seed=42 \
  --sample_size=1000 \
  --sampling_dir=/path/to/latent-samples

TransformerMDN

python sample_mdn.py \
  --flagfile=configs/mdn-mel-32seq-512.cfg \
  --sample_seed=42 \
  --sample_size=1000 \
  --sampling_dir=/path/to/latent-samples

Decoding sequences

To convert sequences of embeddings (generated by diffusion or TransformerMDN models) to sequences of MIDI events, refer to scripts/sample_audio.py.

python scripts/sample_audio.py \
  --input=/path/to/latent-samples/[ncsn|mdn] \
  --output=/path/to/audio-midi \
  --n_synth=1000 \
  --include_wav=True

Citing

If you use this code, please cite it as:

@inproceedings{
  mittal2021symbolicdiffusion,
  title={Symbolic Music Generation with Diffusion Models},
  author={Gautam Mittal and Jesse Engel and Curtis Hawthorne and Ian Simon},
  booktitle={Proceedings of the 22nd International Society for Music Information Retrieval Conference},
  year={2021},
  url={https://archives.ismir.net/ismir2021/paper/000058.pdf}
}

Note

This is not an official Google product.

Owner

Magenta
