Symbolic Music Generation with Diffusion Models

Overview

Symbolic Music Generation with Diffusion Models

Supplementary code release for our work Symbolic Music Generation with Diffusion Models.

Installation

All code is written in Python 3 (Anaconda recommended). To install the dependencies:

pip install -r requirements.txt

A copy of the Magenta codebase is required for access to MusicVAE and related components. Installation instructions can be found on the Magenta public repository. You will also need to download pretrained MusicVAE checkpoints. For our experiments, we use the 2-bar melody model.

Datasets

We use the Lakh MIDI Dataset to train our models. Follow these instructions to download and build the Lakh MIDI Dataset.

To encode the Lakh dataset with MusicVAE, use scripts/generate_song_data_beam.py:

python scripts/generate_song_data_beam.py \
  --checkpoint=/path/to/musicvae-ckpt \
  --input=/path/to/lakh_tfrecords \
  --output=/path/to/encoded_tfrecords

To preprocess and generate fixed-length latent sequences for training diffusion and autoregressive models, refer to scripts/transform_encoded_data.py:

python scripts/transform_encoded_data.py \
  --encoded_data=/path/to/encoded_tfrecords \
  --output_path =/path/to/preprocess_tfrecords \
  --mode=sequences \
  --context_length=32

Training

Diffusion

python train_ncsn.py --flagfile=configs/ddpm-mel-32seq-512.cfg

TransformerMDN

python train_mdn.py --flagfile=configs/mdn-mel-32seq-512.cfg

Sampling and Generation

Diffusion

python sample_ncsn.py \
  --flagfile=configs/ddpm-mel-32seq-512.cfg \
  --sample_seed=42 \
  --sample_size=1000 \
  --sampling_dir=/path/to/latent-samples 

TransformerMDN

python sample_ncsn.py \
  --flagfile=configs/mdn-mel-32seq-512.cfg \
  --sample_seed=42 \
  --sample_size=1000 \
  --sampling_dir=/path/to/latent-samples 

Decoding sequences

To convert sequences of embeddings (generated by diffusion or TransformerMDN models) to sequences of MIDI events, refer to scripts/sample_audio.py.

python scripts/sample_audio.py
  --input=/path/to/latent-samples/[ncsn|mdn] \
  --output=/path/to/audio-midi \
  --n_synth=1000 \
  --include_wav=True

Citing

If you use this code please cite it as:

@inproceedings{
  mittal2021symbolicdiffusion,
  title={Symbolic Music Generation with Diffusion Models},
  author={Gautam Mittal and Jesse Engel and Curtis Hawthorne and Ian Simon},
  booktitle={Proceedings of the 22nd International Society for Music Information Retrieval Conference},
  year={2021},
  url={https://archives.ismir.net/ismir2021/paper/000058.pdf}
}

Note

This is not an official Google product.

Owner
Magenta
An open source research project exploring the role of machine learning as a tool in the creative process.
Magenta
A package to predict protein inter-residue geometries from sequence data

trRosetta This package is a part of trRosetta protein structure prediction protocol developed in: Improved protein structure prediction using predicte

Ivan Anishchenko 185 Jan 07, 2023
Official Pytorch Implementation of 'Learning Action Completeness from Points for Weakly-supervised Temporal Action Localization' (ICCV-21 Oral)

Learning-Action-Completeness-from-Points Official Pytorch Implementation of 'Learning Action Completeness from Points for Weakly-supervised Temporal A

Pilhyeon Lee 67 Jan 03, 2023
Experimental code for paper: Generative Adversarial Networks as Variational Training of Energy Based Models

Experimental code for paper: Generative Adversarial Networks as Variational Training of Energy Based Models, under review at ICLR 2017 requirements: T

Shuangfei Zhai 18 Mar 05, 2022
Kindle is an easy model build package for PyTorch.

Kindle is an easy model build package for PyTorch. Building a deep learning model became so simple that almost all model can be made by copy and paste from other existing model codes. So why code? wh

Jongkuk Lim 77 Nov 11, 2022
Conservative and Adaptive Penalty for Model-Based Safe Reinforcement Learning

Conservative and Adaptive Penalty for Model-Based Safe Reinforcement Learning This is the official repository for Conservative and Adaptive Penalty fo

7 Nov 22, 2022
Fine-tuning StyleGAN2 for Cartoon Face Generation

Cartoon-StyleGAN 🙃 : Fine-tuning StyleGAN2 for Cartoon Face Generation Abstract Recent studies have shown remarkable success in the unsupervised imag

Jihye Back 520 Jan 04, 2023
Using deep learning to predict gene structures of the coding genes in DNA sequences of Arabidopsis thaliana

DeepGeneAnnotator: A tool to annotate the gene in the genome The master thesis of the "Using deep learning to predict gene structures of the coding ge

Ching-Tien Wang 3 Sep 09, 2022
This is the dataset for testing the robustness of various VO/VIO methods

KAIST VIO dataset This is the dataset for testing the robustness of various VO/VIO methods You can download the whole dataset on KAIST VIO dataset Ind

1 Sep 01, 2022
Implementation of the Triangle Multiplicative module, used in Alphafold2 as an efficient way to mix rows or columns of a 2d feature map, as a standalone package for Pytorch

Triangle Multiplicative Module - Pytorch Implementation of the Triangle Multiplicative module, used in Alphafold2 as an efficient way to mix rows or c

Phil Wang 22 Oct 28, 2022
The official re-implementation of the Neurips 2021 paper, "Targeted Neural Dynamical Modeling".

Targeted Neural Dynamical Modeling Note: This is a re-implementation (in Tensorflow2) of the original TNDM model. We do not plan to further update the

6 Oct 05, 2022
BMW TechOffice MUNICH 148 Dec 21, 2022
python debugger and anti-vm that checks if you're in a virtual machine or if someones trying to debug your file

Anti-Debug was made by Love ❌ code ✅ 🎉 ・What it checks for ・ Kills tools that can be used to debug your file ・ Exits if ran in vm (supports different

Rdimo 31 Aug 09, 2022
Code release for NeuS

NeuS We present a novel neural surface reconstruction method, called NeuS, for reconstructing objects and scenes with high fidelity from 2D image inpu

Peng Wang 813 Jan 04, 2023
Codebase for Time-series Generative Adversarial Networks (TimeGAN)

Codebase for Time-series Generative Adversarial Networks (TimeGAN)

Jinsung Yoon 532 Dec 31, 2022
Official PyTorch implementation of the paper Image-Based CLIP-Guided Essence Transfer.

TargetCLIP- official pytorch implementation of the paper Image-Based CLIP-Guided Essence Transfer This repository finds a global direction in StyleGAN

Hila Chefer 221 Dec 13, 2022
Code for Temporally Abstract Partial Models

Code for Temporally Abstract Partial Models Accompanies the code for the experimental section of the paper: Temporally Abstract Partial Models, Khetar

DeepMind 19 Jul 13, 2022
using STGCN to achieve egg classification task

EEG Classification   The task requires us to classify electroencephalography(EEG) into six categories, including human body, human face, animal body,

4 Jun 13, 2022
Dimension Reduced Turbulent Flow Data From Deep Vector Quantizers

Dimension Reduced Turbulent Flow Data From Deep Vector Quantizers This is an implementation of A Physics-Informed Vector Quantized Autoencoder for Dat

DreamSoul 3 Sep 12, 2022
docTR by Mindee (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.

docTR by Mindee (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.

Mindee 1.5k Jan 01, 2023
Deep Learning Tutorial for Kaggle Ultrasound Nerve Segmentation competition, using Keras

Deep Learning Tutorial for Kaggle Ultrasound Nerve Segmentation competition, using Keras This tutorial shows how to use Keras library to build deep ne

Marko Jocić 922 Dec 19, 2022