Symbolic Music Generation with Diffusion Models

Overview

Symbolic Music Generation with Diffusion Models

Supplementary code release for our work Symbolic Music Generation with Diffusion Models.

Installation

All code is written in Python 3 (Anaconda recommended). To install the dependencies:

pip install -r requirements.txt

A copy of the Magenta codebase is required for access to MusicVAE and related components. Installation instructions can be found on the Magenta public repository. You will also need to download pretrained MusicVAE checkpoints. For our experiments, we use the 2-bar melody model.

Datasets

We use the Lakh MIDI Dataset to train our models. Follow these instructions to download and build the Lakh MIDI Dataset.

To encode the Lakh dataset with MusicVAE, use scripts/generate_song_data_beam.py:

python scripts/generate_song_data_beam.py \
  --checkpoint=/path/to/musicvae-ckpt \
  --input=/path/to/lakh_tfrecords \
  --output=/path/to/encoded_tfrecords

To preprocess and generate fixed-length latent sequences for training diffusion and autoregressive models, refer to scripts/transform_encoded_data.py:

python scripts/transform_encoded_data.py \
  --encoded_data=/path/to/encoded_tfrecords \
  --output_path =/path/to/preprocess_tfrecords \
  --mode=sequences \
  --context_length=32

Training

Diffusion

python train_ncsn.py --flagfile=configs/ddpm-mel-32seq-512.cfg

TransformerMDN

python train_mdn.py --flagfile=configs/mdn-mel-32seq-512.cfg

Sampling and Generation

Diffusion

python sample_ncsn.py \
  --flagfile=configs/ddpm-mel-32seq-512.cfg \
  --sample_seed=42 \
  --sample_size=1000 \
  --sampling_dir=/path/to/latent-samples 

TransformerMDN

python sample_ncsn.py \
  --flagfile=configs/mdn-mel-32seq-512.cfg \
  --sample_seed=42 \
  --sample_size=1000 \
  --sampling_dir=/path/to/latent-samples 

Decoding sequences

To convert sequences of embeddings (generated by diffusion or TransformerMDN models) to sequences of MIDI events, refer to scripts/sample_audio.py.

python scripts/sample_audio.py
  --input=/path/to/latent-samples/[ncsn|mdn] \
  --output=/path/to/audio-midi \
  --n_synth=1000 \
  --include_wav=True

Citing

If you use this code please cite it as:

@inproceedings{
  mittal2021symbolicdiffusion,
  title={Symbolic Music Generation with Diffusion Models},
  author={Gautam Mittal and Jesse Engel and Curtis Hawthorne and Ian Simon},
  booktitle={Proceedings of the 22nd International Society for Music Information Retrieval Conference},
  year={2021},
  url={https://archives.ismir.net/ismir2021/paper/000058.pdf}
}

Note

This is not an official Google product.

Owner
Magenta
An open source research project exploring the role of machine learning as a tool in the creative process.
Magenta
A module for solving and visualizing Schrödinger equation.

qmsolve This is an attempt at making a solid, easy to use solver, capable of solving and visualize the Schrödinger equation for multiple particles, an

506 Dec 28, 2022
Session-based Recommendation, CoHHN, price preferences, interest preferences, Heterogeneous Hypergraph, Co-guided Learning, SIGIR2022

This is our implementation for the paper: Price DOES Matter! Modeling Price and Interest Preferences in Session-based Recommendation Xiaokun Zhang, Bo

Xiaokun Zhang 27 Dec 02, 2022
Code for ACM MM 2020 paper "NOH-NMS: Improving Pedestrian Detection by Nearby Objects Hallucination"

NOH-NMS: Improving Pedestrian Detection by Nearby Objects Hallucination The offical implementation for the "NOH-NMS: Improving Pedestrian Detection by

Tencent YouTu Research 64 Nov 11, 2022
Implementation of TimeSformer, a pure attention-based solution for video classification

TimeSformer - Pytorch Implementation of TimeSformer, a pure and simple attention-based solution for reaching SOTA on video classification.

Phil Wang 602 Jan 03, 2023
A Python library for adversarial machine learning focusing on benchmarking adversarial robustness.

ARES This repository contains the code for ARES (Adversarial Robustness Evaluation for Safety), a Python library for adversarial machine learning rese

Tsinghua Machine Learning Group 377 Dec 20, 2022
9th place solution in "Santa 2020 - The Candy Cane Contest"

Santa 2020 - The Candy Cane Contest My solution in this Kaggle competition "Santa 2020 - The Candy Cane Contest", 9th place. Basic Strategy In this co

toshi_k 22 Nov 26, 2021
Keras implementations of Generative Adversarial Networks.

This repository has gone stale as I unfortunately do not have the time to maintain it anymore. If you would like to continue the development of it as

Erik Linder-Norén 8.9k Jan 04, 2023
This tool uses Deep Learning to help you draw and write with your hand and webcam.

This tool uses Deep Learning to help you draw and write with your hand and webcam. A Deep Learning model is used to try to predict whether you want to have 'pencil up' or 'pencil down'.

lmagne 169 Dec 10, 2022
The deployment framework aims to provide a simple, lightweight, fast integrated, pipelined deployment framework that ensures reliability, high concurrency and scalability of services.

savior是一个能够进行快速集成算法模块并支持高性能部署的轻量开发框架。能够帮助将团队进行快速想法验证(PoC),避免重复的去github上找模型然后复现模型;能够帮助团队将功能进行流程拆解,很方便的提高分布式执行效率;能够有效减少代码冗余,减少不必要负担。

Tao Luo 125 Dec 22, 2022
SpeechNAS Better Trade off between Latency and Accuracy for Large Scale Speaker Verification

SpeechNAS Better Trade off between Latency and Accuracy for Large Scale Speaker Verification

Wentao Zhu 24 May 20, 2022
Voice Conversion by CycleGAN (语音克隆/语音转换):CycleGAN-VC3

CycleGAN-VC3-PyTorch 中文说明 | English This code is a PyTorch implementation for paper: CycleGAN-VC3: Examining and Improving CycleGAN-VCs for Mel-spectr

Kun Ma 110 Dec 24, 2022
Garbage Detection system which will detect objects based on whether it is plastic waste or plastics or just garbage.

Garbage Detection using Yolov5 on Jetson Nano 2gb Developer Kit. Garbage detection system which will detect objects based on whether it is plastic was

Rishikesh A. Bondade 2 May 13, 2022
official implemntation for "Contrastive Learning with Stronger Augmentations"

CLSA CLSA is a self-supervised learning methods which focused on the pattern learning from strong augmentations. Copyright (C) 2020 Xiao Wang, Guo-Jun

Lab for MAchine Perception and LEarning (MAPLE) 47 Nov 29, 2022
Sarus implementation of classical ML models. The models are implemented using the Keras API of tensorflow 2. Vizualization are implemented and can be seen in tensorboard.

Sarus published models Sarus implementation of classical ML models. The models are implemented using the Keras API of tensorflow 2. Vizualization are

Sarus Technologies 39 Aug 19, 2022
Intent parsing and slot filling in PyTorch with seq2seq + attention

PyTorch Seq2Seq Intent Parsing Reframing intent parsing as a human - machine translation task. Work in progress successor to torch-seq2seq-intent-pars

Sean Robertson 160 Jan 07, 2023
A small demonstration of using WebDataset with ImageNet and PyTorch Lightning

A small demonstration of using WebDataset with ImageNet and PyTorch Lightning This is a small repo illustrating how to use WebDataset on ImageNet. usi

50 Dec 16, 2022
A repository for storing njxzc final exam review material

文档地址,请戳我 👈 👈 👈 ☀️ 1.Reason 大三上期末复习软件工程的时候,发现其他高校在GitHub上开源了他们学校的期末试题,我很受触动。期末

GuJiakai 2 Jan 18, 2022
Adaptation through prediction: multisensory active inference torque control

Adaptation through prediction: multisensory active inference torque control Submitted to IEEE Transactions on Cognitive and Developmental Systems Abst

Cristian Meo 1 Nov 07, 2022
A few stylization coreML models that I've trained with CreateML

CoreML-StyleTransfer A few stylization coreML models that I've trained with CreateML You can open and use the .mlmodel files in the "models" folder in

Doron Adler 8 Aug 18, 2022
A Python multilingual toolkit for Sentiment Analysis and Social NLP tasks

pysentimiento: A Python toolkit for Sentiment Analysis and Social NLP tasks A Transformer-based library for SocialNLP classification tasks. Currently

298 Jan 07, 2023