FIGARO: Generating Symbolic Music with Fine-Grained Artistic Control

Related tags

Deep Learningfigaro
Overview

FIGARO: Generating Symbolic Music with Fine-Grained Artistic Control

by Dimitri von Rรผtte, Luca Biggio, Yannic Kilcher, Thomas Hofmann

Getting started

Prerequisites:

  • Python 3.9
  • Conda

Setup

  1. Clone this repository to your disk
  2. Install required packages (see requirements.txt). With Conda:
conda create --name figaro python=3.9
conda activate figaro
pip install -r requirements.txt

Preparing the Data

To train models and to generate new samples, we use the Lakh MIDI dataset (altough any collection of MIDI files can be used).

  1. Download (size: 1.6GB) and extract the archive file:
wget http://hog.ee.columbia.edu/craffel/lmd/lmd_full.tar.gz
tar -xzf lmd_full.tar.gz
  1. You may wish to remove the archive file now: rm lmd_full.tar.gz

Download Pre-Trained Models

If you don't wish to train your own models, you can download our pre-trained models.

  1. Download (size: 2.3GB) and extract the archive file:
wget -O checkpoints.zip https://polybox.ethz.ch/index.php/s/a0HUHzKuPPefWkW/download
unzip checkpoints.zip
  1. You may wish to remove the archive file now: rm checkpoints.zip

Training

Training arguments such as model type, batch size, model params are passed to the training scripts via environment variables.

Available model types are:

  • vq-vae: VQ-VAE model used for the learned desription
  • figaro: FIGARO with both the expert and learned description
  • figaro-expert: FIGARO with only the expert description
  • figaro-learned: FIGARO with only the learned description
  • figaro-no-inst: FIGARO (expert) without instruments
  • figaro-no-chord: FIGARO (expert) without chords
  • figaro-no-meta: FIGARO (expert) without style (meta) information
  • baseline: Unconditional decoder-only baseline following Huang et al. (2018)

Example invocation of the training script is given by the following command:

MODEL=figaro-expert python src/train.py

For models using the learned description (figaro and figaro-learned), a pre-trained VQ-VAE checkpoint needs to be provided as well:

MODEL=figaro VAE_CHECKPOINT=./checkpoints/vq-vae.ckpt python src/train.py

Generation

To generate samples, make sure you have a trained checkpoint prepared (either download one or train it yourself). For this script, make sure that the dataset is prepared according to Preparing the Data. This is needed to extract descriptions, based on which new samples can be generated.

An example invocation of the generation script is given by the following command:

MODEL=figaro-expert CHECKPOINT=./checkpoints/figaro-expert.ckpt python src/generate.py

For models using the learned description (figaro and figaro-learned), a pre-trained VQ-VAE checkpoint needs to be provided as well:

MODEL=figaro CHECKPOINT=./checkpoints/figaro.ckpt VAE_CHECKPOINT=./checkpoints/vq-vae.ckpt python src/generate.py

Evaluation

We provide the evaluation scripts used to calculate the desription metrics on some set of generated samples. Refer to the previous section for how to generate samples yourself.

Example usage:

SAMPLE_DIR=./samples/figaro-expert python src/evaluate.py

Parameters

The following environment variables are available for controlling hyperparameters beyond their default value.

Training (train.py)

Model

Variable Description Default value
MODEL Model architecture to be trained
D_MODEL Hidden size of the model 512
CONTEXT_SIZE Number of tokens in the context to be passed to the auto-encoder 256
D_LATENT [VQ-VAE] Dimensionality of the latent space 1024
N_CODES [VQ-VAE] Codebook size 2048
N_GROUPS [VQ-VAE] Number of groups to split the latent vector into before discretization 16

Optimization

Variable Description Default value
EPOCHS Max. number of training epochs 16
MAX_TRAINING_STEPS Max. number of training iterations 100,000
BATCH_SIZE Number of samples in each batch 128
TARGET_BATCH_SIZE Number of samples in each backward step, gradients will be accumulated over TARGET_BATCH_SIZE//BATCH_SIZE batches 256
WARMUP_STEPS Number of learning rate warmup steps 4000
LEARNING_RATE Initial learning rate, will be decayed after constant warmup of WARMUP_STEPS steps 1e-4

Others

Variable Description Default value
CHECKPOINT Path to checkpoint from which to resume training
VAE_CHECKPOINT Path to VQ-VAE checkpoint to be used for the learned description
ROOT_DIR The folder containing MIDI files to train on ./lmd_full
OUTPUT_DIR Folder for saving checkpoints ./results
LOGGING_DIR Folder for saving logs ./logs
N_WORKERS Number of workers to be used for the dataloader available CPUs

Generation (generate.py)

Variable Description Default value
MODEL Specify which model will be loaded
CHECKPOINT Path to the checkpoint for the specified model
VAE_CHECKPOINT Path to the VQ-VAE checkpoint to be used for the learned description (if applicable)
ROOT_DIR Folder containing MIDI files to extract descriptions from ./lmd_full
OUTPUT_DIR Folder to save generated MIDI samples to ./samples
MAX_ITER Max. number of tokens that should be generated 16,000
MAX_BARS Max. number of bars that should be generated 32
MAKE_MEDLEYS Set to True if descriptions should be combined into medleys. False
N_MEDLEY_PIECES Number of pieces to be combined into one 2
N_MEDLEY_BARS Number of bars to take from each piece 16
VERBOSE Logging level, set to 0 for silent execution 2

Evaluation (evaluate.py)

Variable Description Default value
SAMPLE_DIR Folder containing generated samples which should be evaluated ./samples
OUT_FILE CSV file to which a detailed log of all metrics will be saved to ./metrics.csv
MAX_SAMPLES Limit the number of samples to be used for computing evaluation metrics 1024
Owner
Dimitri
Dimitri
RODD: A Self-Supervised Approach for Robust Out-of-Distribution Detection

RODD Official Implementation of 2022 CVPRW Paper RODD: A Self-Supervised Approach for Robust Out-of-Distribution Detection Introduction: Recent studie

Umar Khalid 17 Oct 11, 2022
Official repository of my book: "Deep Learning with PyTorch Step-by-Step: A Beginner's Guide"

This is the official repository of my book "Deep Learning with PyTorch Step-by-Step". Here you will find one Jupyter notebook for every chapter in the book.

Daniel Voigt Godoy 340 Jan 01, 2023
Enhancing Aspect-Based Sentiment Analysis with Supervised Contrastive Learning.

Enhancing Aspect-Based Sentiment Analysis with Supervised Contrastive Learning. Enhancing Aspect-Based Sentiment Analysis with Supervised Contrastive

<a href=[email protected](SZ)"> 7 Dec 16, 2021
links and status of cool gradio demos

awesome-demos This is a list of some wonderful demos & applications built with Gradio. Here's how to contribute yours! ๐Ÿ–Š๏ธ Natural language processing

Gradio 96 Dec 30, 2022
[CVPR 2020] 3D Photography using Context-aware Layered Depth Inpainting

[CVPR 2020] 3D Photography using Context-aware Layered Depth Inpainting [Paper] [Project Website] [Google Colab] We propose a method for converting a

Virginia Tech Vision and Learning Lab 6.2k Jan 01, 2023
A Demo server serving Bert through ONNX with GPU written in Rust with <3

Demo BERT ONNX server written in rust This demo showcase the use of onnxruntime-rs on BERT with a GPU on CUDA 11 served by actix-web and tokenized wit

Xavier Tao 28 Jan 01, 2023
The source code of CVPR17 'Generative Face Completion'.

GenerativeFaceCompletion Matcaffe implementation of our CVPR17 paper on face completion. In each panel from left to right: original face, masked input

Yijun Li 313 Oct 18, 2022
Pre-Training Graph Neural Networks for Cold-Start Users and Items Representation.

Pretrain-Recsys This is our Tensorflow implementation for our WSDM 2021 paper: Bowen Hao, Jing Zhang, Hongzhi Yin, Cuiping Li, Hong Chen. Pre-Training

30 Nov 14, 2022
Library of various Few-Shot Learning frameworks for text classification

FewShotText This repository contains code for the paper A Neural Few-Shot Text Classification Reality Check Environment setup # Create environment pyt

Thomas Dopierre 47 Jan 03, 2023
Download & Install mods for your favorit game with a few simple clicks

Husko's SteamWorkshop Downloader ๐Ÿ”ด IMPORTANT โ— ๐Ÿ”ด The Tool is currently being rewritten so updates will be slow and only on the dev branch until it i

Husko 67 Nov 25, 2022
ACL'2021: LM-BFF: Better Few-shot Fine-tuning of Language Models

LM-BFF (Better Few-shot Fine-tuning of Language Models) This is the implementation of the paper Making Pre-trained Language Models Better Few-shot Lea

Princeton Natural Language Processing 607 Jan 07, 2023
Modular Probabilistic Programming on MXNet

MXFusion | | | | Tutorials | Documentation | Contribution Guide MXFusion is a modular deep probabilistic programming library. With MXFusion Modules yo

Amazon 100 Dec 10, 2022
A TensorFlow 2.x implementation of Masked Autoencoders Are Scalable Vision Learners

Masked Autoencoders Are Scalable Vision Learners A TensorFlow implementation of Masked Autoencoders Are Scalable Vision Learners [1]. Our implementati

Aritra Roy Gosthipaty 59 Dec 10, 2022
This is the code of NeurIPS'21 paper "Towards Enabling Meta-Learning from Target Models".

ST This is the code of NeurIPS 2021 paper "Towards Enabling Meta-Learning from Target Models". If you use any content of this repo for your work, plea

Su Lu 7 Dec 06, 2022
Stochastic Tensor Optimization for Robot Motion - A GPU Robot Motion Toolkit

STORM Stochastic Tensor Optimization for Robot Motion - A GPU Robot Motion Toolkit [Install Instructions] [Paper] [Website] This package contains code

NVIDIA Research Projects 101 Dec 12, 2022
A learning-based data collection tool for human segmentation

FullBodyFilter A Learning-Based Data Collection Tool For Human Segmentation Contents Documentation Source Code and Scripts Overview of Project Usage O

Robert Jiang 4 Jun 24, 2022
OptNet: Differentiable Optimization as a Layer in Neural Networks

OptNet: Differentiable Optimization as a Layer in Neural Networks This repository is by Brandon Amos and J. Zico Kolter and contains the PyTorch sourc

CMU Locus Lab 428 Dec 24, 2022
Implementation of Kaneko et al.'s MaskCycleGAN-VC model for non-parallel voice conversion.

MaskCycleGAN-VC Unofficial PyTorch implementation of Kaneko et al.'s MaskCycleGAN-VC (2021) for non-parallel voice conversion. MaskCycleGAN-VC is the

86 Dec 25, 2022
Official implementation of the paper "Topographic VAEs learn Equivariant Capsules"

Topographic Variational Autoencoder Paper: https://arxiv.org/abs/2109.01394 Getting Started Install requirements with Anaconda: conda env create -f en

T. Andy Keller 69 Dec 12, 2022