Supplementary code for the experiments described in the 2021 ISMIR submission: Leveraging Hierarchical Structures for Few Shot Musical Instrument Recognition.

Overview

Music Trees

Supplementary code for the experiments described in the 2021 ISMIR submission: Leveraging Hierarchical Structures for Few Shot Musical Instrument Recognition.

train-test splits and hierarchies.

  • For all experiments, we used the instrument-based split in /music_trees/assets/partitions/mdb-aug.json.
  • To view our Hornbostel-Sachs class hierarchy, see /music_trees/assets/taxonomies/deeper-mdb.yaml. Note that not all of the instruments on this taxonomy are used in our experiments.
  • All random taxonomies are in /music_trees/assets/taxonomies/scrambled-*.yaml

Installation

first, clone the medleydb repo and install using pip install -e:

  • medleydb from marl

Now, download the medleydb and mdb 2.0 datasets from zenodo.

install some utilities for visualizing the embedding space:

git clone https://github.com/hugofloresgarcia/embviz.git
cd embviz
pip install -e .

then, clone this repo and install with

pip install -e .

Usage

1. Generate data

Make sure the MEDLEYDB_PATH environment variable is set (see the medleydb repo for more instructions ). Then, run the generation script:

python -m music_trees.generate \
                --dataset mdb \
                --name mdb-aug \
                --example_length 1.0 \
                --augment true \
                --hop_length 0.5 \
                --sample_rate 16000 \

This will generate both augmented and unaugmented data for MedleyDB. NOTE: There was a bug in the code that disabled data augmentation silently. This bug has been left in the code for the sake of reproducibility. This is why we don't report any data augmentation in the paper, as none was applied at the time of experiments.

2. Partition data

The partition file used for all experiments is available at /music_trees/assets/partitions/mdb-aug.json.

3. Run experiments

The search script will train all models for a particular experiment. It will grab as many GPUs are available (use CUDA_VISIBLE_DEVICES to change the availability of GPUs) and train as many models as it can in parallel.

Each model will be stored under /runs/<NAME>/<VERSION>.

Arbitrary Hierarchies

python music_trees/search.py --name scrambled-tax

Height Search (note that height=0 and height=1 are the baseline and proposed model, respectively)

python music_trees/search.py --name height-v1

Loss Ablation

python music_trees/search.py --name loss-alpha

train the additional BCE baseline:

python music_trees/train.py --model_name hprotonet --height 4 --d_root 128 --loss_alpha 1 --name "flat (BCE)" --dataset mdb-aug --learning_rate 0.03 --loss_weight_fn cross-entropy

4. Evaluate

Perform evaluation on a model. Make sure to pass the path to the run that you wish to evaluate.

To evaluate a model:

python music_trees/eval.py --exp_dir <PATH_TO_RUN>/<VERSION>

Each model will store its evaluation results under /results/<NAME>/<VERSION>

5. Analyze

To compare models and generate analysis figures and tables, place of all the results folders you would like to analyze under a single folder. The resulting folder should look like this:

my_experiment/trial1/version_0
my_experiment/trial2/version_0
my_experiment/trial3/version_0

Then, run analysis using

python music_trees analyze.py my_experiment   <OUTPUT_NAME> 

the figures will be created under /analysis/<OUTPUT_NAME>

To generate paper-ready figures, see scripts/figures.ipynb.

Owner
Hugo Flores García
PhD @interactiveaudiolab
Hugo Flores García
Simple sinc interpolation in PyTorch.

Kazane: simple sinc interpolation for 1D signal in PyTorch Kazane utilize FFT based convolution to provide fast sinc interpolation for 1D signal when

Chin-Yun Yu 10 May 03, 2022
Python-based Informatics Kit for Analysing Chemical Units

INSTALLATION Python-based Informatics Kit for the Analysis of Chemical Units Step 1: Make a conda environment: conda create -n pikachu python=3.9 cond

47 Dec 23, 2022
Official implementation of CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification

CrossViT This repository is the official implementation of CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification. ArXiv If

International Business Machines 168 Dec 29, 2022
Text-to-Image generation

Generate vivid Images for Any (Chinese) text CogView is a pretrained (4B-param) transformer for text-to-image generation in general domain. Read our p

THUDM 1.3k Dec 29, 2022
Auxiliary Raw Net (ARawNet) is a ASVSpoof detection model taking both raw waveform and handcrafted features as inputs, to balance the trade-off between performance and model complexity.

Overview This repository is an implementation of the Auxiliary Raw Net (ARawNet), which is ASVSpoof detection system taking both raw waveform and hand

6 Jul 08, 2022
Post-Training Quantization for Vision transformers.

PTQ4ViT Post-Training Quantization Framework for Vision Transformers. We use the twin uniform quantization method to reduce the quantization error on

Zhihang Yuan 61 Dec 28, 2022
Official repo for BMVC2021 paper ASFormer: Transformer for Action Segmentation

ASFormer: Transformer for Action Segmentation This repo provides training & inference code for BMVC 2021 paper: ASFormer: Transformer for Action Segme

42 Dec 23, 2022
A tiny, pedagogical neural network library with a pytorch-like API.

candl A tiny, pedagogical implementation of a neural network library with a pytorch-like API. The primary use of this library is for education. Use th

Sri Pranav 3 May 23, 2022
Model Zoo for MindSpore

Welcome to the Model Zoo for MindSpore In order to facilitate developers to enjoy the benefits of MindSpore framework, we will continue to add typical

MindSpore 226 Jan 07, 2023
Stochastic Tensor Optimization for Robot Motion - A GPU Robot Motion Toolkit

STORM Stochastic Tensor Optimization for Robot Motion - A GPU Robot Motion Toolkit [Install Instructions] [Paper] [Website] This package contains code

NVIDIA Research Projects 101 Dec 12, 2022
LLVM-based compiler for LightGBM gradient-boosted trees. Speeds up prediction by ≥10x.

LLVM-based compiler for LightGBM gradient-boosted trees. Speeds up prediction by ≥10x.

Simon Boehm 183 Jan 02, 2023
Clockwork Variational Autoencoder

Clockwork Variational Autoencoders (CW-VAE) Vaibhav Saxena, Jimmy Ba, Danijar Hafner If you find this code useful, please reference in your paper: @ar

Vaibhav Saxena 35 Nov 06, 2022
Action Recognition for Self-Driving Cars

Action Recognition for Self-Driving Cars This repo contains the codes for the 2021 Fall semester project "Action Recognition for Self-Driving Cars" at

VITA lab at EPFL 3 Apr 07, 2022
[CVPR2021] Invertible Image Signal Processing

Invertible Image Signal Processing This repository includes official codes for "Invertible Image Signal Processing (CVPR2021)". Figure: Our framework

Yazhou XING 281 Dec 31, 2022
TransFGU: A Top-down Approach to Fine-Grained Unsupervised Semantic Segmentation

TransFGU: A Top-down Approach to Fine-Grained Unsupervised Semantic Segmentation Zhaoyun Yin, Pichao Wang, Fan Wang, Xianzhe Xu, Hanling Zhang, Hao Li

DamoCV 25 Dec 16, 2022
Just Go with the Flow: Self-Supervised Scene Flow Estimation

Just Go with the Flow: Self-Supervised Scene Flow Estimation Code release for the paper Just Go with the Flow: Self-Supervised Scene Flow Estimation,

Himangi Mittal 50 Nov 22, 2022
Official code for "Mean Shift for Self-Supervised Learning"

MSF Official code for "Mean Shift for Self-Supervised Learning" Requirements Python = 3.7.6 PyTorch = 1.4 torchvision = 0.5.0 faiss-gpu = 1.6.1 In

UMBC Vision 44 Nov 21, 2022
OpenPCDet Toolbox for LiDAR-based 3D Object Detection.

OpenPCDet OpenPCDet is a clear, simple, self-contained open source project for LiDAR-based 3D object detection. It is also the official code release o

OpenMMLab 3.2k Dec 31, 2022
Evaluation toolkit of the informative tracking benchmark comprising 9 scenarios, 180 diverse videos, and new challenges.

Informative-tracking-benchmark Informative tracking benchmark (ITB) higher diversity. It contains 9 representative scenarios and 180 diverse videos. m

Xin Li 15 Nov 26, 2022
Reference implementation for Structured Prediction with Deep Value Networks

Deep Value Network (DVN) This code is a python reference implementation of DVNs introduced in Deep Value Networks Learn to Evaluate and Iteratively Re

Michael Gygli 55 Feb 02, 2022