Efficient neural networks for analog audio effect modeling

Overview

micro-TCN

Efficient neural networks for audio effect modeling.

| Paper | Demo | Plugin |

Setup

Install the requirements.

python3 -m venv env/
source env/bin/activate
pip install -r requirements.txt

Then install auraloss.

pip install git+https://github.com/csteinmetz1/auraloss

Pre-trained models

You can download the pre-trained models here. Then unzip as below.

mkdir lightning_logs
mv models.zip lightning_logs/
cd lightning_logs/
unzip models.zip 

Use the compy.py script in order to process audio files. Below is an example of how to run the TCN-300-C pre-trained model on GPU. This will process all the files in the audio/ directory with the limit mode engaged and a peak reduction of 42.

python comp.py -i audio/ --limit 1 --peak_red 42 --gpu

If you want to hear the output of a different model, you can pass the --model_id flag. To view the available pre-trained models (once you have downloaded them) run the following.

python comp.py --list_models

Found 13 models in ./lightning_logs/bulk
1-uTCN-300__causal__4-10-13__fraction-0.01-bs32
10-LSTM-32__1-32__fraction-1.0-bs32
11-uTCN-300__causal__3-60-5__fraction-1.0-bs32
13-uTCN-300__noncausal__30-2-15__fraction-1.0-bs32
14-uTCN-324-16__noncausal__10-2-15__fraction-1.0-bs32
2-uTCN-100__causal__4-10-5__fraction-1.0-bs32
3-uTCN-300__causal__4-10-13__fraction-1.0-bs32
4-uTCN-1000__causal__5-10-5__fraction-1.0-bs32
5-uTCN-100__noncausal__4-10-5__fraction-1.0-bs32
6-uTCN-300__noncausal__4-10-13__fraction-1.0-bs32
7-uTCN-1000__noncausal__5-10-5__fraction-1.0-bs32
8-TCN-300__noncausal__10-2-15__fraction-1.0-bs32
9-uTCN-300__causal__4-10-13__fraction-0.1-bs32

We also provide versions of the pre-trained models that have been converted to TorchScript for use in C++ here.

Evaluation

You will first need to download the SignalTrain dataset (~20GB) as well as the pre-trained models above. With this, you can then run the same evaluation pipeline used for reporting the metrics in the paper. If you would like to do this on GPU, perform the following command.

python test.py \
--root_dir /path/to/SignalTrain_LA2A_Dataset_1.1 \
--half \
--preload \
--eval_subset test \
--save_dir test_audio \

In this case, not only will the metrics be printed to terminal, we will also save out all of the processed audio from the test set to disk in the test_audio/ directory. If you would like to run the tests across the entire dataset you can specific a different string after the --eval_subset flag, as either train, val, or full.

Training

If would like to re-train the models in the paper, you can run the training script which will train all the models one by one.

python train.py \ 
--root_dir /path/to/SignalTrain_LA2A_Dataset_1.1 \
--precision 16 \
--preload \
--gpus 1 \

Plugin

We provide plugin builds (AV/VST3) for macOS. You can also build the plugin for your platform. This will require the traced models, which you can download here. First, you will need download and extract libtorch. Check the PyTorch site to find the correct version.

wget https://download.pytorch.org/libtorch/cpu/libtorch-macos-1.7.1.zip
unzip libtorch-macos-1.7.1.zip

Now move this into the realtime/ directory .

mv libtorch realtime/

We provide a ncomp.jucer file and a CMakeLists.txt that was created using FRUT. You will likely need to compile and run FRUT on this .jucer file in order to create a valid CMakeLists.txt. To do so, follow the instructions on compiling FRUT. Then convert the .jucer file. You will have to update the paths here to reflect the location of FRUT.

cd realtime/plugin/
../../FRUT/prefix/FRUT/bin/Jucer2CMake reprojucer ncomp.jucer ../../FRUT/prefix/FRUT/cmake/Reprojucer.cmake

Now you can finally build the plugin using CMake with the build.sh script. BUT, you will have to first update the path to libtorch in the build.sh script.

rm -rf build
mkdir build
cd build
cmake .. -G Xcode -DCMAKE_PREFIX_PATH=/absolute/path/to/libtorch ..
cmake --build .

Citation

If you use any of this code in your work, please consider citing us.

    @article{steinmetz2021efficient,
            title={Efficient Neural Networks for Real-time Analog Audio Effect Modeling},
            author={Steinmetz, Christian J. and Reiss, Joshua D.},
            journal={arXiv:2102.06200},
            year={2021}}
Owner
Christian Steinmetz
Building tools for musicians and audio engineers (often with machine learning). PhD Student at Queen Mary University of London.
Christian Steinmetz
Video Swin Transformer - PyTorch

Video-Swin-Transformer-Pytorch This repo is a simple usage of the official implementation "Video Swin Transformer". Introduction Video Swin Transforme

Haofan Wang 116 Dec 20, 2022
Non-Metric Space Library (NMSLIB): An efficient similarity search library and a toolkit for evaluation of k-NN methods for generic non-metric spaces.

Non-Metric Space Library (NMSLIB) Important Notes NMSLIB is generic but fast, see the results of ANN benchmarks. A standalone implementation of our fa

2.9k Jan 04, 2023
We present a framework for training multi-modal deep learning models on unlabelled video data by forcing the network to learn invariances to transformations applied to both the audio and video streams.

Multi-Modal Self-Supervision using GDT and StiCa This is an official pytorch implementation of papers: Multi-modal Self-Supervision from Generalized D

Facebook Research 42 Dec 09, 2022
Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models.

WECHSEL Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models. arXiv: https://arx

Institute of Computational Perception 45 Dec 29, 2022
Self-Supervised Monocular 3D Face Reconstruction by Occlusion-Aware Multi-view Geometry Consistency[ECCV 2020]

Self-Supervised Monocular 3D Face Reconstruction by Occlusion-Aware Multi-view Geometry Consistency(ECCV 2020) This is an official python implementati

304 Jan 03, 2023
CSPML (crystal structure prediction with machine learning-based element substitution)

CSPML (crystal structure prediction with machine learning-based element substitution) CSPML is a unique methodology for the crystal structure predicti

8 Dec 20, 2022
Time Dependent DFT in Tamm-Dancoff Approximation

Density Function Theory Program - kspy-tddft(tda) This is an implementation of Time-Dependent Density Functional Theory(TDDFT) using the Tamm-Dancoff

Peter Borthwick 2 Nov 17, 2022
Rainbow is all you need! A step-by-step tutorial from DQN to Rainbow

Do you want a RL agent nicely moving on Atari? Rainbow is all you need! This is a step-by-step tutorial from DQN to Rainbow. Every chapter contains bo

Jinwoo Park (Curt) 1.4k Dec 29, 2022
Robust Instance Segmentation through Reasoning about Multi-Object Occlusion [CVPR 2021]

Robust Instance Segmentation through Reasoning about Multi-Object Occlusion [CVPR 2021] Abstract Analyzing complex scenes with DNN is a challenging ta

Irene Yuan 24 Jun 27, 2022
Computer vision - fun segmentation experience using classic and deep tools :)

Computer_Vision_Segmentation_Fun Segmentation of Images and Video. Tools: pytorch Models: Classic model - GrabCut Deep model - Deeplabv3_resnet101 Flo

Mor Ventura 1 Dec 18, 2021
Pyramid addon for OpenAPI3 validation of requests and responses.

Validate Pyramid views against an OpenAPI 3.0 document Peace of Mind The reason this package exists is to give you peace of mind when providing a REST

Pylons Project 79 Dec 30, 2022
SuRE Evaluation: A Supplementary Material

SuRE Evaluation: A Supplementary Material This repository contains supplementary material regarding the evaluations presented in the paper Visual Expl

NYU Visualization Lab 0 Dec 14, 2021
Classification models 1D Zoo - Keras and TF.Keras

Classification models 1D Zoo - Keras and TF.Keras This repository contains 1D variants of popular CNN models for classification like ResNets, DenseNet

Roman Solovyev 12 Jan 06, 2023
Repository for benchmarking graph neural networks

Benchmarking Graph Neural Networks Updates Nov 2, 2020 Project based on DGL 0.4.2. See the relevant dependencies defined in the environment yml files

NTU Graph Deep Learning Lab 2k Jan 03, 2023
Bayesian algorithm execution (BAX)

Bayesian Algorithm Execution (BAX) Code for the paper: Bayesian Algorithm Execution: Estimating Computable Properties of Black-box Functions Using Mut

Willie Neiswanger 38 Dec 08, 2022
Using Machine Learning to Create High-Res FineĀ Art

BIG.art: Using Machine Learning to Create High-Res Fine Art How to use GLIDE and BSRGAN to create ultra-high-resolution paintings with fine details By

Robert A. Gonsalves 13 Nov 27, 2022
Single-Stage 6D Object Pose Estimation, CVPR 2020

Overview This repository contains the code for the paper Single-Stage 6D Object Pose Estimation. Yinlin Hu, Pascal Fua, Wei Wang and Mathieu Salzmann.

CVLAB @ EPFL 89 Dec 26, 2022
The sixth place winning solution (6/220) in 2021 Gaofen Challenge.

SwinTransformer + OBBDet The sixth place winning solution (6/220) in the track of Fine-grained Object Recognition in High-Resolution Optical Images, 2

ming71 46 Dec 02, 2022
Using Machine Learning to Test Causal Hypotheses in Conjoint Analysis

Readme File for "Using Machine Learning to Test Causal Hypotheses in Conjoint Analysis" by Ham, Imai, and Janson. (2022) All scripts were written and

0 Jan 27, 2022
2021:"Bridging Global Context Interactions for High-Fidelity Image Completion"

TFill arXiv | Project This repository implements the training, testing and editing tools for "Bridging Global Context Interactions for High-Fidelity I

Chuanxia Zheng 111 Jan 08, 2023