Implicit neural differentiable FM synthesizer


The purpose of this project is to emulate arbitrary sounds with FM synthesis, where the parameters of the FM synth are learned by optimization.

This idea was conceived and implemented during the Neural Audio Synthesis Hackathon 2021. Thanks to Ben Hayes for organizing the workshop and to Mia Chiquier for pointing me towards SIREN!

Architecture

Please refer to FMNet and Envelope in synth.py for the actual architectural details.

This model takes as input a list of time steps t_1, t_2, ..., sampled at some sample rate, and outputs an audio signal at the same sample rate.

Similar to SIREN, it feeds the input time step values through sinusoidal activation functions initialized with specific weights. In this work we initialize weights to 127 musical pitches from C#-1 to G9. We call this layer the "carrier".
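As a rough illustration of the pitch initialization described above, the 127 starting frequencies could be computed as follows. This is a minimal sketch that assumes standard MIDI note numbering (C-1 is note 0, so C#-1 to G9 are notes 1 to 127) and equal temperament with A4 at 440 Hz. The names `midi_to_hz` and `pitch_frequencies` are hypothetical and are not taken from synth.py.

```python
def midi_to_hz(note, a4_hz=440.0):
    """Convert a MIDI note number to Hz, assuming equal temperament."""
    return a4_hz * 2.0 ** ((note - 69) / 12.0)


# Assumption: C-1 is MIDI note 0, so C#-1..G9 are notes 1..127 (127 pitches).
pitch_frequencies = [midi_to_hz(note) for note in range(1, 128)]

print(len(pitch_frequencies))
print(round(pitch_frequencies[0], 3), round(pitch_frequencies[-1], 2))
```

Under these assumptions the lowest initial frequency is below the audible range and the highest is around 12.5 kHz.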

We only use a single sinusoidal layer, but we modulate the frequencies of this layer with the summed output of a separate cosine layer with 127 cosine nodes, also initialized from the musical pitches C#-1 to G9. We refer to this layer as the "modulator".

Each carrier and modulator node has both a frequency and an amplitude component. We learn a global phase in the range (0, 2*pi) that is shared among all carrier and modulator frequencies. This is effectively a global "bias" term to the sinusoidal activation functions.

The goal of this project is to provide a differentiable emulation of a simple FM synthesizer, so we take a softmax over each of the carrier and modulator layers' amplitudes. The amplitudes within a layer are therefore non-negative and sum to one.
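The carrier and modulator layers described above can be sketched for a single time step as follows. This is a simplified illustration, not the code in synth.py. It assumes that the summed modulator output is added to the argument of each carrier sine (as in classic FM synthesis), that the softmax is applied separately per layer, and it leaves out the per-node envelopes. The function names `softmax` and `fm_sample` and all parameter names are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fm_sample(t, carrier_freqs, carrier_logits, mod_freqs, mod_logits, phase=0.0):
    """Return one output sample at time t (in seconds)."""
    carrier_amps = softmax(carrier_logits)
    mod_amps = softmax(mod_logits)
    # Modulator: summed output of the cosine nodes, sharing the global phase.
    modulation = sum(
        amp * math.cos(2.0 * math.pi * freq * t + phase)
        for amp, freq in zip(mod_amps, mod_freqs)
    )
    # Carrier: sine nodes whose argument is shifted by the modulator output
    # (assumption: modulation enters additively, as in classic FM).
    return sum(
        amp * math.sin(2.0 * math.pi * freq * t + modulation + phase)
        for amp, freq in zip(carrier_amps, carrier_freqs)
    )


# Render 100 samples at 8 kHz with two carriers and one modulator.
sample_rate = 8000
samples = [
    fm_sample(n / sample_rate, [220.0, 440.0], [0.0, 0.0], [110.0], [0.0])
    for n in range(100)
]
```

Because the carrier amplitudes come from a softmax, the output of this sketch always stays within [-1, 1].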

In addition to carrier and modulator amplitudes we also learn separate amplitude envelope curves for each carrier and modulator node. The envelope is modeled by the bell curve function 1 / sqrt((1 + t * slope) + (slope + offset)).

Optimization

This model learns an implicit neural representation for a target audio signal. This means that we optimize the network once for every target signal.

We use the L2 loss between the generated audio signal and the target audio signal as the main loss function.

We also provide optional additional loss terms that maximize the "spikiness" of the carrier and modulator amplitude vectors, in order to make the network pick a single carrier frequency and a single modulator frequency. These terms are optional since the network sometimes learns more interesting sounds when several carriers and modulators are active.

We use the Adam optimizer with a learning rate of 0.01.
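To make the optimization setup concrete, here is a toy sketch that fits a single amplitude parameter to a target signal using an L2 (mean squared error) loss and a hand-written Adam update with a learning rate of 0.01. It is deliberately much smaller than the real model: the actual project optimizes all frequencies, amplitudes, envelopes and the phase, presumably through an autodiff framework. The function `fit_amplitude` and the analytic gradient are illustrative assumptions.

```python
import math


def fit_amplitude(target, basis, lr=0.01, steps=500,
                  beta1=0.9, beta2=0.999, eps=1e-8):
    """Fit `a` so that a * basis matches target under an L2 loss, using Adam."""
    a, m, v = 0.0, 0.0, 0.0
    n = len(target)
    for step in range(1, steps + 1):
        # Gradient of mean((a * basis - target) ** 2) with respect to a.
        grad = sum(2.0 * (a * b - y) * b for b, y in zip(basis, target)) / n
        # Standard Adam moment estimates with bias correction.
        m = beta1 * m + (1.0 - beta1) * grad
        v = beta2 * v + (1.0 - beta2) * grad * grad
        m_hat = m / (1.0 - beta1 ** step)
        v_hat = v / (1.0 - beta2 ** step)
        a -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return a


# Toy target: a 440 Hz sine at amplitude 0.5, sampled at 8 kHz for 0.1 s.
sample_rate = 8000
basis = [math.sin(2.0 * math.pi * 440.0 * n / sample_rate) for n in range(800)]
target = [0.5 * b for b in basis]
learned_amplitude = fit_amplitude(target, basis)
```

With these settings the learned amplitude should end up close to the target value of 0.5.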

Inference

Since this is an implicit neural representation, we can generate outputs at arbitrary sample rates and resolutions. This allows for seamless time stretching and upscaling.
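The sketch below illustrates why a representation that is a function of continuous time can be rendered at any sample rate: the same function is simply evaluated on a denser or sparser grid of time steps. The `signal` function is a hypothetical stand-in for a fitted model, and `render` is not the project's inference API.

```python
import math


def signal(t):
    """Stand-in for a fitted model: any function of continuous time (seconds)."""
    return math.sin(2.0 * math.pi * 440.0 * t)


def render(duration, sample_rate):
    """Evaluate the signal on a time grid with the given sample rate."""
    num_samples = int(round(duration * sample_rate))
    return [signal(n / sample_rate) for n in range(num_samples)]


# The same 10 ms of audio rendered at two different sample rates.
low_rate = render(0.01, 8000)
high_rate = render(0.01, 48000)
```

No resampling filter is involved here: every sixth sample of the 48 kHz rendering lands on the same time step as a sample of the 8 kHz rendering.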

The inference code also supports "stereo detuning" to create musically interesting sounds.

Owner
Andreas Jansson