music-ai

Deep learning transformer model that generates unique music sequences.

Abstract

In 2017, the Transformer set a new state of the art in natural language processing. Relying solely on attention mechanisms, it outperformed existing solutions based on recurrent and convolutional neural networks¹. However, recurrent neural networks, long short-term memory networks, and gated recurrent neural networks remain dominant in generative music. I aim to bring the Transformer to music generation by training the model to predict the second half of a composition given the first half. A Transformer with 32 attention heads and sinusoidal positional encoding was trained on the Nottingham MIDI dataset for 5000 epochs over 48 hours. Training used stochastic gradient descent with a cross-entropy loss and an exponentially decaying learning rate schedule. Over the first thousand epochs, the model improved noticeably, but the generated sequences lacked structure. By five thousand epochs, the model had clearly learned general musical trends that helped it predict how classical composers write their pieces, and most generated tracks sounded melodic to the human ear. Future applications of this technique include generating tracks for various instruments, rating the quality of existing music tracks, and, if combined with a generative network that maps melodies to a latent space, producing fully original compositions.

¹ Vaswani et al., "Attention Is All You Need", 2017.
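
Two of the components named in the abstract, sinusoidal positional encoding and an exponentially decaying learning rate, can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the code used in this project: the function names, `d_model`, `initial_lr`, and `gamma` are hypothetical, and the encoding follows the formula from the cited paper (sine on even dimensions, cosine on odd ones).

```python
import math


def sinusoidal_positional_encoding(seq_len, d_model):
    """Build a seq_len x d_model table of position encodings.

    Hypothetical helper: even dimensions hold sin(pos / 10000^(i / d_model)),
    the following odd dimension holds the cosine of the same angle.
    """
    if d_model % 2 != 0:
        raise ValueError("d_model must be even")
    table = []
    for pos in range(seq_len):
        row = []
        for i in range(0, d_model, 2):  # i walks over the even dimensions
            angle = pos / (10000 ** (i / d_model))
            row.append(math.sin(angle))  # dimension i
            row.append(math.cos(angle))  # dimension i + 1
        table.append(row)
    return table


def exponential_lr(initial_lr, gamma, epoch):
    """Learning rate after `epoch` epochs, shrinking by a factor gamma per epoch."""
    return initial_lr * gamma ** epoch


# Example with made-up values (not the project's actual hyperparameters).
encoding = sinusoidal_positional_encoding(seq_len=4, d_model=8)
late_lr = exponential_lr(initial_lr=0.1, gamma=0.999, epoch=5000)
```

Each row of `encoding` would be added to the embedding of the note at that position, so the attention layers can tell positions apart without any recurrence. With a `gamma` just below 1, the step size shrinks smoothly over a long run such as the 5000 epochs described above.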

Video

Hardware

Ubuntu

32 GB RAM
Intel Core i3-4170 CPU @3.70 GHz x4 (4 GB RAM)
NVIDIA GeForce GTX 1050 Ti

Owner

xacer
