Users can transcribe their favorite piano recordings to MIDI files after installation

Overview

Piano transcription inference

This toolbox is a piano transcription inference package that can be easily installed. Users can transcribe their favorite piano recordings to MIDI files after installation. To see how the piano transcription system is trained, please visit: https://github.com/bytedance/piano_transcription.

Demos

Here is a demo of our piano transcription system: https://www.youtube.com/watch?v=5U-WL0QvKCg

Installation

The piano transcription system is developed with Python 3.7 and PyTorch 1.4.0 (Should work with other versions, but not fully tested). Install PyTorch following https://pytorch.org/. Users should have ffmpeg installed to transcribe mp3 files.

pip install piano_transcription_inference

Installation is finished!

Usage

Want to try it out but don't want to install anything? We have set up a Google Colab.

python3 example.py --audio_path='resources/cut_liszt.mp3' --output_midi_path='cut_liszt.mid' --cuda

This will download the pretrained model from https://zenodo.org/record/4034264.

Users could also execute the inference code line by line:

from piano_transcription_inference import PianoTranscription, sample_rate, load_audio

# Load audio
(audio, _) = load_audio(audio_path, sr=sample_rate, mono=True)

# Transcriptor
transcriptor = PianoTranscription(device='cuda', checkpoint_path=None)  # device: 'cuda' | 'cpu'

# Transcribe and write out to MIDI file
transcribed_dict = transcriptor.transcribe(audio, 'cut_liszt.mid')

Visualization of piano transcription

Demo. Lang Lang: Franz Liszt - Love Dream (Liebestraum) [audio] [transcribed_midi]

FAQs

This repo support Linux and Mac. Windows has not been tested.

If users met "audio.exceptions.NoBackendError", then check if ffmpeg is installed.

If users met the problem of "Killed". This is caused by there are not sufficient memory.

Applications

We have built a large-scale classical piano MIDI dataset https://github.com/bytedance/GiantMIDI-Piano using our piano transcription system.

Cite

[1] High-resolution Piano Transcription with Pedals by Regressing Onsets and Offsets Times, [To appear], 2020

A Music Player Bot for Discord Servers

A Music Player Bot for Discord Servers

Halil Acar 2 Oct 25, 2021
Carnatic Notes Predictor for audio files

Carnatic Notes Predictor for audio files Link for live application: https://share.streamlit.io/pradeepak1/carnatic-notes-predictor-for-audio-files/mai

1 Nov 06, 2021
Code for "Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose"

Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose We provide PyTorch implementations for our arxiv paper "Audio-dr

Ran Yi 497 Jan 09, 2023
A voice control utility for Spotify

Spotify Voice Control A voice control utility for Spotify · Report Bug · Request

Shoubhit Dash 27 Jan 01, 2023
Algorithmic and AI MIDI Drums Generator Implementation

Algorithmic and AI MIDI Drums Generator Implementation

Tegridy Code 8 Dec 30, 2022
nicfit 425 Jan 01, 2023
Audio augmentations library for PyTorch for audio in the time-domain

Audio augmentations library for PyTorch for audio in the time-domain, with support for stochastic data augmentations as used often in self-supervised / contrastive learning.

Janne 166 Jan 08, 2023
Datamoshing with FFmpeg

ffmosher Datamoshing with FFmpeg Drag and drop video onto mosh.bat to create a datamoshed video. To datamosh an image, please ensure the file is in a

18 Sep 11, 2022
Sequencer: Deep LSTM for Image Classification

Sequencer: Deep LSTM for Image Classification Created by Yuki Tatsunami Masato Taki This repository contains implementation for Sequencer. Abstract In

Yuki Tatsunami 111 Dec 16, 2022
A python script that can play .mp3 URLs upon the ringing or motion detection of a Ring doorbell. The sound plays through Sonos speakers.

Ring x Sonos A python script that plays .mp3 files whenever a doorbell is rung or a doorbell detects motion. Features Music! Authors @braden Running T

braden 0 Nov 12, 2021
:speech_balloon: SpeechPy - A Library for Speech Processing and Recognition: http://speechpy.readthedocs.io/en/latest/

SpeechPy Official Project Documentation Table of Contents Documentation Which Python versions are supported Citation How to Install? Local Installatio

Amirsina Torfi 870 Dec 27, 2022
Code for csig audio deepfake detection

FMFCC Audio Deepfake Detection Solution This repo provides an solution for the 多媒体伪造取证大赛. Our solution achieve the 1st in the Audio Deepfake Detection

BokingChen 9 Jun 04, 2022
commonfate 📦commonfate 📦 - Common Fate Model and Transform.

Common Fate Transform and Model for Python This package is a python implementation of the Common Fate Transform and Model to be used for audio source

Fabian-Robert Stöter 18 Jan 08, 2022
Pyroomacoustics is a package for audio signal processing for indoor applications. It was developed as a fast prototyping platform for beamforming algorithms in indoor scenarios.

Summary Pyroomacoustics is a software package aimed at the rapid development and testing of audio array processing algorithms. The content of the pack

Audiovisual Communications Laboratory 1k Jan 09, 2023
The official repository for Audio ALBERT

AALBERT Here is also the official repository of AALBERT, which is Pytorch lightning reimplementation of the paper, Audio ALBERT: A Lite Bert for Self-

pohan 55 Dec 11, 2022
Audio library for modelling loudness

Loudness Loudness is a C++ library with Python bindings for modelling perceived loudness. The library consists of processing modules which can be casc

Dominic Ward 33 Oct 02, 2022
An Amazon Music client for Linux (unpretentious)

Amusiz An Amazon Music client for Linux (unpretentious) ↗️ Install You can install Amusiz in multiple ways, choose your favorite. 🚀 AppImage Here you

Mirko Brombin 25 Nov 08, 2022
Converting UGG files from Rode Wireless Go II transmitters (unsompressed recordings) to WAV format

Rode_WirelessGoII_UGG2wav Converting UGG files from Rode Wireless Go II transmitters (uncompressed recordings) to WAV format Story I backuped the .ugg

Ján Mazanec 31 Dec 22, 2022
Music player and music library manager for Linux, Windows, and macOS

Ex Falso / Quod Libet - A Music Library / Editor / Player Quod Libet is a music management program. It provides several different ways to view your au

Quod Libet 1.2k Jan 07, 2023
This bot can stream audio or video files and urls in telegram voice chats

Voice Chat Streamer This bot can stream audio or video files and urls in telegram voice chats :) 🎯 Follow me and star this repo for more telegram bot

WiskeyWorm 4 Oct 09, 2022