PyTorch implementation of the ECCV 2020 paper "Foley Music: Learning to Generate Music from Videos"

Last update: Nov 03, 2022

Overview

Foley Music: Learning to Generate Music from Videos

This repo holds the code for the framework presented at ECCV 2020.

Foley Music: Learning to Generate Music from Videos
Chuang Gan, Deng Huang, Peihao Chen, Joshua B. Tenenbaum, and Antonio Torralba

paper

Usage Guide

Prerequisites

The training and testing code is implemented in PyTorch for ease of use.

PyTorch 1.4

Other minor Python modules can be installed by running

pip install -r requirements.txt
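Since the prerequisites name PyTorch 1.4 specifically, it can help to confirm which release is installed before training. The snippet below is only a sketch: `matches_release` is a hypothetical helper, not part of this repo, and it compares just the major and minor numbers of the version string.

```python
import re


def matches_release(version, major=1, minor=4):
    """Return True if a version string such as '1.4.0+cu100' is release major.minor."""
    # Hypothetical helper: only the leading "major.minor" part is inspected.
    found = re.match(r"(\d+)\.(\d+)", version)
    if found is None:
        return False
    return (int(found.group(1)), int(found.group(2))) == (major, minor)


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed")
    else:
        ok = matches_release(torch.__version__)
        print("PyTorch", torch.__version__, "OK" if ok else "differs from 1.4")
```

Whether other PyTorch releases also work is not stated here, so a mismatch is a warning sign rather than a definite failure.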

Data Preparation

Download Datasets

The extracted pose and MIDI data used for training and audio generation can be downloaded here. Unzip the archive into the ./data folder.

The original datasets (including videos) are available as follows:

URMP: can be downloaded here
MUSIC: can be downloaded here
AtinPiano: proposed in "At Your Fingertips: Automatic Piano Fingering Detection"; can be downloaded here

Training

For URMP

CUDA_VISIBLE_DEVICES=6 python train.py -c config/URMP/violin.conf -e exps/urmp-vn

For AtinPiano

CUDA_VISIBLE_DEVICES=6 python train.py -c config/AtinPiano.conf -e exps/atinpiano

For MUSIC

CUDA_VISIBLE_DEVICES=6 python train.py -c config/MUSIC/accordion.conf -e exps/music-accordion
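The three commands above differ only in the config file and the experiment directory. CUDA_VISIBLE_DEVICES=6 selects the GPU with index 6, so on a machine with fewer GPUs that value needs to change. As a rough sketch, the runs could be launched in sequence from a small Python script; `build_train_command` and `run_all` are hypothetical names, not part of this repo, and the script only uses the `-c` and `-e` flags shown above.

```python
import os
import subprocess

# (config path, experiment directory) pairs copied from the commands above.
TRAIN_RUNS = [
    ("config/URMP/violin.conf", "exps/urmp-vn"),
    ("config/AtinPiano.conf", "exps/atinpiano"),
    ("config/MUSIC/accordion.conf", "exps/music-accordion"),
]


def build_train_command(config, exp_dir):
    """Return the argument list for one training run."""
    return ["python", "train.py", "-c", config, "-e", exp_dir]


def run_all(gpu_id="0", dry_run=True):
    """Print (dry_run=True) or execute each training command on the chosen GPU."""
    # Copy the current environment and restrict the visible GPUs.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=gpu_id)
    for config, exp_dir in TRAIN_RUNS:
        cmd = build_train_command(config, exp_dir)
        if dry_run:
            print("CUDA_VISIBLE_DEVICES=" + gpu_id, " ".join(cmd))
        else:
            # check=True stops at the first run that exits with an error.
            subprocess.run(cmd, env=env, check=True)


if __name__ == "__main__":
    run_all(gpu_id="0", dry_run=True)
```

With `dry_run=True` the script only prints the commands, which makes it easy to check the paths before committing GPU time.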

Generating MIDI, sounds and videos

For URMP

VIDEO_PATH=/path/to/video
INSTRUMENT_NAME='Violin'
python test_URMP.py exps/urmp-vn/checkpoint.pth.tar -o exps/urmp-vn/generate -v "$VIDEO_PATH" -i "$INSTRUMENT_NAME"

For AtinPiano

VIDEO_PATH=/path/to/video
INSTRUMENT_NAME='Acoustic Grand Piano'
python test_AtinPiano_MUSIC.py exps/atinpiano/checkpoint.pth.tar -o exps/atinpiano/generation -v "$VIDEO_PATH" -i "$INSTRUMENT_NAME"

For MUSIC

VIDEO_PATH=/path/to/video
INSTRUMENT_NAME='Accordion'
python test_AtinPiano_MUSIC.py exps/music-accordion/checkpoint.pth.tar -o exps/music-accordion/generation -v "$VIDEO_PATH" -i "$INSTRUMENT_NAME"

Notes:

Valid instrument names ($INSTRUMENT_NAME) can be found here.
If you do not have the video file, or you only want to generate MIDI and audio, add the -oa flag to skip video generation.
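Instrument names such as 'Acoustic Grand Piano' contain spaces, so they must reach the script as a single argument. The sketch below builds a generation command as an argument list and renders it with shell-safe quoting. `build_generate_command` and `to_shell` are hypothetical helpers, not part of this repo. The sketch also assumes that `-v` may be omitted when `-oa` is given, which the notes above imply but do not state outright.

```python
import shlex


def build_generate_command(script, checkpoint, out_dir, instrument, video_path=None):
    """Return the argument list for one generation run."""
    cmd = ["python", script, checkpoint, "-o", out_dir, "-i", instrument]
    if video_path is None:
        # Assumption: with no video available, request MIDI and audio only.
        cmd.append("-oa")
    else:
        cmd += ["-v", video_path]
    return cmd


def to_shell(cmd):
    """Render an argument list as one shell line, quoting parts that need it."""
    return " ".join(shlex.quote(part) for part in cmd)


if __name__ == "__main__":
    cmd = build_generate_command(
        "test_AtinPiano_MUSIC.py",
        "exps/atinpiano/checkpoint.pth.tar",
        "exps/atinpiano/generation",
        "Acoustic Grand Piano",
    )
    print(to_shell(cmd))
    # prints: python test_AtinPiano_MUSIC.py exps/atinpiano/checkpoint.pth.tar -o exps/atinpiano/generation -i 'Acoustic Grand Piano' -oa
```

Passing the list directly to `subprocess.run` avoids quoting issues altogether; `to_shell` is only useful for pasting the command into a terminal.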

Other Info

Citation

Please cite the following paper if you find our work useful for your research.

@inproceedings{FoleyMusic2020,
  author    = {Chuang Gan and
               Deng Huang and
               Peihao Chen and
               Joshua B. Tenenbaum and
               Antonio Torralba},
  title     = {Foley Music: Learning to Generate Music from Videos},
  booktitle = {ECCV},
  year      = {2020},
}

Owner

Chuang Gan
