Code release for paper: The Boombox: Visual Reconstruction from Acoustic Vibrations

Related tags

Deep Learningboombox
Overview

The Boombox: Visual Reconstruction from Acoustic Vibrations

Boyuan Chen, Mia Chiquier, Hod Lipson, Carl Vondrick
Columbia University

Project Website | Video | Paper

Overview

This repo contains the PyTorch implementation for paper "The Boombox: Visual Reconstruction from Acoustic Vibrations".

teaser

Content

Installation

Our code has been tested on Ubuntu 18.04 with CUDA 11.0. Create a python virtual environment and install the dependencies.

virtualenv -p /usr/bin/python3.6 env-boombox
source env-boombox/bin/activate
cd boombox
pip install -r requirements.txt

Data Preparation

Run the following commands to download the dataset (2.0G).

cd boombox
wget https://boombox.cs.columbia.edu/dataset/data.zip
unzip data.zip
rm -rf data.zip

After this step, you should see a folder named as data, and video and audio data are in cube, small_cuboid and large_cuboid subfolders.

About Configs and Logs

Before training and evaluation, we first introduce the configuration and logging structure.

  1. Configs: all the specific parameters used for training and evaluation are indicated as individual config file. Overall, we have two training paradigms: single-shape and multiple-shape.

    For single-shape, we train and evaluate on each shape separately. Their config files are named with their own shape: cube, large_cuboid and small_cuboid. For multiple-shape, we mix all the shapes together and perform training and evaluation while the shape is not known a priori. The config file folder is all.

    Within each config folder, we have config file for depth prediction and image prediction. The last digit in each folder refers to the random seed. For example, if you want to train our model with all the shapes mixed to output a RGB image with random seed 3, you should refer the parameters in:

    configs/all/2d_out_img_3
    
  2. Logs: both the training and evaluation results will be saved in the log folder for each experiment. The last digit in the logs folder indicates the random seed. Inside the logs folder, the structure and contents are:

    \logs_True_False_False_image_conv2d-encoder-decoder_True_{output_representation}_{seed}
        \lightning_logs
            \checkpoints               [saved checkpoint]
            \version_0                 [training stats]
            \version_1                 [testing stats]
        \pred_visualizations           [predicted and ground-truth images]
    

Training

Both training and evaluation are fast. We provide an example bash script for running our experiments in run_audio.sh. Specifically, to train our model on all shapes that outputs RGB image representations with random seed 1 and GPU 0, run the following command:

CUDA_VISIBLE_DEVICES=0 python main.py ./configs/all/2d_out_img_1/config.yaml;

Evaluation

Again, we provide an example bash script for running our experiments in run_audio.sh. Following the above example, to evaluate the trained model, run the following command:

CUDA_VISIBLE_DEVICES=0 python eval.py ./configs/all/2d_out_img_1/config.yaml ./logs_True_False_False_image_conv2d-encoder-decoder_True_pixel_1/lightning_logs/checkpoints;

License

This repository is released under the MIT license. See LICENSE for additional details.

Owner
Boyuan Chen
Ph.D. student in Computer Science at Columbia University Creative Machines Lab.
Boyuan Chen
LSTM Neural Networks for Spectroscopic Studies of Type Ia Supernovae

Package Description The difficulties in acquiring spectroscopic data have been a major challenge for supernova surveys. snlstm is developed to provide

7 Oct 11, 2022
Few-shot NLP benchmark for unified, rigorous eval

FLEX FLEX is a benchmark and framework for unified, rigorous few-shot NLP evaluation. FLEX enables: First-class NLP support Support for meta-training

AI2 85 Dec 03, 2022
An Active Automata Learning Library Written in Python

AALpy An Active Automata Learning Library AALpy is a light-weight active automata learning library written in pure Python. You can start learning auto

TU Graz - SAL Dependable Embedded Systems Lab (DES Lab) 78 Dec 30, 2022
The object detection pipeline is based on Ultralytics YOLOv5

AYOLOv2 The main goal of this repository is to rewrite the object detection pipeline with a better code structure for better portability and adaptabil

153 Dec 22, 2022
Shared Attention for Multi-label Zero-shot Learning

Shared Attention for Multi-label Zero-shot Learning Overview This repository contains the implementation of Shared Attention for Multi-label Zero-shot

dathuynh 26 Dec 14, 2022
Convert dog pictures into various painting styles. Try LimnPet

LimnPet Cartoon stylization service project Try our service » Home page · Team notion · Members 목차 프로젝트 소개 프로젝트 목표 사용한 기술스택과 수행도구 팀원 구현 기능 주요 기능 추가 기능

LiJell 7 Jul 14, 2022
Code for ViTAS_Vision Transformer Architecture Search

Vision Transformer Architecture Search This repository open source the code for ViTAS: Vision Transformer Architecture Search. ViTAS aims to search fo

46 Dec 17, 2022
A learning-based data collection tool for human segmentation

FullBodyFilter A Learning-Based Data Collection Tool For Human Segmentation Contents Documentation Source Code and Scripts Overview of Project Usage O

Robert Jiang 4 Jun 24, 2022
Official implementation of the paper WAV2CLIP: LEARNING ROBUST AUDIO REPRESENTATIONS FROM CLIP

Wav2CLIP 🚧 WIP 🚧 Official implementation of the paper WAV2CLIP: LEARNING ROBUST AUDIO REPRESENTATIONS FROM CLIP 📄 🔗 Ho-Hsiang Wu, Prem Seetharaman

Descript 240 Dec 13, 2022
Bringing sanity to world of messed-up data

Sanitize sanitize is a Python module for making sure various things (e.g. HTML) are safe to use. It was originally written by Mark Pilgrim and is dist

Alireza Savand 63 Oct 26, 2021
Probabilistic Gradient Boosting Machines

PGBM Probabilistic Gradient Boosting Machines (PGBM) is a probabilistic gradient boosting framework in Python based on PyTorch/Numba, developed by Air

Olivier Sprangers 112 Dec 28, 2022
PyTorch Code for NeurIPS 2021 paper Anti-Backdoor Learning: Training Clean Models on Poisoned Data.

Anti-Backdoor Learning PyTorch Code for NeurIPS 2021 paper Anti-Backdoor Learning: Training Clean Models on Poisoned Data. The Anti-Backdoor Learning

Yige-Li 51 Dec 07, 2022
Official pytorch implementation of Active Learning for deep object detection via probabilistic modeling (ICCV 2021)

Active Learning for Deep Object Detection via Probabilistic Modeling This repository is the official PyTorch implementation of Active Learning for Dee

NVIDIA Research Projects 130 Jan 06, 2023
SimpleDepthEstimation - An unified codebase for NN-based monocular depth estimation methods

SimpleDepthEstimation Introduction This is an unified codebase for NN-based monocular depth estimation methods, the framework is based on detectron2 (

8 Dec 13, 2022
Uses OpenCV and Python Code to detect a face on the screen

Simple-Face-Detection This code uses OpenCV and Python Code to detect a face on the screen. This serves as an example program. Important prerequisites

Denis Woolley (CreepyD) 1 Feb 12, 2022
Orbivator AI - To Determine which features of data (measurements) are most important for diagnosing breast cancer and find out if breast cancer occurs or not.

Orbivator_AI Breast Cancer Wisconsin (Diagnostic) GOAL To Determine which features of data (measurements) are most important for diagnosing breast can

anurag kumar singh 1 Jan 02, 2022
This is the official PyTorch implementation of the CVPR 2020 paper "TransMoMo: Invariance-Driven Unsupervised Video Motion Retargeting".

TransMoMo: Invariance-Driven Unsupervised Video Motion Retargeting Project Page | YouTube | Paper This is the official PyTorch implementation of the C

Zhuoqian Yang 330 Dec 11, 2022
Aerial Imagery dataset for fire detection: classification and segmentation (Unmanned Aerial Vehicle (UAV))

Aerial Imagery dataset for fire detection: classification and segmentation using Unmanned Aerial Vehicle (UAV) Title FLAME (Fire Luminosity Airborne-b

79 Jan 06, 2023
CrossNorm and SelfNorm for Generalization under Distribution Shifts (ICCV 2021)

CrossNorm (CN) and SelfNorm (SN) (Accepted at ICCV 2021) This is the official PyTorch implementation of our CNSN paper, in which we propose CrossNorm

100 Dec 28, 2022
This codebase proposes modular light python and pytorch implementations of several LiDAR Odometry methods

pyLiDAR-SLAM This codebase proposes modular light python and pytorch implementations of several LiDAR Odometry methods, which can easily be evaluated

Kitware, Inc. 208 Dec 16, 2022