A two-stage U-Net for high-fidelity denoising of historical recordings

Overview

A two-stage U-Net for high-fidelity denoising of historical recordings

Official repository of the paper (not submitted yet):

E. Moliner and V. Välimäki,, "A two-stage U-Net for high-fidelity denosing of historical recordinds", in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Singapore, May, 2022

Abstract

Enhancing the sound quality of historical music recordings is a long-standing problem. This paper presents a novel denoising method based on a fully-convolutional deep neural network. A two-stage U-Net model architecture is designed to model and suppress the degradations with high fidelity. The method processes the time-frequency representation of audio, and is trained using realistic noisy data to jointly remove hiss, clicks, thumps, and other common additive disturbances from old analog discs. The proposed model outperforms previous methods in both objective and subjective metrics. The results of a formal blind listening test show that the method can denoise real gramophone recordings with an excellent quality. This study shows the importance of realistic training data and the power of deep learning in audio restoration.

Schema represention

Listen to our audio samples

Requirements

You will need at least python 3.7 and CUDA 10.1 if you want to use GPU. See requirements.txt for the required package versions.

To install the environment through anaconda, follow the instructions:

conda env update -f environment.yml
conda activate historical_denoiser

Denoising Recordings

Run the following commands to clone the repository and install the pretrained weights of the two-stage U-Net model:

git clone https://github.com/eloimoliner/denoising-historical-recordings.git
cd denoising-historical-recordings
wget https://github.com/eloimoliner/denoising-historical-recordings/releases/download/v0.0/checkpoint.zip
unzip checkpoint.zip /experiments/trained_model/

If the environment is installed correctly, you can denoise an audio file by running:

bash inference.sh "file name"

A ".wav" file with the denoised version, as well as the residual noise and the original signal in "mono", will be generated in the same directory as the input file.

Training

TODO

Comments
  • Will it work in Windows without CUDA?

    Will it work in Windows without CUDA?

    Hello, The readme says: "You will need at least python 3.7 and CUDA 10.1 if you want to use GPU."

    Unfortunately, my first attempt to run it in Windows without CUDA-supporting VGA failed. There is really no separate environment file for CPU-only? Is it possible to make it work without massive changes to the code?

    opened by vitacon 15
  • installation without conda

    installation without conda

    Hi,

    could you leave some hints about how to install this without conda? Your readme appears to be very much specified to this one case. Also it seems that you develop under linux so you use bash to execute. Maybe here a hint for win- users would be cool too.

    I am just trying to get this to run under windows and so far had no success. I will update if I get further. All the best!

    opened by GitHubGeniusOverlord 9
  • strange tensorflow version in requirements.txt

    strange tensorflow version in requirements.txt

    Hi,

    when running python -m pip install tensorflow==2.3.0 as indicated in your requirements file, I get

    ERROR: Could not find a version that satisfies the requirement tensorflow==2.3.0 (from versions: 2.5.0rc0, 2.5.0rc1, 2.5.0rc2, 2.5.0rc3, 2.5.0, 2.5.1, 2.5.2, 2.6.0rc0, 2.6.0rc1, 2.6.0rc2, 2.6.0, 2.6.1, 2.6.2, 2.7.0rc0, 2.7.0rc1, 2.7.0, 2.8.0rc0) ERROR: No matching distribution found for tensorflow==2.3.0

    It seems this version isn't even supported by pip anymore. Upgrade to 2.5.0?

    The same is true for scipy==1.4.1. Not sure about which version to take there.

    opened by GitHubGeniusOverlord 3
  • Update inference.sh

    Update inference.sh

    Small change to allow spaces in file names. Bash expands the variable $1 correctly even if it is in double quotes, python receives a single argument and not (if there are spaces) multiple arguments.

    opened by JorenSix 1
  • How to start training for denoising?

    How to start training for denoising?

    If I would like to do a denoising task, where I've clean signals (in the "clean" folder) and noisy signals (in the "noise" folder).

    opened by listener17 1
Releases(v0.0)
Owner
Eloi Moliner Juanpere
Doctoral candidate on audio signal processing at Aalto university.
Eloi Moliner Juanpere
Piotr - IoT firmware emulation instrumentation for training and research

Piotr: Pythonic IoT exploitation and Research Introduction to Piotr Piotr is an emulation helper for Qemu that provides a convenient way to create, sh

Damien Cauquil 51 Nov 09, 2022
DANet for Tabular data classification/ regression.

Deep Abstract Networks A PyTorch code implemented for the submission DANets: Deep Abstract Networks for Tabular Data Classification and Regression. Do

Ronnie Rocket 55 Sep 14, 2022
Patches desktop steam to look like the new steamdeck ui.

steam_deck_ui_patch The Deck UI patch will patch the regular desktop steam to look like the brand new SteamDeck UI. This patch tool currently works on

The_IT_Dude 3 Aug 29, 2022
Multi-Target Adversarial Frameworks for Domain Adaptation in Semantic Segmentation

Multi-Target Adversarial Frameworks for Domain Adaptation in Semantic Segmentation Paper Multi-Target Adversarial Frameworks for Domain Adaptation in

Valeo.ai 20 Jun 21, 2022
Supplemental Code for "ImpressionNet :A Multi view Approach to Predict Socio Facial Impressions"

Supplemental Code for "ImpressionNet :A Multi view Approach to Predict Socio Facial Impressions" Environment requirement This code is based on Python

Rohan Kumar Gupta 1 Dec 19, 2021
Optimizing DR with hard negatives and achieving SOTA first-stage retrieval performance on TREC DL Track (SIGIR 2021 Full Paper).

Optimizing Dense Retrieval Model Training with Hard Negatives Jingtao Zhan, Jiaxin Mao, Yiqun Liu, Jiafeng Guo, Min Zhang, Shaoping Ma 🔥 News 2021-10

Jingtao Zhan 99 Dec 27, 2022
A pytorch implementation of the CVPR2021 paper "VSPW: A Large-scale Dataset for Video Scene Parsing in the Wild"

VSPW: A Large-scale Dataset for Video Scene Parsing in the Wild A pytorch implementation of the CVPR2021 paper "VSPW: A Large-scale Dataset for Video

45 Nov 29, 2022
[CVPR2021] DoDNet: Learning to segment multi-organ and tumors from multiple partially labeled datasets

DoDNet This repo holds the pytorch implementation of DoDNet: DoDNet: Learning to segment multi-organ and tumors from multiple partially labeled datase

116 Dec 12, 2022
Prometheus exporter for Cisco Unified Computing System (UCS) Manager

prometheus-ucs-exporter Overview Use metrics from the UCS API to export relevant metrics to Prometheus This repository is a fork of Drew Stinnett's or

Marshall Wace 6 Nov 07, 2022
ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and Intra-modal Knowledge Integration

ROSITA News & Updates (24/08/2021) Release the demo to perform fine-grained semantic alignments using the pretrained ROSITA model. (15/08/2021) Releas

Vision and Language Group@ MIL 48 Dec 23, 2022
Fast and accurate optimisation for registration with little learningconvexadam

convexAdam Learn2Reg 2021 Submission Fast and accurate optimisation for registration with little learning Excellent results on Learn2Reg 2021 challeng

17 Dec 06, 2022
Music Generation using Neural Networks Streamlit App

Music_Gen_Streamlit "Music Generation using Neural Networks" Streamlit App TO DO: Make a run_app.sh Introduction [~5 min] (Sohaib) Team Member names/i

Muhammad Sohaib Arshid 6 Aug 09, 2022
basic tutorial on pytorch

Quick Tutorial on PyTorch PyTorch Basics Linear Regression Logistic Regression Artificial Neural Networks Convolutional Neural Networks Recurrent Neur

7 Sep 15, 2022
Weak-supervised Visual Geo-localization via Attention-based Knowledge Distillation

Weak-supervised Visual Geo-localization via Attention-based Knowledge Distillation Introduction WAKD is a PyTorch implementation for our ICPR-2022 pap

2 Oct 20, 2022
VoxHRNet - Whole Brain Segmentation with Full Volume Neural Network

VoxHRNet This is the official implementation of the following paper: Whole Brain Segmentation with Full Volume Neural Network Yeshu Li, Jonathan Cui,

Microsoft 12 Nov 24, 2022
IAST: Instance Adaptive Self-training for Unsupervised Domain Adaptation (ECCV 2020)

This repo is the official implementation of our paper "Instance Adaptive Self-training for Unsupervised Domain Adaptation". The purpose of this repo is to better communicate with you and respond to y

CVSM Group - email: <a href=[email protected]"> 84 Dec 12, 2022
Apply AnimeGAN-v2 across frames of a video clip

title emoji colorFrom colorTo sdk app_file pinned AnimeGAN-v2 For Videos 🔥 blue red gradio app.py false AnimeGAN-v2 For Videos Apply AnimeGAN-v2 acro

Nathan Raw 36 Oct 18, 2022
Code for "Learning Skeletal Graph Neural Networks for Hard 3D Pose Estimation" ICCV'21

Skeletal-GNN Code for "Learning Skeletal Graph Neural Networks for Hard 3D Pose Estimation" ICCV'21 Various deep learning techniques have been propose

37 Oct 23, 2022
Small-bets - Ergodic Experiment With Python

Ergodic Experiment Based on this video. Run this experiment with this command: p

Michael Brant 3 Jan 11, 2022
Code for ICML 2021 paper: How could Neural Networks understand Programs?

OSCAR This repository contains the source code of our ICML 2021 paper How could Neural Networks understand Programs?. Environment Run following comman

Dinglan Peng 115 Dec 17, 2022