Learning the Beauty in Songs: Neural Singing Voice Beautifier; ACL 2022 (Main conference); Official code

Last update: Dec 30, 2022

Overview

Learning the Beauty in Songs: Neural Singing Voice Beautifier

Jinglin Liu, Chengxi Li, Yi Ren, Zhiying Zhu, Zhou Zhao

Zhejiang University

ACL 2022 Main conference

Project Page

🚧 ⛏️ 🛠️ 👷

This repository is the official PyTorch implementation of our ACL 2022 paper. For now, we have released the code for the SADTW algorithm described in the paper. The full code and data are expected to be released around the ACL 2022 conference (before June 2022). Please star us and stay tuned!

|--modules
    |--voice_conversion
        |--dtw
            |--enhance_sadtw.py  (Our algorithm)
|--tasks
    |--singing
        |--pitch_alignment_task.py  (Usage example)

🚀 News:

Feb. 24, 2022: Our new work, NeuralSVB, was accepted by ACL 2022. Demo Page.
Dec. 01, 2021: Our recent work, DiffSinger, was accepted by AAAI 2022.
Sep. 29, 2021: Our recent work, PortaSpeech, was accepted by NeurIPS 2021.
May 06, 2021: We submitted DiffSinger to arXiv.

Abstract

We are interested in a novel task, singing voice beautifying (SVB). Given the singing voice of an amateur singer, SVB aims to improve the intonation and vocal tone of the voice, while keeping the content and vocal timbre. Current automatic pitch correction techniques are immature, and most of them are restricted to intonation but ignore the overall aesthetic quality. Hence, we introduce Neural Singing Voice Beautifier (NSVB), the first generative model to solve the SVB task, which adopts a conditional variational autoencoder as the backbone and learns the latent representations of vocal tone. In NSVB, we propose a novel time-warping approach for pitch correction: Shape-Aware Dynamic Time Warping (SADTW), which ameliorates the robustness of existing time-warping approaches, to synchronize the amateur recording with the template pitch curve. Furthermore, we propose a latent-mapping algorithm in the latent space to convert the amateur vocal tone to the professional one. Extensive experiments on both Chinese and English songs demonstrate the effectiveness of our methods in terms of both objective and subjective metrics.
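To make the time-warping step concrete, here is a minimal sketch of classic dynamic time warping (DTW) applied to two pitch curves. It is not SADTW: the abstract describes SADTW as a more robust variant, and the released implementation is in `modules/voice_conversion/dtw/enhance_sadtw.py`. The function names `dtw_path` and `warp_template_to_source`, the absolute-difference local cost, and the use of plain lists of pitch values (for example MIDI note numbers per frame) are all assumptions made for illustration.

```python
from typing import List, Tuple


def dtw_path(source: List[float], template: List[float]) -> Tuple[float, List[Tuple[int, int]]]:
    """Textbook DTW (not SADTW) with an absolute-difference local cost.

    Returns the total alignment cost and the warping path as
    (source_index, template_index) pairs, ordered from start to end.
    """
    if not source or not template:
        raise ValueError("both pitch curves must be non-empty")
    n, m = len(source), len(template)
    inf = float("inf")
    # acc[i][j] = minimal cost of aligning source[:i] with template[:j]
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(source[i - 1] - template[j - 1])
            acc[i][j] = cost + min(acc[i - 1][j - 1], acc[i - 1][j], acc[i][j - 1])

    # Backtrack from the last cell to the first, always following the cheapest predecessor.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        _, i, j = min(
            (acc[i - 1][j - 1], i - 1, j - 1),
            (acc[i - 1][j], i - 1, j),
            (acc[i][j - 1], i, j - 1),
        )
        path.append((i - 1, j - 1))
    path.reverse()
    return acc[n][m], path


def warp_template_to_source(source: List[float], template: List[float]) -> List[float]:
    """Resample the template pitch curve onto the source (amateur) timeline.

    Each source frame receives the mean of the template frames aligned to it.
    """
    _, path = dtw_path(source, template)
    buckets: List[List[float]] = [[] for _ in source]
    for s_idx, t_idx in path:
        buckets[s_idx].append(template[t_idx])
    return [sum(b) / len(b) for b in buckets]
```

Calling `warp_template_to_source(amateur_f0, template_f0)` returns a curve with one value per amateur frame, which is the sense in which the template is synchronized to the amateur recording. Plain DTW compares raw frame values only, which is the kind of limitation the abstract says SADTW is designed to improve on.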

Issues

Before raising an issue, please check our README and the existing issues for possible solutions.
We will try to handle your problem promptly, but we cannot guarantee a satisfying solution.
Please be friendly.


Comments

Problem with proper data loading

Hi, I'd like to run your model myself, but I cannot find a proper way to load the dataset from the .mp3 files you provided. Could you share the dataloader you used, or give some hints on how to process the .mp3 files into a valid dataset that can be used in your usage examples? I'd be very grateful!

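opened by pstryczke 9

About NSVB

After listening to the demo, I have some questions. 1. If this is used in practice to beautify singing, inference requires the pitch curve of the original singer, right? 2. Although the test samples are not in the training set, GT Professional and GT Amateur were recorded by the same person. At inference time, GT Professional cannot be the user themselves. Has generalization in this setting been tested?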

opened by suzhenghang 0

Hi, request for datasets and source code.

This work is outstanding and we are interested in it. Are there any plans to make the dataset and associated pretrained models public in the near future? Thank you.

opened by hertz-pj 0

Releases

pre-release (May 27, 2022)

Owner

Jinglin Liu
