Pytorch implementation of "MOSNet: Deep Learning based Objective Assessment for Voice Conversion"

Overview

MOSNet

pytorch implementation of "MOSNet: Deep Learning based Objective Assessment for Voice Conversion" https://arxiv.org/abs/1904.08352

Dependency

Linux Ubuntu 20.04

  • GPU: GeForce RTX 2080 Ti
  • CUDA version: 10.0

Python 3.7

  • pytorch==1.4.0
  • numpy==1.19.5
  • tqdm
  • scipy==1.6.2
  • pandas==1.2.4
  • matplotlib
  • librosa==0.6.0

Usage

Reproducing results in the paper

  1. cd ./data and run bash download.sh to download the VCC2018 evaluation results and submitted speech. (downsample the submitted speech might take some times)
  2. Run python mos_results_preprocess.py to prepare the evaluation results. (Run python bootsrap_estimation.py to do the bootstrap experiment for intrinsic MOS calculation)
  3. Run python utils.py to extract .wav to .h5
  4. Run python train.py -c config.json to train a CNN-BLSTM version of MOSNet.
  5. Run python test.py -c config.json --epoch BEST_EPOCH --is_fp16 to test a CNN-BLSTM version of MOSNet.

Note

Thanks to the authors of the paper MOSNet and the code is based on their tensorflow implementation https://github.com/lochenchou/MOSNet. However, my workstation will show OOM errors even with BATCH_SIZE=4 under tensorflow2.0 and RTX 2080 Ti. Therefore I implement the code with pytorch. Currently only 7700MiB memory is used when BATCH_SIZE=64. If you find any problem with my code, you can write a issue.

Citation

If you find this work useful in your research, please consider citing:

@inproceedings{mosnet,
  author={Lo, Chen-Chou and Fu, Szu-Wei and Huang, Wen-Chin and Wang, Xin and Yamagishi, Junichi and Tsao, Yu and Wang, Hsin-Min},
  title={MOSNet: Deep Learning based Objective Assessment for Voice Conversion},
  year=2019,
  booktitle={Proc. Interspeech 2019},
}

License

This work is released under MIT License (see LICENSE file for details).

VCC2018 Database & Results

The model is trained on the large listening evaluation results released by the Voice Conversion Challenge 2018.
The listening test results can be downloaded from here
The databases and results (submitted speech) can be downloaded from here

Sequential GCN for Active Learning

Sequential GCN for Active Learning Please cite if using the code: Link to paper. Requirements: python 3.6+ torch 1.0+ pip libraries: tqdm, sklearn, sc

45 Dec 26, 2022
Deep Distributed Control of Port-Hamiltonian Systems

De(e)pendable Distributed Control of Port-Hamiltonian Systems (DeepDisCoPH) This repository is associated to the paper [1] and it contains: The full p

Dependable Control and Decision group - EPFL 3 Aug 17, 2022
Progressive Domain Adaptation for Object Detection

Progressive Domain Adaptation for Object Detection Implementation of our paper Progressive Domain Adaptation for Object Detection, based on pytorch-fa

96 Nov 25, 2022
Companion repository to the paper accepted at the 4th ACM SIGSPATIAL International Workshop on Advances in Resilient and Intelligent Cities

Transfer learning approach to bicycle sharing systems station location planning using OpenStreetMap Companion repository to the paper accepted at the

Politechnika Wrocławska - repozytorium dla informatyków 4 Oct 24, 2022
Official implementation of "Motif-based Graph Self-Supervised Learning forMolecular Property Prediction"

Motif-based Graph Self-Supervised Learning for Molecular Property Prediction Official Pytorch implementation of NeurIPS'21 paper "Motif-based Graph Se

zaixi 71 Dec 20, 2022
🏆 The 1st Place Submission to AICity Challenge 2021 Natural Language-Based Vehicle Retrieval Track (Alibaba-UTS submission)

AI City 2021: Connecting Language and Vision for Natural Language-Based Vehicle Retrieval 🏆 The 1st Place Submission to AICity Challenge 2021 Natural

82 Dec 29, 2022
PyTorch - Python + Nim

Master Release Pytorch - Py + Nim A Nim frontend for pytorch, aiming to be mostly auto-generated and internally using ATen. Because Nim compiles to C+

Giovanni Petrantoni 425 Dec 22, 2022
Official repository for "On Improving Adversarial Transferability of Vision Transformers" (2021)

Improving-Adversarial-Transferability-of-Vision-Transformers Muzammal Naseer, Kanchana Ranasinghe, Salman Khan, Fahad Khan, Fatih Porikli arxiv link A

Muzammal Naseer 47 Dec 02, 2022
Implementation of the paper All Labels Are Not Created Equal: Enhancing Semi-supervision via Label Grouping and Co-training

SemCo The official pytorch implementation of the paper All Labels Are Not Created Equal: Enhancing Semi-supervision via Label Grouping and Co-training

42 Nov 14, 2022
Raster Vision is an open source Python framework for building computer vision models on satellite, aerial, and other large imagery sets

Raster Vision is an open source Python framework for building computer vision models on satellite, aerial, and other large imagery sets (including obl

Azavea 1.7k Dec 22, 2022
App customer segmentation cohort rfm clustering

CUSTOMER SEGMENTATION COHORT RFM CLUSTERING TỔNG QUAN VỀ HỆ THỐNG DỮ LIỆU Nên chuyển qua theme màu dark thì sẽ nhìn đẹp hơn https://customer-segmentat

hieulmsc 3 Dec 18, 2021
Simple STAC Catalogs discovery tool.

STAC Catalog Discovery Simple STAC discovery tool. Just paste the STAC Catalog link and press Enter. Details STAC Discovery tool enables discovering d

Mykola Kozyr 21 Oct 19, 2022
A new test set for ImageNet

ImageNetV2 The ImageNetV2 dataset contains new test data for the ImageNet benchmark. This repository provides associated code for assembling and worki

186 Dec 18, 2022
Using Machine Learning to Test Causal Hypotheses in Conjoint Analysis

Readme File for "Using Machine Learning to Test Causal Hypotheses in Conjoint Analysis" by Ham, Imai, and Janson. (2022) All scripts were written and

0 Jan 27, 2022
Sub-tomogram-Detection - Deep learning based model for Cyro ET Sub-tomogram-Detection

Deep learning based model for Cyro ET Sub-tomogram-Detection High degree of stru

Siddhant Kumar 2 Feb 04, 2022
Constrained Logistic Regression - How to apply specific constraints to logistic regression's coefficients

Constrained Logistic Regression Sample implementation of constructing a logistic regression with given ranges on each of the feature's coefficients (v

1 Dec 29, 2021
Memory-Augmented Model Predictive Control

Memory-Augmented Model Predictive Control This repository hosts the source code for the journal article "Composing MPC with LQR and Neural Networks fo

Fangyu Wu 1 Jun 19, 2022
Paper list of log-based anomaly detection

Paper list of log-based anomaly detection

Weibin Meng 411 Dec 05, 2022
For IBM Quantum Challenge Africa 2021, 9 September (07:00 UTC) - 20 September (23:00 UTC).

IBM Quantum Challenge Africa 2021 To ensure Africa is able to apply quantum computing to solve problems relevant to the continent, the IBM Research La

Qiskit Community 48 Dec 25, 2022
Deep motion transfer

animation-with-keypoint-mask Paper The right most square is the final result. Softmax mask (circles): \ Heatmap mask: \ conda env create -f environmen

9 Nov 01, 2022