Implementation of "StrengthNet: Deep Learning-based Emotion Strength Assessment for Emotional Speech Synthesis"

Last update: Dec 20, 2022

Overview

StrengthNet

This repository implements the paper "StrengthNet: Deep Learning-based Emotion Strength Assessment for Emotional Speech Synthesis" (https://arxiv.org/abs/2110.03156).

Dependencies

Ubuntu 18.04.5 LTS

GPU: Quadro RTX 6000
Driver version: 450.80.02
CUDA version: 11.0

Python 3.5

tensorflow-gpu 2.0.0b1 (cudnn=7.6.0)
scipy
pandas
matplotlib
librosa

Environment set-up

For example,

conda create -n strengthnet python=3.5
conda activate strengthnet
pip install -r requirements.txt
conda install cudnn=7.6.0
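After the environment is created, a quick check can confirm that the listed packages are importable. The snippet below is a minimal sketch, not part of the repository: the helper name missing_packages is hypothetical, and the package list is copied from the dependency list above (tensorflow-gpu is imported as tensorflow).

```python
import importlib.util
import sys

# Import names taken from the dependency list above. The helper itself is a
# hypothetical convenience and is not shipped with the repository.
REQUIRED = ["tensorflow", "scipy", "pandas", "matplotlib", "librosa"]


def missing_packages(names):
    """Return the top-level names that cannot be found in the active environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages were found.")
```

Running it inside the activated strengthnet environment should report no missing packages if the installation steps succeeded.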

Usage

Run python utils.py to convert the .wav files to .h5 files.
Run python train.py to train the CNN-BLSTM-based StrengthNet.

Evaluating new samples

Put the waveforms you wish to evaluate in a folder. For example, / /
Run python test.py --rootdir / /

This script evaluates all the .wav files in that folder and writes the results to StrengthNet_result_raw.txt in the same folder.

By default, the output/strengthnet.h5 pretrained model is used.
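The folder-level evaluation described above can be pictured roughly as follows. This is only an illustrative sketch, not the code in test.py: the function score_folder, the predict callable that stands in for the pretrained model, the tab-separated line format, and the choice to scan only the top level of the folder are all assumptions.

```python
from pathlib import Path


def score_folder(rootdir, predict, result_name="StrengthNet_result_raw.txt"):
    """Score every .wav file directly inside rootdir and write one line per file.

    `predict` is a hypothetical callable mapping a file path to a number; in
    practice it would wrap the pretrained model. The "name<TAB>score" line
    format is an assumption made for this sketch.
    """
    root = Path(rootdir)
    # Collect .wav files (case-insensitive suffix) in a stable order.
    wav_paths = sorted(p for p in root.iterdir() if p.suffix.lower() == ".wav")
    lines = []
    for path in wav_paths:
        score = float(predict(path))
        lines.append("{}\t{:.3f}".format(path.name, score))
    # Write the result file next to the evaluated samples.
    out_path = root / result_name
    out_path.write_text("\n".join(lines) + ("\n" if lines else ""))
    return out_path
```

Passing the scoring function in as an argument keeps the file handling separate from the model, so the same loop can be exercised with a dummy predictor before loading output/strengthnet.h5.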

Citation

If you find this work useful in your research, please consider citing:

@misc{liu2021strengthnet,
      title={StrengthNet: Deep Learning-based Emotion Strength Assessment for Emotional Speech Synthesis}, 
      author={Rui Liu and Berrak Sisman and Haizhou Li},
      year={2021},
      eprint={2110.03156},
      archivePrefix={arXiv},
      primaryClass={cs.SD}
}

Resources

The ESD corpus is released by the HLT lab, NUS, Singapore.

The strength scores for the English samples of the ESD corpus are available here.

Acknowledgements:

MOSNet: https://github.com/lochenchou/MOSNet

Relative Attributes: Relative Attributes

License

This work is released under the MIT License (see the LICENSE file for details).

Owner

RuiLiu
