TalkNet 2: Non-Autoregressive Depth-Wise Separable Convolutional Model for Speech Synthesis with Explicit Pitch and Duration Prediction.

Last update: Dec 17, 2022

Overview

TalkNet 2 [WIP]

TalkNet 2: Non-Autoregressive Depth-Wise Separable Convolutional Model for Speech Synthesis with Explicit Pitch and Duration Prediction.

Official TalkNet 2 repo here

Citation

@misc{beliaev2021talknet,
      title={TalkNet 2: Non-Autoregressive Depth-Wise Separable Convolutional Model Stanislav Beliaev, Boris Ginsburgfor Speech Synthesis with Explicit Pitch and Duration Prediction}, 
      author={Stanislav Beliaev and Boris Ginsburg},
      year={2021},
      eprint={2104.08189},
      archivePrefix={arXiv},
      primaryClass={eess.AS}
}

Owner

Rishikesh (ऋषिकेश)

GitHub Repository

LVI-SAM: Tightly-coupled Lidar-Visual-Inertial Odometry via Smoothing and Mapping

LVI-SAM This repository contains code for a lidar-visual-inertial odometry and mapping system, which combines the advantages of LIO-SAM and Vins-Mono

1.1k Dec 27, 2022

[CVPR'21] Multi-Modal Fusion Transformer for End-to-End Autonomous Driving

TransFuser This repository contains the code for the CVPR 2021 paper Multi-Modal Fusion Transformer for End-to-End Autonomous Driving. If you find our

695 Jan 05, 2023

Pytorch implementation of Rosca, Mihaela, et al. "Variational Approaches for Auto-Encoding Generative Adversarial Networks."

alpha-GAN Unofficial pytorch implementation of Rosca, Mihaela, et al. "Variational Approaches for Auto-Encoding Generative Adversarial Networks." arXi

78 Dec 08, 2022

Machine Learning Model deployment for Container (TensorFlow Serving)

try_tf_serving ├───dataset │ ├───testing │ │ ├───paper │ │ ├───rock │ │ └───scissors │ └───training │ ├───paper │ ├───rock

5 Jan 07, 2022

A Transformer-Based Siamese Network for Change Detection

ChangeFormer: A Transformer-Based Siamese Network for Change Detection (Under review at IGARSS-2022) Wele Gedara Chaminda Bandara, Vishal M. Patel Her

214 Dec 29, 2022

Back to the Feature: Learning Robust Camera Localization from Pixels to Pose (CVPR 2021)

Back to the Feature with PixLoc We introduce PixLoc, a neural network for end-to-end learning of camera localization from an image and a 3D model via

610 Jan 05, 2023

No Code AI/ML platform

NoCodeAIML No Code AI/ML platform - Community Edition Video credits: Uday Kiran Typical No Code AI/ML Platform will have features like drag and drop,

5 Jan 28, 2022

FMA: A Dataset For Music Analysis

FMA: A Dataset For Music Analysis Michaël Defferrard, Kirell Benzi, Pierre Vandergheynst, Xavier Bresson. International Society for Music Information

1.8k Dec 29, 2022

Deep Unsupervised 3D SfM Face Reconstruction Based on Massive Landmark Bundle Adjustment.

(ACMMM 2021 Oral) SfM Face Reconstruction Based on Massive Landmark Bundle Adjustment This repository shows two tasks: Face landmark detection and Fac

51 Dec 13, 2022

Code for our paper at ECCV 2020: Post-Training Piecewise Linear Quantization for Deep Neural Networks

PWLQ Updates 2020/07/16 - We are working on getting permission from our institution to release our source code. We will release it once we are granted

54 Dec 15, 2022

Unified MultiWOZ evaluation scripts for the context-to-response task.

MultiWOZ Context-to-Response Evaluation Standardized and easy to use Inform, Success, BLEU ~ See the paper ~ Easy-to-use scripts for standardized eval

38 Dec 13, 2022

[ ICCV 2021 Oral ] Our method can estimate camera poses and neural radiance fields jointly when the cameras are initialized at random poses in complex scenarios (outside-in scenes, even with less texture or intense noise )

GNeRF This repository contains official code for the ICCV 2021 paper: GNeRF: GAN-based Neural Radiance Field without Posed Camera. This implementation

191 Dec 26, 2022

TalkNet 2: Non-Autoregressive Depth-Wise Separable Convolutional Model for Speech Synthesis with Explicit Pitch and Duration Prediction.

Related tags

Overview

TalkNet 2 [WIP]

Citation

Owner

Rishikesh (ऋषिकेश)

LVI-SAM: Tightly-coupled Lidar-Visual-Inertial Odometry via Smoothing and Mapping

[CVPR'21] Multi-Modal Fusion Transformer for End-to-End Autonomous Driving

Pytorch implementation of Rosca, Mihaela, et al. "Variational Approaches for Auto-Encoding Generative Adversarial Networks."

Machine Learning Model deployment for Container (TensorFlow Serving)

A Transformer-Based Siamese Network for Change Detection

Back to the Feature: Learning Robust Camera Localization from Pixels to Pose (CVPR 2021)

No Code AI/ML platform

FMA: A Dataset For Music Analysis

Deep Unsupervised 3D SfM Face Reconstruction Based on Massive Landmark Bundle Adjustment.

Code for our paper at ECCV 2020: Post-Training Piecewise Linear Quantization for Deep Neural Networks

Unified MultiWOZ evaluation scripts for the context-to-response task.

[ ICCV 2021 Oral ] Our method can estimate camera poses and neural radiance fields jointly when the cameras are initialized at random poses in complex scenarios (outside-in scenes, even with less texture or intense noise )

Resilient projection-based consensus actor-critic (RPBCAC) algorithm

Speech Recognition using DeepSpeech2.

Prososdy Morph: A python library for manipulating pitch and duration in an algorithmic way, for resynthesizing speech.

Localizing Visual Sounds the Hard Way

A pytorch-based real-time segmentation model for autonomous driving

A concise but complete implementation of CLIP with various experimental improvements from recent papers

Reinforcement-learning - Repository of the class assignment questions for the course on reinforcement learning

Neural style in TensorFlow! 🎨