Unified Pre-training for Self-Supervised Learning and Supervised Learning for ASR

Overview

UniSpeech

The family of UniSpeech:

UniSpeech (ICML 2021): Unified Pre-training for Self-Supervised Learning and Supervised Learning for ASR

UniSpeech-SAT (ICASSP 2022 Submission): Universal Speech Representation Learning with Speaker Aware Pre-Training

Pre-trained models

We strongly suggest using our UniSpeech-SAT model for speaker related tasks, since it shows very powerful performance on various speaker related benchmarks.

Model Dataset Model
UniSpeech Base 1500 hrs CommonVoice download
UniSpeech Large 1500 hrs CommonVoice download
UniSpeech-SAT Base 960 hrs LibriSpeech download
UniSpeech-SAT Base+ 60k hrs Libri-Light + 10k hrs GigaSpeech + 24k hrs VoxPopuli download
UniSpeech-SAT Large 60k hrs Libri-Light + 10k hrs GigaSpeech + 24k hrs VoxPopuli download

License

This project is licensed under the license found in the LICENSE file in the root directory of this source tree. Portions of the source code are based on the FAIRSEQ project.

Microsoft Open Source Code of Conduct

Contact Information

For help or issues using UniSpeech models, please submit a GitHub issue.

For other communications related to UniSpeech, please contact Yu Wu ([email protected]).

Owner
Microsoft
Open source projects and samples from Microsoft
Microsoft
Densely Connected Search Space for More Flexible Neural Architecture Search (CVPR2020)

DenseNAS The code of the CVPR2020 paper Densely Connected Search Space for More Flexible Neural Architecture Search. Neural architecture search (NAS)

Jamin Fong 291 Nov 18, 2022
Learning to Prompt for Vision-Language Models.

CoOp Paper: Learning to Prompt for Vision-Language Models Authors: Kaiyang Zhou, Jingkang Yang, Chen Change Loy, Ziwei Liu CoOp (Context Optimization)

Kaiyang 679 Jan 04, 2023
The codes and related files to reproduce the results for Image Similarity Challenge Track 2.

The codes and related files to reproduce the results for Image Similarity Challenge Track 2.

Wenhao Wang 89 Jan 02, 2023
Official PyTorch Implementation of paper EAN: Event Adaptive Network for Efficient Action Recognition

Official PyTorch Implementation of paper EAN: Event Adaptive Network for Efficient Action Recognition

TianYuan 27 Nov 07, 2022
This is an unofficial implementation of the paper “Student-Teacher Feature Pyramid Matching for Unsupervised Anomaly Detection”.

This is an unofficial implementation of the paper “Student-Teacher Feature Pyramid Matching for Unsupervised Anomaly Detection”.

haifeng xia 32 Oct 26, 2022
The official homepage of the (outdated) COCO-Stuff 10K dataset.

COCO-Stuff 10K dataset v1.1 (outdated) Holger Caesar, Jasper Uijlings, Vittorio Ferrari Overview Welcome to official homepage of the COCO-Stuff [1] da

Holger Caesar 263 Dec 11, 2022
Reproduces the results of the paper "Finite Basis Physics-Informed Neural Networks (FBPINNs): a scalable domain decomposition approach for solving differential equations".

Finite basis physics-informed neural networks (FBPINNs) This repository reproduces the results of the paper Finite Basis Physics-Informed Neural Netwo

Ben Moseley 65 Dec 28, 2022
Drone-based Joint Density Map Estimation, Localization and Tracking with Space-Time Multi-Scale Attention Network

DroneCrowd Paper Detection, Tracking, and Counting Meets Drones in Crowds: A Benchmark. Introduction This paper proposes a space-time multi-scale atte

VisDrone 98 Nov 16, 2022
Feed forward VQGAN-CLIP model, where the goal is to eliminate the need for optimizing the latent space of VQGAN for each input prompt

Feed forward VQGAN-CLIP model, where the goal is to eliminate the need for optimizing the latent space of VQGAN for each input prompt. This is done by

Mehdi Cherti 135 Dec 30, 2022
This library provides an abstraction to perform Model Versioning using Weight & Biases.

Description This library provides an abstraction to perform Model Versioning using Weight & Biases. Features Version a new trained model Promote a mod

Hector Lopez Almazan 2 Jan 28, 2022
Aligning Latent and Image Spaces to Connect the Unconnectable

About This repo contains the official implementation of the Aligning Latent and Image Spaces to Connect the Unconnectable paper. It is a GAN model whi

Ivan Skorokhodov 203 Jan 03, 2023
Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context Code in both PyTorch and TensorFlow

Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context This repository contains the code in both PyTorch and TensorFlow for our paper

Zhilin Yang 3.3k Jan 06, 2023
Look Closer: Bridging Egocentric and Third-Person Views with Transformers for Robotic Manipulation

Look Closer: Bridging Egocentric and Third-Person Views with Transformers for Robotic Manipulation Official PyTorch implementation for the paper Look

Rishabh Jangir 20 Nov 24, 2022
FaceAPI: AI-powered Face Detection & Rotation Tracking, Face Description & Recognition, Age & Gender & Emotion Prediction for Browser and NodeJS using TensorFlow/JS

FaceAPI AI-powered Face Detection & Rotation Tracking, Face Description & Recognition, Age & Gender & Emotion Prediction for Browser and NodeJS using

Vladimir Mandic 395 Dec 29, 2022
Library to enable Bayesian active learning in your research or labeling work.

Bayesian Active Learning (BaaL) BaaL is an active learning library developed at ElementAI. This repository contains techniques and reusable components

ElementAI 687 Dec 25, 2022
A PyTorch Implementation of "SINE: Scalable Incomplete Network Embedding" (ICDM 2018).

Scalable Incomplete Network Embedding ⠀⠀ A PyTorch implementation of Scalable Incomplete Network Embedding (ICDM 2018). Abstract Attributed network em

Benedek Rozemberczki 69 Sep 22, 2022
Neural Cellular Automata + CLIP

🧠 Text-2-Cellular Automata Using Neural Cellular Automata + OpenAI CLIP (Work in progress) Examples Text Prompt: Cthulu is watching cthulu_is_watchin

Mainak Deb 21 Dec 19, 2022
LoveDA: A Remote Sensing Land-Cover Dataset for Domain Adaptive Semantic Segmentation (NeurIPS2021 Benchmark and Dataset Track)

LoveDA: A Remote Sensing Land-Cover Dataset for Domain Adaptive Semantic Segmentation by Junjue Wang, Zhuo Zheng, Ailong Ma, Xiaoyan Lu, and Yanfei Zh

Kingdrone 174 Dec 22, 2022
A PyTorch Implementation of SphereFace.

SphereFace A PyTorch Implementation of SphereFace. The code can be trained on CASIA-Webface and the best accuracy on LFW is 99.22%. SphereFace: Deep H

carwin 685 Dec 09, 2022
Pytorch implementation of MalConv

MalConv-Pytorch A Pytorch implementation of MalConv Desciprtion This is the implementation of MalConv proposed in Malware Detection by Eating a Whole

Alexander H. Liu 58 Oct 26, 2022