The implementation of "Optimizing Shoulder to Shoulder: A Coordinated Sub-Band Fusion Model for Real-Time Full-Band Speech Enhancement"

Last update: Dec 02, 2022

Related tags

Deep Learning SF-Net

Overview

SF-Net for fullband SE

This is the repository for the manuscript "Optimizing Shoulder to Shoulder: A Coordinated Sub-Band Fusion Model for Real-Time Full-Band Speech Enhancement", which was submitted to Interspeech 2022. Some audio samples are provided here, and the code for GCRN-full, DS-Net-full, and CTS-Net-full, together with the network configuration of SF-Net, is released.

Abstract: Due to the high computational complexity to model more frequency bands, it is still intractable to conduct real-time full-band speech enhancement based on deep neural networks. Recent studies typically utilize the compressed perceptually motivated features with relatively low frequency resolution to filter the full-band spectrum by one-stage networks, leading to limited speech quality improvements. In this paper, we propose a coordinated sub-band fusion network for full-band speech enhancement, which aims to recover the low- (0-8 kHz), middle- (8-16 kHz), and high-band (16-24 kHz) in a step-wise manner. Specifically, a dual-stream network is first pretrained to recover the low-band complex spectrum, and another two sub-networks are designed as the middle- and high-band noise suppressors in the magnitude-only domain. To fully capitalize on the information intercommunication, we employ a sub-band interaction module to provide external knowledge guidance across different frequency bands. Extensive experiments show that the proposed method yields consistent performance advantages over state-of-the-art full-band baselines.
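The three-band split described in the abstract can be illustrated with a minimal sketch that maps the band edges (0-8, 8-16, 16-24 kHz) to STFT bin ranges. This is not the repository's code: the function names `subband_bin_ranges` and `split_frame` are hypothetical, the 48 kHz sample rate is inferred from the 24 kHz upper band edge, and the FFT size of 960 is an assumed value chosen only for illustration.

```python
from typing import Dict, Sequence, Tuple


def subband_bin_ranges(
    sample_rate: int = 48000,   # assumed: 24 kHz upper edge implies 48 kHz audio
    n_fft: int = 960,           # assumed FFT size, not taken from the paper
    edges_hz: Sequence[int] = (0, 8000, 16000, 24000),
) -> Dict[str, Tuple[int, int]]:
    """Map band edges in Hz to half-open STFT bin ranges [start, stop)."""
    n_bins = n_fft // 2 + 1             # one-sided spectrum, includes Nyquist
    hz_per_bin = sample_rate / n_fft    # frequency resolution of each bin
    names = ("low", "middle", "high")
    ranges: Dict[str, Tuple[int, int]] = {}
    for name, lo, hi in zip(names, edges_hz[:-1], edges_hz[1:]):
        start = int(round(lo / hz_per_bin))
        stop = int(round(hi / hz_per_bin))
        ranges[name] = (start, stop)
    # Extend the last band so that it also covers the Nyquist bin.
    ranges["high"] = (ranges["high"][0], n_bins)
    return ranges


def split_frame(frame: Sequence, ranges: Dict[str, Tuple[int, int]]) -> Dict[str, Sequence]:
    """Slice one spectral frame (a sequence of bins) into the named sub-bands."""
    return {name: frame[start:stop] for name, (start, stop) in ranges.items()}


if __name__ == "__main__":
    ranges = subband_bin_ranges()
    print(ranges)
    frame = list(range(481))  # stand-in for one frame of 481 frequency bins
    parts = split_frame(frame, ranges)
    print({name: len(band) for name, band in parts.items()})
```

Under these assumptions each bin spans 50 Hz, so the low and middle bands each cover 160 bins and the high band covers the remaining 161, including the Nyquist bin. In the model described above, the low band would carry the complex spectrum while the middle and high bands are processed in the magnitude domain; the sketch only shows the index bookkeeping.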

Demo page of audio samples

System flowchart of SF-Net

Results:

Ablation study

Comparison with SOTA

Visualization of spectrograms

VB dataset

DNS blind set

Owner

Guochen Yu

NeurIPS 2021, "Fine Samples for Learning with Noisy Labels"

Implementation of the Paper: "Parameterized Hypercomplex Graph Neural Networks for Graph Classification" by Tuan Le, Marco Bertolini, Frank Noé and Djork-Arné Clevert

Generative Exploration and Exploitation - This is an improved version of GENE.

Python parser for DTED data.

Implementation of ReSeg using PyTorch

Code and description for my BSc Project, September 2021

Official source code of paper 'IterMVS: Iterative Probability Estimation for Efficient Multi-View Stereo'

"Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking" (CVPR 2021); "Masksembles for Uncertainty Estimation" (CVPR 2021)

Materials for upcoming beginner-friendly PyTorch course (work in progress).

code for CVPR paper Zero-shot Instance Segmentation

Implementation for our ICCV 2021 paper: Dual-Camera Super-Resolution with Aligned Attention Modules

Tutorials, assignments, and competitions for MIT Deep Learning related courses.

Some junk toys I have written since I started learning python3...

Calculates JMA (Japan Meteorological Agency) seismic intensity (shindo) scale from acceleration data recorded in NumPy array

A PyTorch implementation of the Transformer model in "Attention is All You Need".

Depth-Aware Video Frame Interpolation (CVPR 2019)

Dense matching library based on PyTorch

Solution to the Weather4cast 2021 challenge

RealTime Emotion Recognizer for Machine Learning Study Jam's demo

Regulatory Instruments for Fair Personalized Pricing.