Wangyou Zhang

Towards Robust Speech Representation Learning for Thousands of Languages

Jul 02, 2024
Via arXiv

URGENT Challenge: Universality, Robustness, and Generalizability For Speech Enhancement

Jun 07, 2024
Via arXiv

Beyond Performance Plateaus: A Comprehensive Study on Scalability in Speech Enhancement

Jun 06, 2024
Via arXiv

SpeechComposer: Unifying Multiple Speech Tasks with Prompt Composition

Jan 31, 2024
Via arXiv

ESPnet-SPK: full pipeline speaker embedding toolkit with reproducible recipes, self-supervised front-ends, and off-the-shelf models

Jan 30, 2024
Via arXiv

Improving Design of Input Condition Invariant Speech Enhancement

Jan 25, 2024
Via arXiv

A Single Speech Enhancement Model Unifying Dereverberation, Denoising, Speaker Counting, Separation, and Extraction

Oct 12, 2023
Via arXiv

Reproducing Whisper-Style Training Using an Open-Source Toolkit and Publicly Available Data

Oct 02, 2023
Via arXiv

Toward Universal Speech Enhancement for Diverse Input Conditions

Sep 29, 2023
Via arXiv

Joint Prediction and Denoising for Large-scale Multilingual Self-supervised Learning

Sep 28, 2023
Via arXiv