Longbiao Wang

An Initial Investigation of Language Adaptation for TTS Systems under Low-resource Scenarios

Jun 13, 2024

ICMC-ASR: The ICASSP 2024 In-Car Multi-Channel Automatic Speech Recognition Challenge

Jan 07, 2024

A Refining Underlying Information Framework for Monaural Speech Enhancement

Dec 24, 2023

ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations

Dec 22, 2023

Multi-Level Knowledge Distillation for Speech Emotion Recognition in Noisy Conditions

Dec 21, 2023

High-Fidelity Speech Synthesis with Minimal Supervision: All Using Diffusion Models

Sep 27, 2023

Learning Speech Representation From Contrastive Token-Acoustic Pretraining

Sep 06, 2023

Minimally-Supervised Speech Synthesis with Conditional Diffusion Model and Language Model: A Comparative Study of Semantic Coding

Jul 28, 2023

Rethinking the visual cues in audio-visual speaker extraction

Jun 05, 2023

speech and noise dual-stream spectrogram refine network with speech distortion loss for robust speech recognition

May 30, 2023