Jianwu Dang

An Initial Investigation of Language Adaptation for TTS Systems under Low-resource Scenarios

Jun 13, 2024
Via arXiv

A Refining Underlying Information Framework for Monaural Speech Enhancement

Dec 24, 2023
Via arXiv

ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations

Dec 22, 2023
Via arXiv

Ahpatron: A New Budgeted Online Kernel Learning Machine with Tighter Mistake Bound

Dec 12, 2023
Via arXiv

High-Fidelity Speech Synthesis with Minimal Supervision: All Using Diffusion Models

Sep 27, 2023
Via arXiv

Learning Speech Representation From Contrastive Token-Acoustic Pretraining

Sep 06, 2023
Via arXiv

Minimally-Supervised Speech Synthesis with Conditional Diffusion Model and Language Model: A Comparative Study of Semantic Coding

Jul 28, 2023
Via arXiv

Rethinking the visual cues in audio-visual speaker extraction

Jun 05, 2023
Via arXiv

Speech and Noise Dual-Stream Spectrogram Refine Network with Speech Distortion Loss for Robust Speech Recognition

May 30, 2023
Via arXiv

Locate and Beamform: Two-dimensional Locating All-neural Beamformer for Multi-channel Speech Separation

May 18, 2023
Via arXiv