Huadai Liu

MEDIC: Zero-shot Music Editing with Disentangled Inversion Control

Jul 18, 2024
Via arXiv

AudioLCM: Text-to-Audio Generation with Latent Consistency Models

Jun 01, 2024
Via arXiv

AV-TranSpeech: Audio-Visual Robust Speech-to-Speech Translation

May 24, 2023
Via arXiv

ViT-TTS: Visual Text-to-Speech with Scalable Diffusion Transformer

May 22, 2023
Via arXiv

Wav2SQL: Direct Generalizable Speech-To-SQL Parsing

May 21, 2023
Via arXiv

RMSSinger: Realistic-Music-Score based Singing Voice Synthesis

May 18, 2023
Via arXiv

ProDiff: Progressive Fast Diffusion Model For High-Quality Text-to-Speech

Jul 13, 2022
Via arXiv

TranSpeech: Speech-to-Speech Translation With Bilateral Perturbation

May 25, 2022
Via arXiv