Picture for Yi Ren

Yi Ren

Arizona State University

GenerTTS: Pronunciation Disentanglement for Timbre and Style Generalization in Cross-Lingual Text-to-Speech

Add code
Jun 27, 2023
Figure 1 for GenerTTS: Pronunciation Disentanglement for Timbre and Style Generalization in Cross-Lingual Text-to-Speech
Figure 2 for GenerTTS: Pronunciation Disentanglement for Timbre and Style Generalization in Cross-Lingual Text-to-Speech
Figure 3 for GenerTTS: Pronunciation Disentanglement for Timbre and Style Generalization in Cross-Lingual Text-to-Speech
Figure 4 for GenerTTS: Pronunciation Disentanglement for Timbre and Style Generalization in Cross-Lingual Text-to-Speech
Viaarxiv icon

Ada-TTA: Towards Adaptive High-Quality Text-to-Talking Avatar Synthesis

Add code
Jun 06, 2023
Figure 1 for Ada-TTA: Towards Adaptive High-Quality Text-to-Talking Avatar Synthesis
Figure 2 for Ada-TTA: Towards Adaptive High-Quality Text-to-Talking Avatar Synthesis
Figure 3 for Ada-TTA: Towards Adaptive High-Quality Text-to-Talking Avatar Synthesis
Figure 4 for Ada-TTA: Towards Adaptive High-Quality Text-to-Talking Avatar Synthesis
Viaarxiv icon

Mega-TTS: Zero-Shot Text-to-Speech at Scale with Intrinsic Inductive Bias

Add code
Jun 06, 2023
Viaarxiv icon

Detector Guidance for Multi-Object Text-to-Image Generation

Add code
Jun 04, 2023
Figure 1 for Detector Guidance for Multi-Object Text-to-Image Generation
Figure 2 for Detector Guidance for Multi-Object Text-to-Image Generation
Figure 3 for Detector Guidance for Multi-Object Text-to-Image Generation
Figure 4 for Detector Guidance for Multi-Object Text-to-Image Generation
Viaarxiv icon

StyleS2ST: Zero-shot Style Transfer for Direct Speech-to-speech Translation

Add code
Jun 01, 2023
Viaarxiv icon

Make-An-Audio 2: Temporal-Enhanced Text-to-Audio Generation

Add code
May 29, 2023
Figure 1 for Make-An-Audio 2: Temporal-Enhanced Text-to-Audio Generation
Figure 2 for Make-An-Audio 2: Temporal-Enhanced Text-to-Audio Generation
Figure 3 for Make-An-Audio 2: Temporal-Enhanced Text-to-Audio Generation
Figure 4 for Make-An-Audio 2: Temporal-Enhanced Text-to-Audio Generation
Viaarxiv icon

AV-TranSpeech: Audio-Visual Robust Speech-to-Speech Translation

Add code
May 24, 2023
Figure 1 for AV-TranSpeech: Audio-Visual Robust Speech-to-Speech Translation
Figure 2 for AV-TranSpeech: Audio-Visual Robust Speech-to-Speech Translation
Figure 3 for AV-TranSpeech: Audio-Visual Robust Speech-to-Speech Translation
Figure 4 for AV-TranSpeech: Audio-Visual Robust Speech-to-Speech Translation
Viaarxiv icon

FluentSpeech: Stutter-Oriented Automatic Speech Editing with Context-Aware Diffusion Models

Add code
May 23, 2023
Viaarxiv icon

CLAPSpeech: Learning Prosody from Text Context with Contrastive Language-Audio Pre-training

Add code
May 18, 2023
Viaarxiv icon

GeneFace++: Generalized and Stable Real-Time Audio-Driven 3D Talking Face Generation

Add code
May 01, 2023
Figure 1 for GeneFace++: Generalized and Stable Real-Time Audio-Driven 3D Talking Face Generation
Figure 2 for GeneFace++: Generalized and Stable Real-Time Audio-Driven 3D Talking Face Generation
Figure 3 for GeneFace++: Generalized and Stable Real-Time Audio-Driven 3D Talking Face Generation
Figure 4 for GeneFace++: Generalized and Stable Real-Time Audio-Driven 3D Talking Face Generation
Viaarxiv icon