Zhenhui Ye

FreeBind: Free Lunch in Unified Multimodal Space via Knowledge Fusion

May 10, 2024

Molecule-Space: Free Lunch in Unified Multimodal Space via Knowledge Fusion

May 08, 2024

Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis

Jan 20, 2024

Mega-TTS 2: Zero-Shot Text-to-Speech with Arbitrary Length Speech Prompts

Jul 14, 2023

Ada-TTA: Towards Adaptive High-Quality Text-to-Talking Avatar Synthesis

Jun 06, 2023

Mega-TTS: Zero-Shot Text-to-Speech at Scale with Intrinsic Inductive Bias

Jun 06, 2023

Make-A-Voice: Unified Voice Synthesis With Discrete Representation

May 30, 2023

Make-An-Audio 2: Temporal-Enhanced Text-to-Audio Generation

May 29, 2023

AV-TranSpeech: Audio-Visual Robust Speech-to-Speech Translation

May 24, 2023

FluentSpeech: Stutter-Oriented Automatic Speech Editing with Context-Aware Diffusion Models

May 23, 2023