Picture for Ziyue Jiang

Ziyue Jiang

MSceneSpeech: A Multi-Scene Speech Dataset For Expressive Speech Synthesis

Add code
Jul 19, 2024
Viaarxiv icon

ControlSpeech: Towards Simultaneous Zero-shot Speaker Cloning and Zero-shot Language Style Control With Decoupled Codec

Add code
Jun 03, 2024
Viaarxiv icon

Language-Codec: Reducing the Gaps Between Discrete Codec Representation and Speech Language Models

Add code
Feb 20, 2024
Viaarxiv icon

MobileSpeech: A Fast and High-Fidelity Framework for Mobile Zero-Shot Text-to-Speech

Add code
Feb 14, 2024
Figure 1 for MobileSpeech: A Fast and High-Fidelity Framework for Mobile Zero-Shot Text-to-Speech
Figure 2 for MobileSpeech: A Fast and High-Fidelity Framework for Mobile Zero-Shot Text-to-Speech
Figure 3 for MobileSpeech: A Fast and High-Fidelity Framework for Mobile Zero-Shot Text-to-Speech
Figure 4 for MobileSpeech: A Fast and High-Fidelity Framework for Mobile Zero-Shot Text-to-Speech
Viaarxiv icon

AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension

Add code
Feb 12, 2024
Viaarxiv icon

Zero-shot Explainable Mental Health Analysis on Social Media by incorporating Mental Scales

Add code
Feb 09, 2024
Viaarxiv icon

Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis

Add code
Jan 20, 2024
Figure 1 for Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis
Figure 2 for Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis
Figure 3 for Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis
Figure 4 for Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis
Viaarxiv icon

FluentEditor: Text-based Speech Editing by Considering Acoustic and Prosody Consistency

Add code
Sep 22, 2023
Figure 1 for FluentEditor: Text-based Speech Editing by Considering Acoustic and Prosody Consistency
Figure 2 for FluentEditor: Text-based Speech Editing by Considering Acoustic and Prosody Consistency
Figure 3 for FluentEditor: Text-based Speech Editing by Considering Acoustic and Prosody Consistency
Figure 4 for FluentEditor: Text-based Speech Editing by Considering Acoustic and Prosody Consistency
Viaarxiv icon

TextrolSpeech: A Text Style Control Speech Corpus With Codec Language Text-to-Speech Models

Add code
Aug 28, 2023
Figure 1 for TextrolSpeech: A Text Style Control Speech Corpus With Codec Language Text-to-Speech Models
Figure 2 for TextrolSpeech: A Text Style Control Speech Corpus With Codec Language Text-to-Speech Models
Figure 3 for TextrolSpeech: A Text Style Control Speech Corpus With Codec Language Text-to-Speech Models
Figure 4 for TextrolSpeech: A Text Style Control Speech Corpus With Codec Language Text-to-Speech Models
Viaarxiv icon

Mega-TTS 2: Zero-Shot Text-to-Speech with Arbitrary Length Speech Prompts

Add code
Jul 14, 2023
Figure 1 for Mega-TTS 2: Zero-Shot Text-to-Speech with Arbitrary Length Speech Prompts
Figure 2 for Mega-TTS 2: Zero-Shot Text-to-Speech with Arbitrary Length Speech Prompts
Figure 3 for Mega-TTS 2: Zero-Shot Text-to-Speech with Arbitrary Length Speech Prompts
Figure 4 for Mega-TTS 2: Zero-Shot Text-to-Speech with Arbitrary Length Speech Prompts
Viaarxiv icon