Shengpeng Ji

Unlocking the Potential of Multimodal Unified Discrete Representation through Training-Free Codebook Optimization and Hierarchical Alignment

Mar 08, 2024
Hai Huang, Yan Xia, Shengpeng Ji, Shulei Wang, Hanting Wang, Jieming Zhu, Zhenhua Dong, Zhou Zhao

Language-Codec: Reducing the Gaps Between Discrete Codec Representation and Speech Language Models

Feb 20, 2024
Shengpeng Ji, Minghui Fang, Ziyue Jiang, Rongjie Huang, Jialong Zuo, Shulei Wang, Zhou Zhao

MobileSpeech: A Fast and High-Fidelity Framework for Mobile Zero-Shot Text-to-Speech

Feb 14, 2024
Shengpeng Ji, Ziyue Jiang, Hanting Wang, Jialong Zuo, Zhou Zhao

TextrolSpeech: A Text Style Control Speech Corpus With Codec Language Text-to-Speech Models

Aug 28, 2023
Shengpeng Ji, Jialong Zuo, Minghui Fang, Ziyue Jiang, Feiyang Chen, Xinyu Duan, Baoxing Huai, Zhou Zhao

Mega-TTS: Zero-Shot Text-to-Speech at Scale with Intrinsic Inductive Bias

Jun 06, 2023
Ziyue Jiang, Yi Ren, Zhenhui Ye, Jinglin Liu, Chen Zhang, Qian Yang, Shengpeng Ji, Rongjie Huang, Chunfeng Wang, Xiang Yin, Zejun Ma, Zhou Zhao
