Yuxuan Wang

PolyVoice: Language Models for Speech to Speech Translation

Jun 13, 2023
Qianqian Dong, Zhiying Huang, Qiao Tian, Chen Xu, Tom Ko, Yunlong Zhao, Siyuan Feng, Tang Li, Kexin Wang, Xuxin Cheng, Fengpeng Yue, Ye Bai, Xi Chen, Lu Lu, Zejun Ma, Yuping Wang, Mingxuan Wang, Yuxuan Wang

Query Encoder Distillation via Embedding Alignment is a Strong Baseline Method to Boost Dense Retriever Online Efficiency

Jun 05, 2023
Yuxuan Wang, Hong Lyu

MoviePuzzle: Visual Narrative Reasoning through Multimodal Order Learning

Jun 04, 2023
Jianghui Wang, Yuxuan Wang, Dongyan Zhao, Zilong Zheng

Shuo Wen Jie Zi: Rethinking Dictionaries and Glyphs for Chinese Language Pre-training

May 30, 2023
Yuxuan Wang, Jianghui Wang, Dongyan Zhao, Zilong Zheng

VSTAR: A Video-grounded Dialogue Dataset for Situated Semantic Understanding with Scene and Topic Transitions

May 30, 2023
Yuxuan Wang, Zilong Zheng, Xueliang Zhao, Jinpeng Li, Yueqian Wang, Dongyan Zhao

Efficient Neural Music Generation

May 25, 2023
Max W. Y. Lam, Qiao Tian, Tang Li, Zongyu Yin, Siyuan Feng, Ming Tu, Yuliang Ji, Rui Xia, Mingbo Ma, Xuchen Song, Jitong Chen, Yuping Wang, Yuxuan Wang

Language-universal phonetic encoder for low-resource speech recognition

May 19, 2023
Siyuan Feng, Ming Tu, Rui Xia, Chuanzeng Huang, Yuxuan Wang

Language-Universal Phonetic Representation in Multilingual Speech Pretraining for Low-Resource Speech Recognition

May 19, 2023
Siyuan Feng, Ming Tu, Rui Xia, Chuanzeng Huang, Yuxuan Wang

A Unified Front-end Framework for English Text-to-Speech Synthesis

May 18, 2023
Zelin Ying, Chen Li, Yu Dong, Qiuqiang Kong, YuanYuan Huo, Yuping Wang, Yuxuan Wang

Joint Multi-scale Cross-lingual Speaking Style Transfer with Bidirectional Attention Mechanism for Automatic Dubbing

May 09, 2023
Jingbei Li, Sipan Li, Ping Chen, Luwen Zhang, Yi Meng, Zhiyong Wu, Helen Meng, Qiao Tian, Yuping Wang, Yuxuan Wang
