Shujie Liu

On decoder-only architecture for speech-to-text and large language model integration

Jul 08, 2023
Jian Wu, Yashesh Gaur, Zhuo Chen, Long Zhou, Yimeng Zhu, Tianrui Wang, Jinyu Li, Shujie Liu, Bo Ren, Linquan Liu, Yu Wu

Accelerating Transducers through Adjacent Token Merging

Jun 28, 2023
Yuang Li, Yu Wu, Jinyu Li, Shujie Liu

Prompting Large Language Models for Zero-Shot Domain Adaptation in Speech Recognition

Jun 28, 2023
Yuang Li, Yu Wu, Jinyu Li, Shujie Liu

VioLA: Unified Codec Language Models for Speech Recognition, Synthesis, and Translation

May 25, 2023
Tianrui Wang, Long Zhou, Ziqiang Zhang, Yu Wu, Shujie Liu, Yashesh Gaur, Zhuo Chen, Jinyu Li, Furu Wei

ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation

May 24, 2023
Chenyang Le, Yao Qian, Long Zhou, Shujie Liu, Michael Zeng, Xuedong Huang

Code-Switching Text Generation and Injection in Mandarin-English ASR

Mar 20, 2023
Haibin Yu, Yuxuan Hu, Yao Qian, Ma Jin, Linquan Liu, Shujie Liu, Yu Shi, Yanmin Qian, Edward Lin, Michael Zeng

Target Sound Extraction with Variable Cross-modality Clues

Mar 15, 2023
Chenda Li, Yao Qian, Zhuo Chen, Dongmei Wang, Takuya Yoshioka, Shujie Liu, Yanmin Qian, Michael Zeng

Speak Foreign Languages with Your Own Voice: Cross-Lingual Neural Codec Language Modeling

Mar 07, 2023
Ziqiang Zhang, Long Zhou, Chengyi Wang, Sanyuan Chen, Yu Wu, Shujie Liu, Zhuo Chen, Yanqing Liu, Huaming Wang, Jinyu Li, Lei He, Sheng Zhao, Furu Wei

Building High-accuracy Multilingual ASR with Gated Language Experts and Curriculum Training

Mar 01, 2023
Eric Sun, Jinyu Li, Yuxuan Hu, Yimeng Zhu, Long Zhou, Jian Xue, Peidong Wang, Linquan Liu, Shujie Liu, Edward Lin, Yifan Gong

Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers

Jan 05, 2023
Chengyi Wang, Sanyuan Chen, Yu Wu, Ziqiang Zhang, Long Zhou, Shujie Liu, Zhuo Chen, Yanqing Liu, Huaming Wang, Jinyu Li, Lei He, Sheng Zhao, Furu Wei
