Shinji Watanabe

A Large-Scale Evaluation of Speech Foundation Models

Apr 15, 2024
Shu-wen Yang, Heng-Jui Chang, Zili Huang, Andy T. Liu, Cheng-I Lai, Haibin Wu, Jiatong Shi, Xuankai Chang, Hsiang-Sheng Tsai, Wen-Chin Huang, Tzu-hsun Feng, Po-Han Chi, Yist Y. Lin, Yung-Sung Chuang, Tzu-Hsien Huang, Wei-Cheng Tseng, Kushal Lakhotia, Shang-Wen Li, Abdelrahman Mohamed, Shinji Watanabe, Hung-yi Lee

LV-CTC: Non-autoregressive ASR with CTC and latent variable models

Mar 28, 2024
Yuya Fujita, Shinji Watanabe, Xuankai Chang, Takashi Maekaku

Wav2Gloss: Generating Interlinear Glossed Text from Speech

Mar 19, 2024
Taiqi He, Kwanghee Choi, Lindia Tjuatja, Nathaniel R. Robinson, Jiatong Shi, Shinji Watanabe, Graham Neubig, David R. Mortensen, Lori Levin

Aligning Speech to Languages to Enhance Code-switching Speech Recognition

Mar 09, 2024
Hexin Liu, Xiangyu Zhang, Leibny Paola Garcia, Andy W. H. Khong, Eng Siong Chng, Shinji Watanabe

TMT: Tri-Modal Translation between Speech, Image, and Text by Processing Different Modalities as Different Languages

Feb 25, 2024
Minsu Kim, Jee-weon Jung, Hyeongseop Rha, Soumi Maiti, Siddhant Arora, Xuankai Chang, Shinji Watanabe, Yong Man Ro

OWSM-CTC: An Open Encoder-Only Speech Foundation Model for Speech Recognition, Translation, and Language Identification

Feb 20, 2024
Yifan Peng, Yui Sudo, Muhammad Shakeel, Shinji Watanabe

Evaluating and Improving Continual Learning in Spoken Language Understanding

Feb 16, 2024
Muqiao Yang, Xiang Li, Umberto Cappellazzo, Shinji Watanabe, Bhiksha Raj

Can you Remove the Downstream Model for Speaker Recognition with Self-Supervised Speech Features?

Feb 01, 2024
Zakaria Aldeneh, Takuya Higuchi, Jee-weon Jung, Skyler Seto, Tatiana Likhomanenko, Stephen Shum, Ahmed Hussen Abdelaziz, Shinji Watanabe, Barry-John Theobald

SpeechComposer: Unifying Multiple Speech Tasks with Prompt Composition

Jan 31, 2024
Yihan Wu, Soumi Maiti, Yifan Peng, Wangyou Zhang, Chenda Li, Yuyue Wang, Xihua Wang, Shinji Watanabe, Ruihua Song

Singing Voice Data Scaling-up: An Introduction to ACE-Opencpop and KiSing-v2

Jan 31, 2024
Jiatong Shi, Yueqian Lin, Xinyi Bai, Keyi Zhang, Yuning Wu, Yuxun Tang, Yifeng Yu, Qin Jin, Shinji Watanabe
