Xuankai Chang

ML-SUPERB: Multilingual Speech Universal PERformance Benchmark

May 18, 2023
Jiatong Shi, Dan Berrebbi, William Chen, Ho-Lam Chung, En-Pei Hu, Wei Ping Huang, Xuankai Chang, Shang-Wen Li, Abdelrahman Mohamed, Hung-yi Lee, Shinji Watanabe

Via arXiv

AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head

Apr 25, 2023
Rongjie Huang, Mingze Li, Dongchao Yang, Jiatong Shi, Xuankai Chang, Zhenhui Ye, Yuning Wu, Zhiqing Hong, Jiawei Huang, Jinglin Liu, Yi Ren, Zhou Zhao, Shinji Watanabe

Via arXiv

Improving Perceptual Quality, Intelligibility, and Acoustics on VoIP Platforms

Mar 16, 2023
Joseph Konan, Ojas Bhargave, Shikhar Agnihotri, Hojeong Lee, Ankit Shah, Shuo Han, Yunyang Zeng, Amanda Shu, Haohui Liu, Xuankai Chang, Hamza Khalid, Minseon Gwak, Kawon Lee, Minjeong Kim, Bhiksha Raj

Via arXiv

A Study on the Integration of Pre-trained SSL, ASR, LM and SLU Models for Spoken Language Understanding

Nov 10, 2022
Yifan Peng, Siddhant Arora, Yosuke Higuchi, Yushi Ueda, Sujay Kumar, Karthik Ganesan, Siddharth Dalmia, Xuankai Chang, Shinji Watanabe

Via arXiv

End-to-End Integration of Speech Recognition, Dereverberation, Beamforming, and Self-Supervised Learning Representation

Oct 19, 2022
Yoshiki Masuyama, Xuankai Chang, Samuele Cornell, Shinji Watanabe, Nobutaka Ono

Via arXiv

SUPERB @ SLT 2022: Challenge on Generalization and Efficiency of Self-Supervised Speech Representation Learning

Oct 16, 2022
Tzu-hsun Feng, Annie Dong, Ching-Feng Yeh, Shu-wen Yang, Tzu-Quan Lin, Jiatong Shi, Kai-Wei Chang, Zili Huang, Haibin Wu, Xuankai Chang, Shinji Watanabe, Abdelrahman Mohamed, Shang-Wen Li, Hung-yi Lee

Via arXiv

ESPnet-SE++: Speech Enhancement for Robust Speech Recognition, Translation, and Understanding

Jul 19, 2022
Yen-Ju Lu, Xuankai Chang, Chenda Li, Wangyou Zhang, Samuele Cornell, Zhaoheng Ni, Yoshiki Masuyama, Brian Yan, Robin Scheibler, Zhong-Qiu Wang, Yu Tsao, Yanmin Qian, Shinji Watanabe

Via arXiv

Two-Pass Low Latency End-to-End Spoken Language Understanding

Jul 14, 2022
Siddhant Arora, Siddharth Dalmia, Xuankai Chang, Brian Yan, Alan Black, Shinji Watanabe

Via arXiv

Muskits: an End-to-End Music Processing Toolkit for Singing Voice Synthesis

May 09, 2022
Jiatong Shi, Shuai Guo, Tao Qian, Nan Huo, Tomoki Hayashi, Yuning Wu, Frank Xu, Xuankai Chang, Huazhe Li, Peter Wu, Shinji Watanabe, Qin Jin

Via arXiv

End-to-End Integration of Speech Recognition, Speech Enhancement, and Self-Supervised Learning Representation

Apr 01, 2022
Xuankai Chang, Takashi Maekaku, Yuya Fujita, Shinji Watanabe

Via arXiv