Zili Huang

A Large-Scale Evaluation of Speech Foundation Models

Apr 15, 2024
Shu-wen Yang, Heng-Jui Chang, Zili Huang, Andy T. Liu, Cheng-I Lai, Haibin Wu, Jiatong Shi, Xuankai Chang, Hsiang-Sheng Tsai, Wen-Chin Huang, Tzu-hsun Feng, Po-Han Chi, Yist Y. Lin, Yung-Sung Chuang, Tzu-Hsien Huang, Wei-Cheng Tseng, Kushal Lakhotia, Shang-Wen Li, Abdelrahman Mohamed, Shinji Watanabe, Hung-yi Lee

UniX-Encoder: A Universal $X$-Channel Speech Encoder for Ad-Hoc Microphone Array Speech Processing

Oct 25, 2023
Zili Huang, Yiwen Shao, Shi-Xiong Zhang, Dong Yu

Self-supervised learning with bi-label masked speech prediction for streaming multi-talker speech recognition

Nov 10, 2022
Zili Huang, Zhuo Chen, Naoyuki Kanda, Jian Wu, Yiming Wang, Jinyu Li, Takuya Yoshioka, Xiaofei Wang, Peidong Wang

Adapting self-supervised models to multi-talker speech recognition using speaker embeddings

Nov 01, 2022
Zili Huang, Desh Raj, Paola García, Sanjeev Khudanpur

SUPERB @ SLT 2022: Challenge on Generalization and Efficiency of Self-Supervised Speech Representation Learning

Oct 16, 2022
Tzu-hsun Feng, Annie Dong, Ching-Feng Yeh, Shu-wen Yang, Tzu-Quan Lin, Jiatong Shi, Kai-Wei Chang, Zili Huang, Haibin Wu, Xuankai Chang, Shinji Watanabe, Abdelrahman Mohamed, Shang-Wen Li, Hung-yi Lee

Investigating self-supervised learning for speech enhancement and separation

Mar 15, 2022
Zili Huang, Shinji Watanabe, Shu-wen Yang, Paola Garcia, Sanjeev Khudanpur

SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark for Semantic and Generative Capabilities

Mar 14, 2022
Hsiang-Sheng Tsai, Heng-Jui Chang, Wen-Chin Huang, Zili Huang, Kushal Lakhotia, Shu-wen Yang, Shuyan Dong, Andy T. Liu, Cheng-I Jeff Lai, Jiatong Shi, Xuankai Chang, Phil Hall, Hsuan-Jui Chen, Shang-Wen Li, Shinji Watanabe, Abdelrahman Mohamed, Hung-yi Lee

Target-speaker Voice Activity Detection with Improved I-Vector Estimation for Unknown Number of Speaker

Aug 07, 2021
Maokui He, Desh Raj, Zili Huang, Jun Du, Zhuo Chen, Shinji Watanabe

SUPERB: Speech processing Universal PERformance Benchmark

May 03, 2021
Shu-wen Yang, Po-Han Chi, Yung-Sung Chuang, Cheng-I Jeff Lai, Kushal Lakhotia, Yist Y. Lin, Andy T. Liu, Jiatong Shi, Xuankai Chang, Guan-Ting Lin, Tzu-Hsien Huang, Wei-Cheng Tseng, Ko-tik Lee, Da-Rong Liu, Zili Huang, Shuyan Dong, Shang-Wen Li, Shinji Watanabe, Abdelrahman Mohamed, Hung-yi Lee

The Hitachi-JHU DIHARD III System: Competitive End-to-End Neural Diarization and X-Vector Clustering Systems Combined by DOVER-Lap

Feb 02, 2021
Shota Horiguchi, Nelson Yalta, Paola Garcia, Yuki Takashima, Yawen Xue, Desh Raj, Zili Huang, Yusuke Fujita, Shinji Watanabe, Sanjeev Khudanpur
