
Chao Weng

Detect what you want: Target Sound Detection

Dec 19, 2021
Dongchao Yang, Helin Wang, Yuexian Zou, Chao Weng


Consistent Training and Decoding For End-to-end Speech Recognition Using Lattice-free MMI

Dec 05, 2021
Jinchuan Tian, Jianwei Yu, Chao Weng, Shi-Xiong Zhang, Dan Su, Dong Yu, Yuexian Zou


Joint Modeling of Code-Switched and Monolingual ASR via Conditional Factorization

Nov 29, 2021
Brian Yan, Chunlei Zhang, Meng Yu, Shi-Xiong Zhang, Siddharth Dalmia, Dan Berrebbi, Chao Weng, Shinji Watanabe, Dong Yu


Simple Attention Module based Speaker Verification with Iterative noisy label detection

Oct 13, 2021
Xiaoyi Qin, Na Li, Chao Weng, Dan Su, Ming Li


GigaSpeech: An Evolving, Multi-domain ASR Corpus with 10,000 Hours of Transcribed Audio

Jun 13, 2021
Guoguo Chen, Shuzhou Chai, Guanbo Wang, Jiayu Du, Wei-Qiang Zhang, Chao Weng, Dan Su, Daniel Povey, Jan Trmal, Junbo Zhang, Mingjie Jin, Sanjeev Khudanpur, Shinji Watanabe, Shuaijiang Zhao, Wei Zou, Xiangang Li, Xuchen Yao, Yongqing Wang, Yujun Wang, Zhao You, Zhiyong Yan


Spoken Style Learning with Multi-modal Hierarchical Context Encoding for Conversational Text-to-Speech Synthesis

Jun 11, 2021
Jingbei Li, Yi Meng, Chenyi Li, Zhiyong Wu, Helen Meng, Chao Weng, Dan Su


Raw Waveform Encoder with Multi-Scale Globally Attentive Locally Recurrent Networks for End-to-End Speech Recognition

Jun 08, 2021
Max W. Y. Lam, Jun Wang, Chao Weng, Dan Su, Dong Yu


TeCANet: Temporal-Contextual Attention Network for Environment-Aware Speech Dereverberation

Mar 31, 2021
Helin Wang, Bo Wu, Lianwu Chen, Meng Yu, Jianwei Yu, Yong Xu, Shi-Xiong Zhang, Chao Weng, Dan Su, Dong Yu


Towards Robust Speaker Verification with Target Speaker Enhancement

Mar 16, 2021
Chunlei Zhang, Meng Yu, Chao Weng, Dong Yu


Deep Learning based Multi-Source Localization with Source Splitting and its Effectiveness in Multi-Talker Speech Recognition

Feb 16, 2021
Aswin Shanmugam Subramanian, Chao Weng, Shinji Watanabe, Meng Yu, Dong Yu
