Wangyou Zhang

End-to-End Multi-speaker ASR with Independent Vector Analysis
Apr 01, 2022
Robin Scheibler, Wangyou Zhang, Xuankai Chang, Shinji Watanabe, Yanmin Qian

Towards Low-distortion Multi-channel Speech Enhancement: The ESPNet-SE Submission to The L3DAS22 Challenge
Feb 24, 2022
Yen-Ju Lu, Samuele Cornell, Xuankai Chang, Wangyou Zhang, Chenda Li, Zhaoheng Ni, Zhong-Qiu Wang, Shinji Watanabe

Separating Long-Form Speech with Group-Wise Permutation Invariant Training
Nov 17, 2021
Wangyou Zhang, Zhuo Chen, Naoyuki Kanda, Shujie Liu, Jinyu Li, Sefik Emre Eskimez, Takuya Yoshioka, Xiong Xiao, Zhong Meng, Yanmin Qian, Furu Wei

Closing the Gap Between Time-Domain Multi-Channel Speech Enhancement on Real and Simulation Conditions
Oct 27, 2021
Wangyou Zhang, Jing Shi, Chenda Li, Shinji Watanabe, Yanmin Qian

End-to-End Dereverberation, Beamforming, and Speech Recognition with Improved Numerical Stability and Advanced Frontend
Feb 23, 2021
Wangyou Zhang, Christoph Boeddeker, Shinji Watanabe, Tomohiro Nakatani, Marc Delcroix, Keisuke Kinoshita, Tsubasa Ochiai, Naoyuki Kamo, Reinhold Haeb-Umbach, Yanmin Qian

The 2020 ESPnet update: new features, broadened applications, performance improvements, and future plans
Dec 23, 2020
Shinji Watanabe, Florian Boyer, Xuankai Chang, Pengcheng Guo, Tomoki Hayashi, Yosuke Higuchi, Takaaki Hori, Wen-Chin Huang, Hirofumi Inaguma, Naoyuki Kamo, Shigeki Karita, Chenda Li, Jing Shi, Aswin Shanmugam Subramanian, Wangyou Zhang

End-to-End Multi-speaker Speech Recognition with Transformer
Feb 13, 2020
Xuankai Chang, Wangyou Zhang, Yanmin Qian, Jonathan Le Roux, Shinji Watanabe

MIMO-SPEECH: End-to-End Multi-Channel Multi-Speaker Speech Recognition
Oct 16, 2019
Xuankai Chang, Wangyou Zhang, Yanmin Qian, Jonathan Le Roux, Shinji Watanabe

A Comparative Study on Transformer vs RNN in Speech Applications
Sep 28, 2019
Shigeki Karita, Nanxin Chen, Tomoki Hayashi, Takaaki Hori, Hirofumi Inaguma, Ziyan Jiang, Masao Someki, Nelson Enrique Yalta Soplin, Ryuichi Yamamoto, Xiaofei Wang, Shinji Watanabe, Takenori Yoshimura, Wangyou Zhang