Naoyuki Kanda

A Comparative Study of Modular and Joint Approaches for Speaker-Attributed ASR on Monaural Long-Form Audio

Jul 06, 2021
Naoyuki Kanda, Xiong Xiao, Jian Wu, Tianyan Zhou, Yashesh Gaur, Xiaofei Wang, Zhong Meng, Zhuo Chen, Takuya Yoshioka


Investigation of Practical Aspects of Single Channel Speech Separation for ASR

Jul 05, 2021
Jian Wu, Zhuo Chen, Sanyuan Chen, Yu Wu, Takuya Yoshioka, Naoyuki Kanda, Shujie Liu, Jinyu Li


Minimum Word Error Rate Training with Language Model Fusion for End-to-End Speech Recognition

Jun 04, 2021
Zhong Meng, Yu Wu, Naoyuki Kanda, Liang Lu, Xie Chen, Guoli Ye, Eric Sun, Jinyu Li, Yifan Gong


Large-Scale Pre-Training of End-to-End Multi-Talker ASR for Meeting Transcription with Single Distant Microphone

Apr 12, 2021
Naoyuki Kanda, Guoli Ye, Yu Wu, Yashesh Gaur, Xiaofei Wang, Zhong Meng, Zhuo Chen, Takuya Yoshioka


End-to-End Speaker-Attributed ASR with Transformer

Apr 05, 2021
Naoyuki Kanda, Guoli Ye, Yashesh Gaur, Xiaofei Wang, Zhong Meng, Zhuo Chen, Takuya Yoshioka


Streaming Multi-talker Speech Recognition with Joint Speaker Identification

Apr 05, 2021
Liang Lu, Naoyuki Kanda, Jinyu Li, Yifan Gong


Speech-language Pre-training for End-to-end Spoken Language Understanding

Feb 11, 2021
Yao Qian, Ximo Bian, Yu Shi, Naoyuki Kanda, Leo Shen, Zhen Xiao, Michael Zeng


Internal Language Model Training for Domain-Adaptive End-to-End Speech Recognition

Feb 02, 2021
Zhong Meng, Naoyuki Kanda, Yashesh Gaur, Sarangarajan Parthasarathy, Eric Sun, Liang Lu, Xie Chen, Jinyu Li, Yifan Gong
