Naoyuki Kanda

Large-Scale Pre-Training of End-to-End Multi-Talker ASR for Meeting Transcription with Single Distant Microphone


Apr 12, 2021
Naoyuki Kanda, Guoli Ye, Yu Wu, Yashesh Gaur, Xiaofei Wang, Zhong Meng, Zhuo Chen, Takuya Yoshioka

* Submitted to INTERSPEECH 2021 


End-to-End Speaker-Attributed ASR with Transformer


Apr 05, 2021
Naoyuki Kanda, Guoli Ye, Yashesh Gaur, Xiaofei Wang, Zhong Meng, Zhuo Chen, Takuya Yoshioka

* Submitted to INTERSPEECH 2021 


Streaming Multi-talker Speech Recognition with Joint Speaker Identification


Apr 05, 2021
Liang Lu, Naoyuki Kanda, Jinyu Li, Yifan Gong

* 5 pages, 2 figures, submitted to Interspeech 2021 


Speech-language Pre-training for End-to-end Spoken Language Understanding


Feb 11, 2021
Yao Qian, Ximo Bian, Yu Shi, Naoyuki Kanda, Leo Shen, Zhen Xiao, Michael Zeng



Internal Language Model Training for Domain-Adaptive End-to-End Speech Recognition


Feb 02, 2021
Zhong Meng, Naoyuki Kanda, Yashesh Gaur, Sarangarajan Parthasarathy, Eric Sun, Liang Lu, Xie Chen, Jinyu Li, Yifan Gong

* 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Toronto, Canada 
* 5 pages, ICASSP 2021 


A Review of Speaker Diarization: Recent Advances with Deep Learning


Jan 24, 2021
Tae Jin Park, Naoyuki Kanda, Dimitrios Dimitriadis, Kyu J. Han, Shinji Watanabe, Shrikanth Narayanan



Hypothesis Stitcher for End-to-End Speaker-attributed ASR on Long-form Multi-talker Recordings


Jan 06, 2021
Xuankai Chang, Naoyuki Kanda, Yashesh Gaur, Xiaofei Wang, Zhong Meng, Takuya Yoshioka

* Submitted to ICASSP 2021 


Streaming end-to-end multi-talker speech recognition


Nov 26, 2020
Liang Lu, Naoyuki Kanda, Jinyu Li, Yifan Gong

* 5 pages, 4 figures 


Minimum Bayes Risk Training for End-to-End Speaker-Attributed ASR


Nov 03, 2020
Naoyuki Kanda, Zhong Meng, Liang Lu, Yashesh Gaur, Xiaofei Wang, Zhuo Chen, Takuya Yoshioka

* Submitted to ICASSP 2021. arXiv admin note: text overlap with arXiv:2006.10930, arXiv:2008.04546 


Internal Language Model Estimation for Domain-Adaptive End-to-End Speech Recognition


Nov 03, 2020
Zhong Meng, Sarangarajan Parthasarathy, Eric Sun, Yashesh Gaur, Naoyuki Kanda, Liang Lu, Xie Chen, Rui Zhao, Jinyu Li, Yifan Gong

* 2021 IEEE Spoken Language Technology Workshop (SLT) 
* 8 pages, 2 figures, SLT 2021 


On Minimum Word Error Rate Training of the Hybrid Autoregressive Transducer


Oct 23, 2020
Liang Lu, Zhong Meng, Naoyuki Kanda, Jinyu Li, Yifan Gong

* 5 pages, submitted to ICASSP 2021 


Investigation of End-To-End Speaker-Attributed ASR for Continuous Multi-Talker Recordings


Aug 11, 2020
Naoyuki Kanda, Xuankai Chang, Yashesh Gaur, Xiaofei Wang, Zhong Meng, Zhuo Chen, Takuya Yoshioka



Joint Speaker Counting, Speech Recognition, and Speaker Identification for Overlapped Speech of Any Number of Speakers


Jun 19, 2020
Naoyuki Kanda, Yashesh Gaur, Xiaofei Wang, Zhong Meng, Zhuo Chen, Tianyan Zhou, Takuya Yoshioka

* Submitted to INTERSPEECH 2020 


CHiME-6 Challenge: Tackling Multispeaker Speech Recognition for Unsegmented Recordings


May 02, 2020
Shinji Watanabe, Michael Mandel, Jon Barker, Emmanuel Vincent, Ashish Arora, Xuankai Chang, Sanjeev Khudanpur, Vimal Manohar, Daniel Povey, Desh Raj, David Snyder, Aswin Shanmugam Subramanian, Jan Trmal, Bar Ben Yair, Christoph Boeddeker, Zhaoheng Ni, Yusuke Fujita, Shota Horiguchi, Naoyuki Kanda, Takuya Yoshioka, Neville Ryant



Serialized Output Training for End-to-End Overlapped Speech Recognition


Mar 28, 2020
Naoyuki Kanda, Yashesh Gaur, Xiaofei Wang, Zhong Meng, Takuya Yoshioka

* Submitted to INTERSPEECH 2020 


Simultaneous Speech Recognition and Speaker Diarization for Monaural Dialogue Recordings with Target-Speaker Acoustic Models


Sep 17, 2019
Naoyuki Kanda, Shota Horiguchi, Yusuke Fujita, Yawen Xue, Kenji Nagamatsu, Shinji Watanabe

* Accepted to ASRU 2019 


End-to-End Neural Speaker Diarization with Self-attention


Sep 13, 2019
Yusuke Fujita, Naoyuki Kanda, Shota Horiguchi, Yawen Xue, Kenji Nagamatsu, Shinji Watanabe

* Accepted for ASRU 2019 


End-to-End Neural Speaker Diarization with Permutation-Free Objectives


Sep 12, 2019
Yusuke Fujita, Naoyuki Kanda, Shota Horiguchi, Kenji Nagamatsu, Shinji Watanabe

* Accepted to INTERSPEECH 2019 


Auxiliary Interference Speaker Loss for Target-Speaker Speech Recognition


Jun 26, 2019
Naoyuki Kanda, Shota Horiguchi, Ryoichi Takashima, Yusuke Fujita, Kenji Nagamatsu, Shinji Watanabe

* Accepted to INTERSPEECH 2019 


Guided Source Separation Meets a Strong ASR Backend: Hitachi/Paderborn University Joint Investigation for Dinner Party ASR


May 29, 2019
Naoyuki Kanda, Christoph Boeddeker, Jens Heitkaemper, Yusuke Fujita, Shota Horiguchi, Kenji Nagamatsu, Reinhold Haeb-Umbach

* Submitted to INTERSPEECH 2019 
