Get our free extension to see links to code for papers anywhere online!

Chrome logo Add to Chrome

Firefox logo Add to Firefox

Picture for Naoyuki Kanda

Separating Long-Form Speech with Group-Wise Permutation Invariant Training


Nov 17, 2021
Wangyou Zhang, Zhuo Chen, Naoyuki Kanda, Shujie Liu, Jinyu Li, Sefik Emre Eskimez, Takuya Yoshioka, Xiong Xiao, Zhong Meng, Yanmin Qian, Furu Wei

* 5 pages, 3 figures, 3 tables, submitted to IEEE ICASSP 2022 

  Access Paper or Ask Questions

WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing


Oct 29, 2021
Sanyuan Chen, Chengyi Wang, Zhengyang Chen, Yu Wu, Shujie Liu, Zhuo Chen, Jinyu Li, Naoyuki Kanda, Takuya Yoshioka, Xiong Xiao, Jian Wu, Long Zhou, Shuo Ren, Yanmin Qian, Yao Qian, Jian Wu, Michael Zeng, Furu Wei


  Access Paper or Ask Questions

VarArray: Array-Geometry-Agnostic Continuous Speech Separation


Oct 26, 2021
Takuya Yoshioka, Xiaofei Wang, Dongmei Wang, Min Tang, Zirun Zhu, Zhuo Chen, Naoyuki Kanda

* 5 pages, 1 figure, 3 tables, submitted to ICASSP 2022; updated reference information of [33] 

  Access Paper or Ask Questions

Internal Language Model Adaptation with Text-Only Data for End-to-End Speech Recognition


Oct 14, 2021
Zhong Meng, Yashesh Gaur, Naoyuki Kanda, Jinyu Li, Xie Chen, Yu Wu, Yifan Gong

* 5 pages, submitted to ICASSP 2022 

  Access Paper or Ask Questions

All-neural beamformer for continuous speech separation


Oct 13, 2021
Zhuohuang Zhang, Takuya Yoshioka, Naoyuki Kanda, Zhuo Chen, Xiaofei Wang, Dongmei Wang, Sefik Emre Eskimez

* 5 pages, 3 figures, 2 tables 

  Access Paper or Ask Questions

Transcribe-to-Diarize: Neural Speaker Diarization for Unlimited Number of Speakers using End-to-End Speaker-Attributed ASR


Oct 07, 2021
Naoyuki Kanda, Xiong Xiao, Yashesh Gaur, Xiaofei Wang, Zhong Meng, Zhuo Chen, Takuya Yoshioka

* Submitted to ICASSP 2022 

  Access Paper or Ask Questions

A Comparative Study of Modular and Joint Approaches for Speaker-Attributed ASR on Monaural Long-Form Audio


Jul 06, 2021
Naoyuki Kanda, Xiong Xiao, Jian Wu, Tianyan Zhou, Yashesh Gaur, Xiaofei Wang, Zhong Meng, Zhuo Chen, Takuya Yoshioka

* Submitted to ASRU 2021 

  Access Paper or Ask Questions

Investigation of Practical Aspects of Single Channel Speech Separation for ASR


Jul 05, 2021
Jian Wu, Zhuo Chen, Sanyuan Chen, Yu Wu, Takuya Yoshioka, Naoyuki Kanda, Shujie Liu, Jinyu Li

* Accepted by Interspeech 2021 

  Access Paper or Ask Questions

Minimum Word Error Rate Training with Language Model Fusion for End-to-End Speech Recognition


Jun 04, 2021
Zhong Meng, Yu Wu, Naoyuki Kanda, Liang Lu, Xie Chen, Guoli Ye, Eric Sun, Jinyu Li, Yifan Gong

* Interspeech 2021, Brno, Czech Republic 
* 5 pages, Interspeech 2021 

  Access Paper or Ask Questions

Large-Scale Pre-Training of End-to-End Multi-Talker ASR for Meeting Transcription with Single Distant Microphone


Apr 12, 2021
Naoyuki Kanda, Guoli Ye, Yu Wu, Yashesh Gaur, Xiaofei Wang, Zhong Meng, Zhuo Chen, Takuya Yoshioka

* Submitted to INTERSPEECH 2021 

  Access Paper or Ask Questions

End-to-End Speaker-Attributed ASR with Transformer


Apr 05, 2021
Naoyuki Kanda, Guoli Ye, Yashesh Gaur, Xiaofei Wang, Zhong Meng, Zhuo Chen, Takuya Yoshioka

* Submitted to INTERSPEECH 2021 

  Access Paper or Ask Questions

Streaming Multi-talker Speech Recognition with Joint Speaker Identification


Apr 05, 2021
Liang Lu, Naoyuki Kanda, Jinyu Li, Yifan Gong

* 5 pages, 2 figures, submitted to Interspeech 2021 

  Access Paper or Ask Questions

Speech-language Pre-training for End-to-end Spoken Language Understanding


Feb 11, 2021
Yao Qian, Ximo Bian, Yu Shi, Naoyuki Kanda, Leo Shen, Zhen Xiao, Michael Zeng


  Access Paper or Ask Questions

Internal Language Model Training for Domain-Adaptive End-to-End Speech Recognition


Feb 02, 2021
Zhong Meng, Naoyuki Kanda, Yashesh Gaur, Sarangarajan Parthasarathy, Eric Sun, Liang Lu, Xie Chen, Jinyu Li, Yifan Gong

* 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Toronto, Canada 
* 5 pages, ICASSP 2021 

  Access Paper or Ask Questions

A Review of Speaker Diarization: Recent Advances with Deep Learning


Jan 24, 2021
Tae Jin Park, Naoyuki Kanda, Dimitrios Dimitriadis, Kyu J. Han, Shinji Watanabe, Shrikanth Narayanan


  Access Paper or Ask Questions

Hypothesis Stitcher for End-to-End Speaker-attributed ASR on Long-form Multi-talker Recordings


Jan 06, 2021
Xuankai Chang, Naoyuki Kanda, Yashesh Gaur, Xiaofei Wang, Zhong Meng, Takuya Yoshioka

* Submitted to ICASSP 2021 

  Access Paper or Ask Questions

Streaming end-to-end multi-talker speech recognition


Nov 26, 2020
Liang Lu, Naoyuki Kanda, Jinyu Li, Yifan Gong

* 5 pages, 4 figures 

  Access Paper or Ask Questions

Minimum Bayes Risk Training for End-to-End Speaker-Attributed ASR


Nov 03, 2020
Naoyuki Kanda, Zhong Meng, Liang Lu, Yashesh Gaur, Xiaofei Wang, Zhuo Chen, Takuya Yoshioka

* Submitted to ICASSP 2021. arXiv admin note: text overlap with arXiv:2006.10930, arXiv:2008.04546 

  Access Paper or Ask Questions

Internal Language Model Estimation for Domain-Adaptive End-to-End Speech Recognition


Nov 03, 2020
Zhong Meng, Sarangarajan Parthasarathy, Eric Sun, Yashesh Gaur, Naoyuki Kanda, Liang Lu, Xie Chen, Rui Zhao, Jinyu Li, Yifan Gong

* 2021 IEEE Spoken Language Technology Workshop (SLT) 
* 8 pages, 2 figures, SLT 2021 

  Access Paper or Ask Questions

On Minimum Word Error Rate Training of the Hybrid Autoregressive Transducer


Oct 23, 2020
Liang Lu, Zhong Meng, Naoyuki Kanda, Jinyu Li, Yifan Gong

* 5 pages, submitted to ICASSP 2021 

  Access Paper or Ask Questions

Investigation of End-To-End Speaker-Attributed ASR for Continuous Multi-Talker Recordings


Aug 11, 2020
Naoyuki Kanda, Xuankai Chang, Yashesh Gaur, Xiaofei Wang, Zhong Meng, Zhuo Chen, Takuya Yoshioka


  Access Paper or Ask Questions

Joint Speaker Counting, Speech Recognition, and Speaker Identification for Overlapped Speech of Any Number of Speakers


Jun 19, 2020
Naoyuki Kanda, Yashesh Gaur, Xiaofei Wang, Zhong Meng, Zhuo Chen, Tianyan Zhou, Takuya Yoshioka

* Submitted to INTERSPEECH 2020 

  Access Paper or Ask Questions

CHiME-6 Challenge:Tackling Multispeaker Speech Recognition for Unsegmented Recordings


May 02, 2020
Shinji Watanabe, Michael Mandel, Jon Barker, Emmanuel Vincent, Ashish Arora, Xuankai Chang, Sanjeev Khudanpur, Vimal Manohar, Daniel Povey, Desh Raj, David Snyder, Aswin Shanmugam Subramanian, Jan Trmal, Bar Ben Yair, Christoph Boeddeker, Zhaoheng Ni, Yusuke Fujita, Shota Horiguchi, Naoyuki Kanda, Takuya Yoshioka, Neville Ryant


  Access Paper or Ask Questions

Serialized Output Training for End-to-End Overlapped Speech Recognition


Mar 28, 2020
Naoyuki Kanda, Yashesh Gaur, Xiaofei Wang, Zhong Meng, Takuya Yoshioka

* Submitted to INTERSPEECH 2020 

  Access Paper or Ask Questions

Simultaneous Speech Recognition and Speaker Diarization for Monaural Dialogue Recordings with Target-Speaker Acoustic Models


Sep 17, 2019
Naoyuki Kanda, Shota Horiguchi, Yusuke Fujita, Yawen Xue, Kenji Nagamatsu, Shinji Watanabe

* Accepted to ASRU 2019 

  Access Paper or Ask Questions