
Naoyuki Kanda


Streaming Multi-Talker ASR with Token-Level Serialized Output Training

Feb 05, 2022
Naoyuki Kanda, Jian Wu, Yu Wu, Xiong Xiao, Zhong Meng, Xiaofei Wang, Yashesh Gaur, Zhuo Chen, Jinyu Li, Takuya Yoshioka


Separating Long-Form Speech with Group-Wise Permutation Invariant Training

Nov 17, 2021
Wangyou Zhang, Zhuo Chen, Naoyuki Kanda, Shujie Liu, Jinyu Li, Sefik Emre Eskimez, Takuya Yoshioka, Xiong Xiao, Zhong Meng, Yanmin Qian, Furu Wei


WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing

Oct 29, 2021
Sanyuan Chen, Chengyi Wang, Zhengyang Chen, Yu Wu, Shujie Liu, Zhuo Chen, Jinyu Li, Naoyuki Kanda, Takuya Yoshioka, Xiong Xiao, Jian Wu, Long Zhou, Shuo Ren, Yanmin Qian, Yao Qian, Jian Wu, Michael Zeng, Furu Wei


VarArray: Array-Geometry-Agnostic Continuous Speech Separation

Oct 26, 2021
Takuya Yoshioka, Xiaofei Wang, Dongmei Wang, Min Tang, Zirun Zhu, Zhuo Chen, Naoyuki Kanda


Internal Language Model Adaptation with Text-Only Data for End-to-End Speech Recognition

Oct 14, 2021
Zhong Meng, Yashesh Gaur, Naoyuki Kanda, Jinyu Li, Xie Chen, Yu Wu, Yifan Gong


All-neural beamformer for continuous speech separation

Oct 13, 2021
Zhuohuang Zhang, Takuya Yoshioka, Naoyuki Kanda, Zhuo Chen, Xiaofei Wang, Dongmei Wang, Sefik Emre Eskimez


Transcribe-to-Diarize: Neural Speaker Diarization for Unlimited Number of Speakers using End-to-End Speaker-Attributed ASR

Oct 07, 2021
Naoyuki Kanda, Xiong Xiao, Yashesh Gaur, Xiaofei Wang, Zhong Meng, Zhuo Chen, Takuya Yoshioka
