Title: Online End-to-End Neural Diarization Handling Overlapping Speech and Flexible Numbers of Speakers

Jan 21, 2021

Yawen Xue, Shota Horiguchi, Yusuke Fujita, Yuki Takashima, Shinji Watanabe, Paola Garcia, Kenji Nagamatsu

Abstract: This paper proposes an online end-to-end diarization method that can handle overlapping speech and a flexible number of speakers. The end-to-end neural speaker diarization (EEND) model has already achieved significant improvements over conventional clustering-based methods. However, the original EEND has two limitations: i) it does not perform well in online scenarios; ii) the number of speakers must be fixed in advance. This paper addresses both problems by applying a modified extension of the speaker-tracing buffer method that deals with variable numbers of speakers. Experiments on the CALLHOME and DIHARD II datasets show that the proposed online method achieves performance comparable to the offline EEND method. Compared with the state-of-the-art online method based on a fully supervised approach (UIS-RNN), the proposed method shows better performance on the DIHARD II dataset.
