Title: Multi-Target Filter and Detector for Speaker Diarization

Mar 30, 2022

Chin-Yi Cheng, Hung-Shin Lee, Yu Tsao, Hsin-Min Wang

Abstract: A good representation of a target speaker usually helps to extract important information about the speaker and detect the corresponding temporal regions in a multi-speaker conversation. In this paper, we propose a neural architecture that simultaneously extracts speaker embeddings consistent with the speaker diarization objective and detects the presence of each speaker frame by frame, regardless of the number of speakers in the conversation. To this end, a residual network (ResNet) and a dual-path recurrent neural network (DPRNN) are integrated into a unified structure. When tested on the 2-speaker CALLHOME corpus, our proposed model outperforms most methods published so far. Evaluated in a more challenging case of concurrent speakers ranging from two to seven, our system also achieves relative diarization error rate reductions of 26.35% and 6.4% over two typical baselines, namely the traditional x-vector clustering system and the attention-based system.

* Submitted to Interspeech 2022
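
The abstract reports its gains as relative diarization error rate (DER) reductions. As a minimal sketch of how such a figure is conventionally computed, the snippet below uses the standard relative-change formula. The function name `relative_der_reduction` and the DER values are hypothetical and chosen only for illustration; the listing does not state the absolute DERs of the proposed system or its baselines.

```python
def relative_der_reduction(baseline_der: float, system_der: float) -> float:
    """Return the relative DER reduction of a system over a baseline, in percent.

    Both arguments are absolute DERs expressed in the same unit (e.g. percent).
    """
    if baseline_der <= 0:
        raise ValueError("baseline_der must be positive")
    return 100.0 * (baseline_der - system_der) / baseline_der


# Hypothetical DER values for illustration only; they are NOT from the paper.
hypothetical_baseline_der = 20.0
hypothetical_system_der = 15.0

reduction = relative_der_reduction(hypothetical_baseline_der, hypothetical_system_der)
print(f"Relative DER reduction: {reduction:.2f}%")
```

A relative reduction depends on the baseline it is measured against, so the same absolute DER can correspond to quite different relative figures for different baselines.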
