Title: Augmentation adversarial training for unsupervised speaker recognition

Aug 09, 2020

Jaesung Huh, Hee Soo Heo, Jingu Kang, Shinji Watanabe, Joon Son Chung

Abstract: The goal of this work is to train robust speaker recognition models without speaker labels. Recent works on unsupervised speaker representations are based on contrastive learning, which encourages within-utterance embeddings to be similar and across-utterance embeddings to be dissimilar. However, since the within-utterance segments share the same acoustic characteristics, it is difficult to separate the speaker information from the channel information. To this end, we propose an augmentation adversarial training strategy that trains the network to be discriminative for the speaker information while remaining invariant to the augmentation applied. Since the augmentation simulates the acoustic characteristics, training the network to be invariant to augmentation also encourages the network to be invariant to the channel information in general. Extensive experiments on the VoxCeleb and VOiCES datasets show significant improvements over previous works using self-supervision, and the performance of our self-supervised models far exceeds that of humans.
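The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no loss formulas, so the softmax-over-cosine-similarity contrastive loss and the sign-flipped adversarial term below are generic stand-ins, and every name (`cosine`, `contrastive_loss`, `embedding_objective`, `scale`, `weight`) is hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two embedding vectors (plain lists of floats)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def cross_entropy(logits, target):
    """Negative log-softmax of the target logit, computed stably."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[target]


def contrastive_loss(anchors, positives, scale=10.0):
    """Assumed contrastive objective: anchors[i] and positives[i] are embeddings
    of two segments cut from utterance i. Each anchor should be closer to its
    own positive (within-utterance) than to positives of other utterances."""
    n = len(anchors)
    total = 0.0
    for i in range(n):
        logits = [scale * cosine(anchors[i], positives[j]) for j in range(n)]
        total += cross_entropy(logits, i)
    return total / n


def embedding_objective(speaker_loss, augmentation_loss, weight=1.0):
    """Assumed adversarial combination: the embedding network minimises the
    speaker loss while maximising the loss of a discriminator that tries to
    recognise the augmentation, hence the minus sign."""
    return speaker_loss - weight * augmentation_loss


# Toy embeddings for two utterances, two segments each.
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[1.0, 0.0], [0.0, 1.0]]
loss = contrastive_loss(anchors, positives)
```

In this toy case the loss is close to zero because each anchor matches its own positive; swapping the two positives makes it large. In a real system the discriminator would be a trained classifier over augmentation types, and the sign flip is often realised with a gradient reversal layer, which is again an assumption here rather than something the abstract states.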
