Towards End-to-end Unsupervised Speech Recognition

Apr 05, 2022
Alexander H. Liu, Wei-Ning Hsu, Michael Auli, Alexei Baevski


Unsupervised speech recognition has shown great potential to make Automatic Speech Recognition (ASR) systems accessible to every language. However, existing methods still rely heavily on hand-crafted pre-processing. Following the trend of making supervised speech recognition end-to-end, we introduce wav2vec-U 2.0, which does away with all audio-side pre-processing and improves accuracy through a better architecture. In addition, we introduce an auxiliary self-supervised objective that ties model predictions back to the input. Experiments show that wav2vec-U 2.0 improves unsupervised recognition results across different languages while being conceptually simpler.

* Preprint  
