SpeechBERT: Cross-Modal Pre-trained Language Model for End-to-end Spoken Question Answering

Oct 25, 2019
Yung-Sung Chuang, Chi-Liang Liu, Hung-Yi Lee


While end-to-end models for spoken language understanding tasks have been explored recently, there is still no end-to-end model for spoken question answering (SQA), a task on which speech recognition errors can have a catastrophic effect. Meanwhile, pre-trained language models such as BERT have performed well on text question answering. To bring this advantage of pre-trained language models to spoken question answering, we propose SpeechBERT, a cross-modal, transformer-based pre-trained language model. As the first exploration of end-to-end SQA models, our results matched the performance of conventional approaches that are fed ASR output text and fell only slightly behind pre-trained language models, showing the potential of end-to-end SQA models.

* Submitted to ICASSP 2020 

