Title: End-to-end spoken language understanding using transformer networks and self-supervised pre-trained features

Nov 16, 2020

Edmilson Morais, Hong-Kwang J. Kuo, Samuel Thomas, Zoltan Tuske, Brian Kingsbury

Abstract: Transformer networks and self-supervised pre-training have consistently delivered state-of-the-art results in natural language processing (NLP); however, their merits in spoken language understanding (SLU) still need further investigation. In this paper, we introduce a modular End-to-End (E2E) SLU architecture, based on transformer networks, that allows the use of self-supervised pre-trained acoustic features, pre-trained model initialization, and multi-task training. We perform several SLU experiments on the ATIS dataset, predicting intent and entity labels/values. These experiments investigate how pre-trained model initialization and multi-task training interact with either traditional filterbank features or self-supervised pre-trained acoustic features. The results show not only that self-supervised pre-trained acoustic features outperform filterbank features in almost all the experiments, but also that, when combined with multi-task training, they almost eliminate the need for pre-trained model initialization.

* 5 pages, 3 tables and 1 figure
