Title: MLP-ASR: Sequence-length agnostic all-MLP architectures for speech recognition

Feb 17, 2022

Jin Sakuma, Tatsuya Komatsu, Robin Scheibler

Abstract: We propose multi-layer perceptron (MLP)-based architectures suitable for variable-length input. MLP-based architectures, recently proposed for image classification, can only be used for inputs of a fixed, pre-defined size. However, many types of data, such as acoustic signals, are naturally variable in length. We propose three approaches to extend MLP-based architectures for use with sequences of arbitrary length. The first uses a circular convolution applied in the Fourier domain, the second applies a depthwise convolution, and the third relies on a shift operation. We evaluate the proposed architectures on an automatic speech recognition task with the Librispeech and Tedlium2 corpora. The best proposed MLP-based architecture improves WER by 1.0 / 0.9% on the Librispeech dev-clean/dev-other sets, 0.9 / 0.5% on the Librispeech test-clean/test-other sets, and 0.8 / 1.1% on the Tedlium2 dev/test sets, while using 86.4% of the size of the self-attention-based architecture.
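The three token-mixing operations named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the kernel shapes, the "same" zero padding, and the three-way channel split in the shift are all assumptions chosen for illustration. The point is only that none of the three operations fixes the sequence length T in advance.

```python
import numpy as np

def circular_conv_fft(x, kernel):
    """Circular convolution along time, computed in the Fourier domain.

    x: (T, D) float sequence; kernel: (K, D) per-channel filter, assumed K <= T.
    """
    T = x.shape[0]
    X = np.fft.rfft(x, axis=0)
    # Zero-pad the filter to the current sequence length before transforming,
    # so the same filter can be applied to any T >= K.
    H = np.fft.rfft(kernel, n=T, axis=0)
    return np.fft.irfft(X * H, n=T, axis=0)

def depthwise_conv(x, kernel):
    """Per-channel 1D convolution along time with 'same' zero padding (K <= T)."""
    out = np.empty_like(x)
    for d in range(x.shape[1]):
        out[:, d] = np.convolve(x[:, d], kernel[:, d], mode="same")
    return out

def shift_tokens(x, shifts=(-1, 0, 1)):
    """Split channels into groups and shift each group along time.

    Vacated positions are filled with zeros. Assumes T > max(|shift|).
    """
    T, D = x.shape
    out = np.zeros_like(x)
    groups = np.array_split(np.arange(D), len(shifts))
    for idx, s in zip(groups, shifts):
        if s > 0:
            out[s:, idx] = x[:T - s, idx]      # delay by s frames
        elif s < 0:
            out[:T + s, idx] = x[-s:, idx]     # advance by |s| frames
        else:
            out[:, idx] = x[:, idx]
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    kernel = rng.standard_normal((3, 4))
    # The same parameters are reused for sequences of different lengths.
    for T in (5, 8, 13):
        x = rng.standard_normal((T, 4))
        print(T, circular_conv_fft(x, kernel).shape,
              depthwise_conv(x, kernel).shape, shift_tokens(x).shape)
```

In each case the learnable part (a short filter, or no parameters at all for the shift) is independent of T, in contrast to a token-mixing matrix whose size is tied to a fixed number of input tokens.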

* 8 pages, 4 figures
