Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Title:A review of on-device fully neural end-to-end automatic speech recognition algorithms

Dec 19, 2020

Chanwoo Kim, Dhananjaya Gowda, Dongsoo Lee, Jiyeon Kim, Ankur Kumar, Sungsoo Kim, Abhinav Garg, Changwoo Han

Figure 1 for A review of on-device fully neural end-to-end automatic speech recognition algorithms

Figure 2 for A review of on-device fully neural end-to-end automatic speech recognition algorithms

Figure 3 for A review of on-device fully neural end-to-end automatic speech recognition algorithms

Figure 4 for A review of on-device fully neural end-to-end automatic speech recognition algorithms

Share this with someone who'll enjoy it:

Abstract:In this paper, we review various end-to-end automatic speech recognition algorithms and their optimization techniques for on-device applications. Conventional speech recognition systems comprise a large number of discrete components such as an acoustic model, a language model, a pronunciation model, a text-normalizer, an inverse-text normalizer, a decoder based on a Weighted Finite State Transducer (WFST), and so on. To obtain sufficiently high speech recognition accuracy with such conventional speech recognition systems, a very large language model (up to 100 GB) is usually needed. Hence, the corresponding WFST size becomes enormous, which prohibits their on-device implementation. Recently, fully neural network end-to-end speech recognition algorithms have been proposed. Examples include speech recognition systems based on Connectionist Temporal Classification (CTC), Recurrent Neural Network Transducer (RNN-T), Attention-based Encoder-Decoder models (AED), Monotonic Chunk-wise Attention (MoChA), transformer-based speech recognition systems, and so on. These fully neural network-based systems require much smaller memory footprints compared to conventional algorithms, therefore their on-device implementation has become feasible. In this paper, we review such end-to-end speech recognition models. We extensively discuss their structures, performance, and advantages compared to conventional algorithms.

* Accepted as an invited paper to the Asilomar Conference on Signals, Systems, and Computers 2020

View paper on

Share this with someone who'll enjoy it:

Title:A review of on-device fully neural end-to-end automatic speech recognition algorithms

Paper and Code