Letter-Based Speech Recognition with Gated ConvNets

Dec 22, 2017
Vitaliy Liptchinsky, Gabriel Synnaeve, Ronan Collobert



In this paper we introduce a new speech recognition system, leveraging a simple letter-based ConvNet acoustic model. The acoustic model requires only audio transcription for training: no alignment annotations, nor any forced alignment step, is needed. At inference, our decoder takes only a word list and a language model, and is fed with letter scores from the acoustic model; no phonetic word lexicon is needed. Key ingredients for the acoustic model are Gated Linear Units and high dropout. We show near state-of-the-art results in word error rate on the LibriSpeech corpus using log-mel filterbanks, both on the "clean" and "other" configurations.

* 13 pages. arXiv admin note: text overlap with arXiv:1609.03193
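
The abstract names Gated Linear Units as a key ingredient of the acoustic model but does not spell out the operation. As a rough illustration only, the sketch below uses the common definition of a GLU: the feature vector is split into two halves, and the first half is multiplied elementwise by the sigmoid of the second half. The function names `sigmoid` and `glu` and the plain-list representation are assumptions made for this example; the paper's actual layer shapes, convolution settings, and dropout rates are not given in this listing and are not reproduced here.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def glu(features: list[float]) -> list[float]:
    """Gated Linear Unit over one feature vector (illustrative helper).

    The first half of `features` carries the content and the second half
    acts as the gate, so the output has half the input length.
    """
    if len(features) % 2 != 0:
        raise ValueError("GLU input length must be even")
    half = len(features) // 2
    content, gate = features[:half], features[half:]
    return [c * sigmoid(g) for c, g in zip(content, gate)]


# A gate of 0 passes exactly half of the content value through.
halved = glu([2.0, -4.0, 0.0, 0.0])
```

In a convolutional model the split would typically run along the channel dimension of a layer's output, so a layer feeding a GLU produces twice as many channels as the GLU emits.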

