Title: Fully Learnable Front-End for Multi-Channel Acoustic Modeling using Semi-Supervised Learning

Feb 01, 2020

Sanna Wager, Aparna Khare, Minhua Wu, Kenichi Kumatani, Shiva Sundaram

Abstract: In this work, we investigated the teacher-student training paradigm to train a fully learnable multi-channel acoustic model for far-field automatic speech recognition (ASR). Using a large offline teacher model trained on beamformed audio, we trained a simpler multi-channel student acoustic model for use in the speech recognition system. For the student, both the multi-channel feature extraction layers and the higher classification layers were jointly trained using the logits from the teacher model. In our experiments, compared to a baseline model trained on about 600 hours of transcribed data, a relative word error rate (WER) reduction of about 27.3% was achieved when using an additional 1800 hours of untranscribed data. We also investigated the benefit of pre-training the multi-channel front end to output the beamformed log-mel filter bank energies (LFBE) using an L2 loss. We found that pre-training improved the WER by 10.7% compared to a multi-channel model whose front end was directly initialized with a beamformer and mel-filter bank coefficients. Finally, combining pre-training and teacher-student training produced a WER reduction of 31% compared to our baseline.

* To appear in ICASSP 2020
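
The two training objectives named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the exact loss formulation, so the cross-entropy against teacher posteriors, the `temperature` parameter, and all function names below are assumptions made for illustration. The last helper shows how a relative WER reduction such as the reported 27.3% is computed; the WER values used with it in the usage note are hypothetical.

```python
import numpy as np


def log_softmax(logits):
    """Numerically stable log-softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))


def teacher_student_loss(student_logits, teacher_logits, temperature=1.0):
    """Cross-entropy between teacher and student posteriors, averaged over frames.

    Assumption: the teacher logits are turned into soft targets with a softmax.
    The temperature is a common distillation knob, not something the abstract states.
    Inputs have shape (frames, classes).
    """
    teacher_probs = np.exp(log_softmax(teacher_logits / temperature))
    student_log_probs = log_softmax(student_logits / temperature)
    return float(-(teacher_probs * student_log_probs).sum(axis=-1).mean())


def l2_pretraining_loss(predicted_lfbe, beamformed_lfbe):
    """Mean squared error between front-end outputs and beamformed LFBE targets."""
    return float(np.mean((predicted_lfbe - beamformed_lfbe) ** 2))


def relative_wer_reduction(baseline_wer, new_wer):
    """Relative WER reduction in percent: 100 * (baseline - new) / baseline."""
    return 100.0 * (baseline_wer - new_wer) / baseline_wer
```

For example, with hypothetical WERs of 10.0 for a baseline and 7.27 for an improved model, `relative_wer_reduction(10.0, 7.27)` gives a relative reduction of about 27.3%. Because no transcripts enter `teacher_student_loss`, this kind of objective can be applied to untranscribed audio, which is what makes the setup semi-supervised.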
