Title: Study of Semi-supervised Approaches to Improving English-Mandarin Code-Switching Speech Recognition

Jun 16, 2018

Pengcheng Guo, Haihua Xu, Lei Xie, Eng Siong Chng

Abstract: In this paper, we present our overall efforts to improve the performance of a code-switching speech recognition system on the South East Asian Mandarin-English (SEAME) data, using semi-supervised training methods that range from lexicon learning to acoustic modeling. We first investigate a semi-supervised lexicon learning approach to adapt the canonical lexicon, which is meant to alleviate the heavily accented pronunciation issue in the code-switching conversations of the local area. The learned lexicon yields improved performance. Furthermore, we use semi-supervised training to deal with transcriptions that are highly mismatched between the human transcribers and the ASR system. Specifically, we conduct semi-supervised training by treating the poorly transcribed data as unsupervised data. We find that semi-supervised acoustic modeling leads to improved results. Finally, to make up for the limitations of conventional n-gram language models caused by data sparsity, we perform lattice rescoring using neural network language models and obtain a significant WER reduction.
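To make the data-selection idea in the abstract concrete, here is a minimal sketch of one way to separate well-transcribed utterances from poorly transcribed ones. It is not the authors' procedure. The mismatch measure (word-level edit distance between the human transcript and the ASR hypothesis, normalized by transcript length), the 0.5 threshold, the whitespace tokenization, and the names `word_errors`, `mismatch_rate`, and `split_by_mismatch` are all assumptions made for illustration.

```python
from typing import List, Tuple


def word_errors(ref: List[str], hyp: List[str]) -> int:
    """Levenshtein distance (substitutions + insertions + deletions) over tokens."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def mismatch_rate(ref: List[str], hyp: List[str]) -> float:
    """Edit distance normalized by reference length (the usual WER formula)."""
    if not ref:
        return 0.0 if not hyp else 1.0
    return word_errors(ref, hyp) / len(ref)


def split_by_mismatch(
    utts: List[Tuple[str, str, str]], threshold: float = 0.5
) -> Tuple[List[str], List[str]]:
    """Split (utt_id, human_transcript, asr_hypothesis) triples into two pools.

    Utterances whose mismatch rate exceeds the (hypothetical) threshold are
    treated as unsupervised data; the rest keep their human transcripts.
    """
    supervised, unsupervised = [], []
    for utt_id, human, asr in utts:
        rate = mismatch_rate(human.split(), asr.split())
        (unsupervised if rate > threshold else supervised).append(utt_id)
    return supervised, unsupervised


demo = [
    ("utt1", "我 想 去 shopping", "我 想 去 shopping"),  # transcripts agree
    ("utt2", "okay lah we go", "uh can you repeat"),     # transcripts disagree
]
print(split_by_mismatch(demo))  # (['utt1'], ['utt2'])
```

In a real system the threshold would be tuned on held-out data, and Mandarin text would need a consistent segmentation before scoring, since the rate depends on how tokens are defined.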

* 5 pages, 3 figures, INTERSPEECH 2018
