Title: Neural Zero-Inflated Quality Estimation Model For Automatic Speech Recognition System

Oct 03, 2019

Kai Fan, Jiayi Wang, Bo Li, Boxing Chen, Niyu Ge

Figures 1-4 for Neural Zero-Inflated Quality Estimation Model For Automatic Speech Recognition System

Abstract: The performance of automatic speech recognition (ASR) systems is usually evaluated by word error rate (WER), which requires manually transcribed data that are expensive to obtain in real scenarios. In addition, the empirical distribution of WER for most ASR systems tends to put significant mass near zero, making it difficult to model with a single continuous distribution. To address these two issues in ASR quality estimation (QE), we propose a novel neural zero-inflated model to predict the WER of an ASR result without transcripts. We design a neural zero-inflated beta regression on top of a bidirectional transformer language model conditioned on speech features (speech-BERT). We also adopt token-level masked language modeling as the pre-training strategy for speech-BERT, and then fine-tune it with our zero-inflated layer for the mixture of discrete and continuous outputs. Experimental results show that our approach achieves better WER prediction performance, measured by Pearson correlation and MAE, than most existing quality estimation algorithms for ASR or machine translation.
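To make the idea of a mixture of discrete and continuous outputs concrete, here is a minimal sketch of a zero-inflated beta likelihood for a single WER value. It is not taken from the paper: the function names, the mean-precision parameterization (alpha = mu * phi, beta = (1 - mu) * phi), and the assumption that WER lies in [0, 1) are all illustrative choices. In the model described above, the three parameters would presumably be produced by the network rather than passed in by hand.

```python
import math


def beta_log_pdf(y, alpha, beta):
    """Log density of Beta(alpha, beta) at a point y strictly inside (0, 1)."""
    log_norm = math.lgamma(alpha + beta) - math.lgamma(alpha) - math.lgamma(beta)
    return log_norm + (alpha - 1.0) * math.log(y) + (beta - 1.0) * math.log(1.0 - y)


def zib_nll(y, pi_zero, mu, phi):
    """Negative log-likelihood of one WER value under a zero-inflated beta.

    Hypothetical parameter names (assumptions, not the paper's notation):
      pi_zero -- probability that the WER is exactly zero (discrete part)
      mu, phi -- mean and precision of the beta component (continuous part)
    """
    if y == 0.0:
        # Discrete mass at zero: a perfect recognition result.
        return -math.log(pi_zero)
    # Continuous part, weighted by the probability of a non-zero WER.
    alpha, beta = mu * phi, (1.0 - mu) * phi
    return -(math.log(1.0 - pi_zero) + beta_log_pdf(y, alpha, beta))


def zib_mean(pi_zero, mu):
    """Expected WER under the mixture; usable as a point prediction."""
    return (1.0 - pi_zero) * mu
```

Under these assumptions, summing `zib_nll` over a training set gives a loss that handles exact zeros and fractional WER values in one objective, and `zib_mean` yields a single number that can be compared with the reference WER using Pearson correlation or MAE.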
