Title: Multi-Task Self-Supervised Learning for Disfluency Detection

Aug 15, 2019

Shaolei Wang, Wanxiang Che, Qi Liu, Pengda Qin, Ting Liu, William Yang Wang

Abstract: Most existing approaches to disfluency detection rely heavily on human-annotated data, which is expensive to obtain in practice. To tackle the training data bottleneck, we investigate methods for combining multiple self-supervised tasks, i.e., supervised tasks where data can be collected without manual labeling. First, we construct large-scale pseudo training data by randomly adding or deleting words from unlabeled news data, and propose two self-supervised pre-training tasks: (i) a tagging task to detect the added noisy words, and (ii) a sentence classification task to distinguish original sentences from grammatically incorrect sentences. We then combine these two tasks to jointly train a network. The pre-trained network is then fine-tuned using human-annotated disfluency detection training data. Experimental results on the commonly used English Switchboard test set show that our approach can achieve competitive performance compared to previous systems (trained using the full dataset) while using less than 1% (1000 sentences) of the training data. Our method trained on the full dataset significantly outperforms previous methods, reducing the error by 21% on English Switchboard.
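The pseudo-data construction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`add_words`, `delete_words`, `build_examples`), the noise probability `p`, the toy vocabulary, and the rule that forces at least one edit per sentence are all assumptions made for illustration. The sketch only shows how one unlabeled sentence could yield a tagging example (inserted words tagged 1) and sentence-classification examples (original labeled 1, corrupted labeled 0).

```python
import random

def add_words(tokens, vocab, rng, p=0.15):
    """Insert random vocabulary words; tag 1 marks an inserted (noisy) word."""
    noisy, tags = [], []
    for tok in tokens:
        if rng.random() < p:  # hypothetical noise rate, not from the paper
            noisy.append(rng.choice(vocab))
            tags.append(1)
        noisy.append(tok)
        tags.append(0)
    if 1 not in tags:
        # Assumption: force at least one insertion so the sentence is corrupted.
        pos = rng.randrange(len(noisy) + 1)
        noisy.insert(pos, rng.choice(vocab))
        tags.insert(pos, 1)
    return noisy, tags

def delete_words(tokens, rng, p=0.15):
    """Randomly drop words; drops at least one when the sentence has 2+ words."""
    kept = [t for t in tokens if rng.random() >= p]
    if len(kept) == len(tokens) and len(tokens) > 1:
        del kept[rng.randrange(len(kept))]
    return kept

def build_examples(tokens, vocab, rng):
    """Return one tagging example and three sentence-classification examples."""
    noisy, tags = add_words(tokens, vocab, rng)
    tagging = {"tokens": noisy, "tags": tags}
    classification = [
        (list(tokens), 1),               # original sentence
        (noisy, 0),                      # corrupted by insertion
        (delete_words(tokens, rng), 0),  # corrupted by deletion
    ]
    return tagging, classification

rng = random.Random(0)
tokens = "the market closed higher on friday".split()
vocab = ["apple", "run", "blue"]  # toy vocabulary for illustration only
tagging, classification = build_examples(tokens, vocab, rng)
print(tagging)
print(classification)
```

Random deletion does not guarantee an ungrammatical sentence, so labels produced this way are noisy; the sketch treats every corrupted sentence as a negative example for simplicity.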
