Title: Automatic Disfluency Detection from Untranscribed Speech

Nov 01, 2023

Amrit Romana, Kazuhito Koishida, Emily Mower Provost

Abstract: Speech disfluencies, such as filled pauses or repetitions, are disruptions in the typical flow of speech. Stuttering is a speech disorder characterized by a high rate of disfluencies, but all individuals speak with some disfluencies, and the rates of disfluencies may be increased by factors such as cognitive load. Clinically, automatic disfluency detection may help in treatment planning for individuals who stutter. Outside of the clinic, automatic disfluency detection may serve as a pre-processing step to improve natural language understanding in downstream applications. With this wide range of applications in mind, we investigate language, acoustic, and multimodal methods for frame-level automatic disfluency detection and categorization. Each of these methods relies on audio as an input. First, we evaluate several automatic speech recognition (ASR) systems in terms of their ability to transcribe disfluencies, measured using disfluency error rates. We then use these ASR transcripts as input to a language-based disfluency detection model. We find that disfluency detection performance is largely limited by the quality of transcripts and alignments. We find that an acoustic-based approach that does not require transcription as an intermediate step outperforms the ASR language approach. Finally, we present multimodal architectures, which we find improve disfluency detection performance over the unimodal approaches. Ultimately, this work introduces novel approaches for automatic frame-level disfluency detection and categorization. In the long term, this will help researchers incorporate automatic disfluency detection into a range of applications.
