"speech recognition": models, code, and papers

Training dynamic models using early exits for automatic speech recognition on resource-constrained devices

Sep 18, 2023
George August Wright, Umberto Cappellazzo, Salah Zaiem, Desh Raj, Lucas Ondel Yang, Daniele Falavigna, Alessio Brutti

Soft Random Sampling: A Theoretical and Empirical Analysis

Nov 21, 2023
Xiaodong Cui, Ashish Mittal, Songtao Lu, Wei Zhang, George Saon, Brian Kingsbury

HyPoradise: An Open Baseline for Generative Speech Recognition with Large Language Models

Sep 27, 2023
Chen Chen, Yuchen Hu, Chao-Han Huck Yang, Sabato Marco Siniscalchi, Pin-Yu Chen, Eng Siong Chng

Improving Large-scale Deep Biasing with Phoneme Features and Text-only Data in Streaming Transducer

Nov 15, 2023
Jin Qiu, Lu Huang, Boyu Li, Jun Zhang, Lu Lu, Zejun Ma

Convoifilter: A case study of doing cocktail party speech recognition

Aug 22, 2023
Thai-Binh Nguyen, Alexander Waibel

Whisper-MCE: Whisper Model Finetuned for Better Performance with Mixed Languages

Oct 27, 2023
Peng Xie, XingYuan Liu, ZiWei Chen, Kani Chen, Yang Wang

Investigating Weight-Perturbed Deep Neural Networks With Application in Iris Presentation Attack Detection

Nov 22, 2023
Renu Sharma, Redwan Sony, Arun Ross

D4AM: A General Denoising Framework for Downstream Acoustic Models

Nov 28, 2023
Chi-Chang Lee, Yu Tsao, Hsin-Min Wang, Chu-Song Chen

Discrete Audio Representation as an Alternative to Mel-Spectrograms for Speaker and Speech Recognition

Sep 19, 2023
Krishna C. Puvvada, Nithin Rao Koluguri, Kunal Dhawan, Jagadeesh Balam, Boris Ginsburg

Indonesian Automatic Speech Recognition with XLSR-53

Aug 20, 2023
Panji Arisaputra, Amalia Zahra
