"speech recognition": models, code, and papers

Decoder-only Architecture for Speech Recognition with CTC Prompts and Text Data Augmentation

Sep 16, 2023
Emiru Tsunoo, Hayato Futami, Yosuke Kashiwagi, Siddhant Arora, Shinji Watanabe

Via arXiv

Soft Random Sampling: A Theoretical and Empirical Analysis

Nov 24, 2023
Xiaodong Cui, Ashish Mittal, Songtao Lu, Wei Zhang, George Saon, Brian Kingsbury

Via arXiv

A Theory of Unsupervised Speech Recognition

Jun 09, 2023
Liming Wang, Mark Hasegawa-Johnson, Chang D. Yoo

Via arXiv

Replay to Remember: Continual Layer-Specific Fine-tuning for German Speech Recognition

Jul 14, 2023
Theresa Pekarek Rosin, Stefan Wermter

Via arXiv

Federated Representation Learning for Automatic Speech Recognition

Aug 07, 2023
Guruprasad V Ramesh, Gopinath Chennupati, Milind Rao, Anit Kumar Sahu, Ariya Rastrow, Jasha Droppo

Via arXiv

Do VSR Models Generalize Beyond LRS3?

Nov 23, 2023
Yasser Abdelaziz Dahou Djilali, Sanath Narayan, Eustache Le Bihan, Haithem Boussaid, Ebtessam Almazrouei, Merouane Debbah

Via arXiv

Adapting Large Language Model with Speech for Fully Formatted End-to-End Speech Recognition

Aug 03, 2023
Shaoshi Ling, Yuxuan Hu, Shuangbei Qian, Guoli Ye, Yao Qian, Yifan Gong, Ed Lin, Michael Zeng

Via arXiv

Multilingual self-supervised speech representations improve the speech recognition of low-resource African languages with codeswitching

Nov 25, 2023
Tolúlopé Ògúnrèmí, Christopher D. Manning, Dan Jurafsky

Via arXiv

Bring the Noise: Introducing Noise Robustness to Pretrained Automatic Speech Recognition

Sep 05, 2023
Patrick Eickhoff, Matthias Möller, Theresa Pekarek Rosin, Johannes Twiefel, Stefan Wermter

Via arXiv

Whisper-MCE: Whisper Model Finetuned for Better Performance with Mixed Languages

Oct 27, 2023
Peng Xie, XingYuan Liu, ZiWei Chen, Kani Chen, Yang Wang

Via arXiv