"speech recognition": models, code, and papers

KinSPEAK: Improving speech recognition for Kinyarwanda via semi-supervised learning methods

Aug 23, 2023
Antoine Nzeyimana

Generative Speech Recognition Error Correction with Large Language Models and Task-Activating Prompting

Oct 10, 2023
Chao-Han Huck Yang, Yile Gu, Yi-Chieh Liu, Shalini Ghosh, Ivan Bulyko, Andreas Stolcke

Unimodal Aggregation for CTC-based Speech Recognition

Sep 15, 2023
Ying Fang, Xiaofei Li

Extending Whisper with prompt tuning to target-speaker ASR

Dec 13, 2023
Hao Ma, Zhiyuan Peng, Mingjie Shao, Jing Li, Ju Liu

Improved Long-Form Speech Recognition by Jointly Modeling the Primary and Non-primary Speakers

Dec 18, 2023
Guru Prakash Arumugam, Shuo-yiin Chang, Tara N. Sainath, Rohit Prabhavalkar, Quan Wang, Shaan Bijwadia

Generative linguistic representation for spoken language identification

Dec 18, 2023
Peng Shen, Xuguang Lu, Hisashi Kawai

Exploring Speech Recognition, Translation, and Understanding with Discrete Speech Units: A Comparative Study

Sep 27, 2023
Xuankai Chang, Brian Yan, Kwanghee Choi, Jeeweon Jung, Yichen Lu, Soumi Maiti, Roshan Sharma, Jiatong Shi, Jinchuan Tian, Shinji Watanabe, Yuya Fujita, Takashi Maekaku, Pengcheng Guo, Yao-Fei Cheng, Pavel Denisov, Kohei Saijo, Hsiu-Hsuan Wang

Towards Probing Contact Center Large Language Models

Dec 26, 2023
Varun Nathan, Ayush Kumar, Digvijay Ingle, Jithendra Vepa

Generative error correction for code-switching speech recognition using large language models

Oct 17, 2023
Chen Chen, Yuchen Hu, Chao-Han Huck Yang, Hexin Liu, Sabato Marco Siniscalchi, Eng Siong Chng

Kid-Whisper: Towards Bridging the Performance Gap in Automatic Speech Recognition for Children VS. Adults

Sep 18, 2023
Ahmed Adel Attia, Jing Liu, Wei Ai, Dorottya Demszky, Carol Espy-Wilson
