"speech recognition": models, code, and papers

A Novel Self-training Approach for Low-resource Speech Recognition

Aug 10, 2023
Satwinder Singh, Feng Hou, Ruili Wang

Via arXiv

ASTER: Automatic Speech Recognition System Accessibility Testing for Stutterers

Aug 30, 2023
Yi Liu, Yuekang Li, Gelei Deng, Felix Juefei-Xu, Yao Du, Cen Zhang, Chengwei Liu, Yeting Li, Lei Ma, Yang Liu

Via arXiv

Integrating Emotion Recognition with Speech Recognition and Speaker Diarisation for Conversations

Aug 14, 2023
Wen Wu, Chao Zhang, Philip C. Woodland

Via arXiv

ML-LMCL: Mutual Learning and Large-Margin Contrastive Learning for Improving ASR Robustness in Spoken Language Understanding

Nov 19, 2023
Xuxin Cheng, Bowen Cao, Qichen Ye, Zhihong Zhu, Hongxiang Li, Yuexian Zou

Via arXiv

Lip2Vec: Efficient and Robust Visual Speech Recognition via Latent-to-Latent Visual to Audio Representation Mapping

Aug 11, 2023
Yasser Abdelaziz Dahou Djilali, Sanath Narayan, Haithem Boussaid, Ebtessam Almazrouei, Merouane Debbah

Via arXiv

BUT CHiME-7 system description

Oct 18, 2023
Martin Karafiát, Karel Veselý, Igor Szöke, Ladislav Mošner, Karel Beneš, Marcin Witkowski, Germán Barchi, Leonardo Pepino

Via arXiv

Zero-shot audio captioning with audio-language model guidance and audio context keywords

Nov 14, 2023
Leonard Salewski, Stefan Fauth, A. Sophia Koepke, Zeynep Akata

Via arXiv

Retrieve and Copy: Scaling ASR Personalization to Large Catalogs

Nov 14, 2023
Sai Muralidhar Jayanthi, Devang Kulshreshtha, Saket Dingliwal, Srikanth Ronanki, Sravan Bodapati

Via arXiv

FunASR: A Fundamental End-to-End Speech Recognition Toolkit

May 18, 2023
Zhifu Gao, Zerui Li, Jiaming Wang, Haoneng Luo, Xian Shi, Mengzhe Chen, Yabin Li, Lingyun Zuo, Zhihao Du, Zhangyu Xiao, Shiliang Zhang

Via arXiv

Conformer-based Target-Speaker Automatic Speech Recognition for Single-Channel Audio

Aug 09, 2023
Yang Zhang, Krishna C. Puvvada, Vitaly Lavrukhin, Boris Ginsburg

Via arXiv