"speech recognition": models, code, and papers

End-to-End Lip Reading in Romanian with Cross-Lingual Domain Adaptation and Lateral Inhibition

Oct 07, 2023
Emilian-Claudiu Mănescu, Răzvan-Alexandru Smădu, Andrei-Marius Avram, Dumitru-Clementin Cercel, Florin Pop

Challenges and Insights: Exploring 3D Spatial Features and Complex Networks on the MISP Dataset

Oct 05, 2023
Yiwen Shao

Updated Corpora and Benchmarks for Long-Form Speech Recognition

Sep 26, 2023
Jennifer Drexler Fox, Desh Raj, Natalie Delworth, Quinn McNamara, Corey Miller, Migüel Jetté

Factorised Speaker-environment Adaptive Training of Conformer Speech Recognition Systems

Jun 26, 2023
Jiajun Deng, Guinan Li, Xurong Xie, Zengrui Jin, Mingyu Cui, Tianzi Wang, Shujie Hu, Mengzhe Geng, Xunying Liu

ED-CEC: Improving Rare Word Recognition Using ASR Postprocessing Based on Error Detection and Context-Aware Error Correction

Oct 08, 2023
Jiajun He, Zekun Yang, Tomoki Toda

Improving Fairness and Robustness in End-to-End Speech Recognition through Unsupervised Clustering

Jun 06, 2023
Irina-Elena Veliche, Pascale Fung

FunCodec: A Fundamental, Reproducible and Integrable Open-source Toolkit for Neural Speech Codec

Sep 14, 2023
Zhihao Du, Shiliang Zhang, Kai Hu, Siqi Zheng

Exploring Energy-based Language Models with Different Architectures and Training Methods for Speech Recognition

May 26, 2023
Hong Liu, Zhaobiao Lv, Zhijian Ou, Wenbo Zhao, Qing Xiao

Unintended Memorization in Large ASR Models, and How to Mitigate It

Oct 18, 2023
Lun Wang, Om Thakkar, Rajiv Mathews

Foundation Model Assisted Automatic Speech Emotion Recognition: Transcribing, Annotating, and Augmenting

Sep 15, 2023
Tiantian Feng, Shrikanth Narayanan
