"speech recognition": models, code, and papers

End-to-end Transfer Learning for Speaker-independent Cross-language Speech Emotion Recognition

Nov 22, 2023
Duowei Tang, Peter Kuppens, Luc Geurts, Toon van Waterschoot

One model to rule them all? Towards End-to-End Joint Speaker Diarization and Speech Recognition

Oct 02, 2023
Samuele Cornell, Jee-weon Jung, Shinji Watanabe, Stefano Squartini

End-to-End Speech-to-Text Translation: A Survey

Dec 02, 2023
Nivedita Sethiya, Chandresh Kumar Maurya

Weak Alignment Supervision from Hybrid Model Improves End-to-end ASR

Nov 30, 2023
Jintao Jiang, Yingbo Gao, Zoltan Tuske

AudioFool: Fast, Universal and synchronization-free Cross-Domain Attack on Speech Recognition

Sep 20, 2023
Mohamad Fakih, Rouwaida Kanj, Fadi Kurdahi, Mohammed E. Fouda

Active Learning Based Fine-Tuning Framework for Speech Emotion Recognition

Sep 30, 2023
Dongyuan Li, Yusong Wang, Kotaro Funakoshi, Manabu Okumura

Sparsely Shared LoRA on Whisper for Child Speech Recognition

Sep 21, 2023
Wei Liu, Ying Qin, Zhiyuan Peng, Tan Lee

End-to-end Joint Rich and Normalized ASR with a limited amount of rich training data

Nov 29, 2023
Can Cui, Imran Ahamad Sheikh, Mostafa Sadeghi, Emmanuel Vincent

A Quantitative Approach to Understand Self-Supervised Models as Cross-lingual Feature Extractors

Nov 27, 2023
Shuyue Stella Li, Beining Xu, Xiangyu Zhang, Hexin Liu, Wenhan Chao, Leibny Paola Garcia

Phonetic-aware speaker embedding for far-field speaker verification

Nov 27, 2023
Zezhong Jin, Youzhi Tu, Man-Wai Mak
