Alert button

"speech recognition": models, code, and papers
Alert button

SECap: Speech Emotion Captioning with Large Language Model

Dec 23, 2023
Yaoxun Xu, Hangting Chen, Jianwei Yu, Qiaochu Huang, Zhiyong Wu, Shixiong Zhang, Guangzhi Li, Yi Luo, Rongzhi Gu

Viaarxiv icon

Kid-Whisper: Towards Bridging the Performance Gap in Automatic Speech Recognition for Children VS. Adults

Sep 12, 2023
Ahmed Adel Attia, Jing Liu, Wei Ai, Dorottya Demszky, Carol Espy-Wilson

Figure 1 for Kid-Whisper: Towards Bridging the Performance Gap in Automatic Speech Recognition for Children VS. Adults
Figure 2 for Kid-Whisper: Towards Bridging the Performance Gap in Automatic Speech Recognition for Children VS. Adults
Figure 3 for Kid-Whisper: Towards Bridging the Performance Gap in Automatic Speech Recognition for Children VS. Adults
Figure 4 for Kid-Whisper: Towards Bridging the Performance Gap in Automatic Speech Recognition for Children VS. Adults
Viaarxiv icon

Unimodal Aggregation for CTC-based Speech Recognition

Sep 15, 2023
Ying Fang, Xiaofei Li

Figure 1 for Unimodal Aggregation for CTC-based Speech Recognition
Figure 2 for Unimodal Aggregation for CTC-based Speech Recognition
Figure 3 for Unimodal Aggregation for CTC-based Speech Recognition
Figure 4 for Unimodal Aggregation for CTC-based Speech Recognition
Viaarxiv icon

BA-MoE: Boundary-Aware Mixture-of-Experts Adapter for Code-Switching Speech Recognition

Oct 08, 2023
Peikun Chen, Fan Yu, Yuhao Lian, Hongfei Xue, Xucheng Wan, Naijun Zheng, Huan Zhou, Lei Xie

Figure 1 for BA-MoE: Boundary-Aware Mixture-of-Experts Adapter for Code-Switching Speech Recognition
Figure 2 for BA-MoE: Boundary-Aware Mixture-of-Experts Adapter for Code-Switching Speech Recognition
Figure 3 for BA-MoE: Boundary-Aware Mixture-of-Experts Adapter for Code-Switching Speech Recognition
Figure 4 for BA-MoE: Boundary-Aware Mixture-of-Experts Adapter for Code-Switching Speech Recognition
Viaarxiv icon

LIP-RTVE: An Audiovisual Database for Continuous Spanish in the Wild

Nov 21, 2023
David Gimeno-Gómez, Carlos-D. Martínez-Hinarejos

Viaarxiv icon

No Pitch Left Behind: Addressing Gender Unbalance in Automatic Speech Recognition through Pitch Manipulation

Oct 10, 2023
Dennis Fucci, Marco Gaido, Matteo Negri, Mauro Cettolo, Luisa Bentivogli

Figure 1 for No Pitch Left Behind: Addressing Gender Unbalance in Automatic Speech Recognition through Pitch Manipulation
Figure 2 for No Pitch Left Behind: Addressing Gender Unbalance in Automatic Speech Recognition through Pitch Manipulation
Figure 3 for No Pitch Left Behind: Addressing Gender Unbalance in Automatic Speech Recognition through Pitch Manipulation
Figure 4 for No Pitch Left Behind: Addressing Gender Unbalance in Automatic Speech Recognition through Pitch Manipulation
Viaarxiv icon

On the Relevance of Phoneme Duration Variability of Synthesized Training Data for Automatic Speech Recognition

Oct 12, 2023
Nick Rossenbach, Benedikt Hilmes, Ralf Schlüter

Figure 1 for On the Relevance of Phoneme Duration Variability of Synthesized Training Data for Automatic Speech Recognition
Figure 2 for On the Relevance of Phoneme Duration Variability of Synthesized Training Data for Automatic Speech Recognition
Figure 3 for On the Relevance of Phoneme Duration Variability of Synthesized Training Data for Automatic Speech Recognition
Figure 4 for On the Relevance of Phoneme Duration Variability of Synthesized Training Data for Automatic Speech Recognition
Viaarxiv icon

KinSPEAK: Improving speech recognition for Kinyarwanda via semi-supervised learning methods

Aug 23, 2023
Antoine Nzeyimana

Figure 1 for KinSPEAK: Improving speech recognition for Kinyarwanda via semi-supervised learning methods
Figure 2 for KinSPEAK: Improving speech recognition for Kinyarwanda via semi-supervised learning methods
Figure 3 for KinSPEAK: Improving speech recognition for Kinyarwanda via semi-supervised learning methods
Figure 4 for KinSPEAK: Improving speech recognition for Kinyarwanda via semi-supervised learning methods
Viaarxiv icon

Creating Spoken Dialog Systems in Ultra-Low Resourced Settings

Dec 11, 2023
Moayad Elamin, Muhammad Omer, Yonas Chanie, Henslaac Ndlovu

Viaarxiv icon