"speech recognition": models, code, and papers

An Effective Automated Speaking Assessment Approach to Mitigating Data Scarcity and Imbalanced Distribution

Apr 12, 2024
Tien-Hong Lo, Fu-An Chao, Tzu-I Wu, Yao-Ting Sung, Berlin Chen

Conformer-1: Robust ASR via Large-Scale Semisupervised Bootstrapping

Apr 12, 2024
Kevin Zhang, Luka Chkhetiani, Francis McCann Ramirez, Yash Khare, Andrea Vanzo, Michael Liang, Sergio Ramirez Martin, Gabriel Oexle, Ruben Bousbib, Taufiquzzaman Peyash, Michael Nguyen, Dillon Pulliam, Domenic Donato

Comparing Apples to Oranges: LLM-powered Multimodal Intention Prediction in an Object Categorization Task

Apr 12, 2024
Hassan Ali, Philipp Allgeuer, Stefan Wermter

Advanced Long-Content Speech Recognition With Factorized Neural Transducer

Mar 20, 2024
Xun Gong, Yu Wu, Jinyu Li, Shujie Liu, Rui Zhao, Xie Chen, Yanmin Qian

Inappropriate Pause Detection In Dysarthric Speech Using Large-Scale Speech Recognition

Feb 29, 2024
Jeehyun Lee, Yerin Choi, Tae-Jin Song, Myoung-Wan Koo

The evaluation of a code-switched Sepedi-English automatic speech recognition system

Mar 11, 2024
Amanda Phaladi, Thipe Modipa

AdaMER-CTC: Connectionist Temporal Classification with Adaptive Maximum Entropy Regularization for Automatic Speech Recognition

Mar 18, 2024
SooHwan Eom, Eunseop Yoon, Hee Suk Yoon, Chanwoo Kim, Mark Hasegawa-Johnson, Chang D. Yoo

Aligning Speech to Languages to Enhance Code-switching Speech Recognition

Mar 09, 2024
Hexin Liu, Xiangyu Zhang, Leibny Paola Garcia, Andy W. H. Khong, Eng Siong Chng, Shinji Watanabe

A New Benchmark for Evaluating Automatic Speech Recognition in the Arabic Call Domain

Mar 07, 2024
Qusai Abo Obaidah, Muhy Eddin Zater, Adnan Jaljuli, Ali Mahboub, Asma Hakouz, Bashar Alfrou, Yazan Estaitia

Skipformer: A Skip-and-Recover Strategy for Efficient Speech Recognition

Mar 13, 2024
Wenjing Zhu, Sining Sun, Changhao Shan, Peng Fan, Qing Yang
