speech recognition


Speech recognition is the task of identifying words spoken aloud, analyzing the voice and language, and accurately transcribing the words.

CLiFT-ASR: A Cross-Lingual Fine-Tuning Framework for Low-Resource Taiwanese Hokkien Speech Recognition

Add code
Nov 10, 2025
Viaarxiv icon

MERaLiON-SER: Robust Speech Emotion Recognition Model for English and SEA Languages

Add code
Nov 12, 2025
Viaarxiv icon

Speech Emotion Recognition with Phonation Excitation Information and Articulatory Kinematics

Add code
Nov 11, 2025
Viaarxiv icon

Ground Truth Generation for Multilingual Historical NLP using LLMs

Add code
Nov 18, 2025
Figure 1 for Ground Truth Generation for Multilingual Historical NLP using LLMs
Figure 2 for Ground Truth Generation for Multilingual Historical NLP using LLMs
Figure 3 for Ground Truth Generation for Multilingual Historical NLP using LLMs
Figure 4 for Ground Truth Generation for Multilingual Historical NLP using LLMs
Viaarxiv icon

Tighter Truncated Rectangular Prism Approximation for RNN Robustness Verification

Add code
Nov 12, 2025
Viaarxiv icon

End-to-end Contrastive Language-Speech Pretraining Model For Long-form Spoken Question Answering

Add code
Nov 12, 2025
Viaarxiv icon

WST: Weakly Supervised Transducer for Automatic Speech Recognition

Add code
Nov 06, 2025
Viaarxiv icon

Quantizing Whisper-small: How design choices affect ASR performance

Add code
Nov 11, 2025
Viaarxiv icon

Surgical Agent Orchestration Platform for Voice-directed Patient Data Interaction

Add code
Nov 11, 2025
Viaarxiv icon

E2E-VGuard: Adversarial Prevention for Production LLM-based End-To-End Speech Synthesis

Add code
Nov 10, 2025
Viaarxiv icon