Speech Recognition


Speech recognition, also called automatic speech recognition (ASR), is the task of identifying the words in spoken audio and transcribing them accurately as text.

Teaching the Teachers: Boosting unsupervised domain adaptation in speech recognition by ensemble update

Apr 13, 2026 (via arXiv)

Interactive ASR: Towards Human-Like Interaction and Semantic Coherence Evaluation for Agentic Speech Recognition

Apr 14, 2026 (via arXiv)

BlasBench: An Open Benchmark for Irish Speech Recognition

Apr 12, 2026 (via arXiv)

Pushing the Limits of On-Device Streaming ASR: A Compact, High-Accuracy English Model for Low-Latency Inference

Apr 16, 2026 (via arXiv)

Empowering Video Translation using Multimodal Large Language Models

Apr 13, 2026 (via arXiv)

Cross-Cultural Bias in Mel-Scale Representations: Evidence and Alternatives from Speech and Music

Apr 12, 2026 (via arXiv)

Data Selection Effects on Self-Supervised Learning of Audio Representations for French Audiovisual Broadcasts

Apr 10, 2026 (via arXiv)

Semantic-Emotional Resonance Embedding: A Semi-Supervised Paradigm for Cross-Lingual Speech Emotion Recognition

Apr 08, 2026 (via arXiv)

Few-Shot Contrastive Adaptation for Audio Abuse Detection in Low-Resource Indic Languages

Apr 10, 2026 (via arXiv)

XR-CareerAssist: An Immersive Platform for Personalised Career Guidance Leveraging Extended Reality and Multimodal AI

Apr 08, 2026 (via arXiv)