Speech recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the speaker's voice and language, and transcribing those words accurately as text.

SamaVaani: Auditing and Debiasing Multilingual Clinical ASR for Indian Languages

Jun 25, 2026, via arXiv

Does Translation-Enhanced Speech Encoder Pre-training Affect Speech LLMs?

Jun 24, 2026, via arXiv

Dziri Voicebot: An End-to-End Low-Resource Speech-to-Speech Conversational System for Algerian Dialect

Jun 24, 2026, via arXiv

SpeechEQ: Benchmarking Emotional Intelligence Quotient in Socially Aware Voice Conversational Models

Jun 24, 2026, via arXiv

Autoencoder based optimized SSL representations: Complexity Minimization and improved Dysarthric ASR

Jun 23, 2026, via arXiv

Audio–Image Alignment as a Continued-Pretraining Stage Improves Low-Resource ASR

Jun 23, 2026, via arXiv

Comparative Reasoning: Making an Audio Language Model Better at Comparing Emotions

Jun 23, 2026, via arXiv

Data Scale, Not Latency, Shapes Cross-Lingual Encoder Transfer in Streaming ASR

Jun 23, 2026, via arXiv

VieSpeaker: A Large-Scale Vietnamese Speaker Recognition Dataset Beyond Visual Dependency

Jun 23, 2026, via arXiv

EmotionAI: A Privacy-Preserving Computational Intelligence Pipeline for Speech-Emotion-Grounded Conversational Analysis

Jun 22, 2026, via arXiv