Speech recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the speaker's voice and language, and transcribing those words accurately as text.

TASU2: Controllable CTC Simulation for Alignment and Low-Resource Adaptation of Speech LLMs

Apr 09, 2026
Via arXiv

Benchmarking Multilingual Speech Models on Pashto: Zero-Shot ASR, Script Failure, and Cross-Domain Evaluation

Apr 06, 2026
Via arXiv

Measuring Robustness of Speech Recognition from MEG Signals Under Distribution Shift

Apr 05, 2026
Via arXiv

Speaker-Reasoner: Scaling Interaction Turns and Reasoning Patterns for Timestamped Speaker-Attributed ASR

Apr 03, 2026
Via arXiv

Development and multi-center evaluation of domain-adapted speech recognition for human-AI teaming in real-world gastrointestinal endoscopy

Apr 02, 2026
Via arXiv

Human-Guided Reasoning with Large Language Models for Vietnamese Speech Emotion Recognition

Apr 02, 2026
Via arXiv

CV-18 NER: Augmented Common Voice for Named Entity Recognition from Arabic Speech

Apr 02, 2026
Via arXiv

Adapting Text LLMs to Speech via Multimodal Depth Up-Scaling

Apr 01, 2026
Via arXiv

Speech LLMs are Contextual Reasoning Transcribers

Apr 01, 2026
Via arXiv

Advancing LLM-based phoneme-to-grapheme for multilingual speech recognition

Mar 31, 2026
Via arXiv