Speech Recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the voice and language, then transcribing them accurately.

ProSarc: Prosody-Aware Sarcasm Recognition Framework via Temporal Prosodic Incongruity

Jun 04, 2026

Towards Human-Like Interactive Speech Recognition With Agentic Correction and Semantic Evaluation

May 28, 2026

SpeakerCard-1M: An Evidence-Grounded Speaker Card Corpus for In-the-Wild Speaker Verification

Jun 03, 2026

Your Multimodal Speech Model Says I Have a Face for Radio

May 28, 2026

Scaling Conversational Hungarian ASR: The BEA-Dialogue+ Corpus

May 29, 2026

Syllabic-Structure Decoder for Automatic Speech Recognition in Vietnamese

May 27, 2026

Data-Efficient On-Policy Distillation for Automatic Speech Recognition

May 27, 2026

Diffusion Large Language Models for Visual Speech Recognition

May 27, 2026

TokTalk: Expressive Real-time Facial Animation from Audio-LLM Tokens

May 29, 2026

Deep Binarized Photonic Reservoir Computing for Ultrafast Multimedia Signal Processing

May 28, 2026