Speech recognition


Speech recognition is the task of identifying words in spoken audio and transcribing them accurately as text, which requires analyzing both the speaker's voice and the language being spoken.

Towards Human-Like Interactive Speech Recognition With Agentic Correction and Semantic Evaluation

May 28, 2026
Via arXiv

Deep Binarized Photonic Reservoir Computing for Ultrafast Multimedia Signal Processing

May 28, 2026
Via arXiv

Decoding Strategies for Diffusion-Based ASR: A Systematic Evaluation of Confidence-Based Thresholding

May 28, 2026
Via arXiv

MMTM: Tri-Modal Topic Modeling for Long-Form Video via Similarity-Gated Fusion

May 28, 2026
Via arXiv

Decentralized LLM-Driven Coordination of Acoustic Robots for Contactless Object Manipulation

May 28, 2026
Via arXiv

HoliTok: A Continuous Holistic Tokenization with Robust Dual Capabilities of Speech Generation and Understanding

May 28, 2026
Via arXiv

Syllabic-Structure Decoder for Automatic Speech Recognition in Vietnamese

May 27, 2026
Via arXiv

Data-Efficient On-Policy Distillation for Automatic Speech Recognition

May 27, 2026
Via arXiv

Diffusion Large Language Models for Visual Speech Recognition

May 27, 2026
Via arXiv

TARQ: Tail-Aware Reconstruction Quantization for Rare-Word Robust Automatic Speech Recognition

May 27, 2026
Via arXiv