Speech Recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the speaker's voice and language, and transcribing those words accurately into text.

Stuttering-Aware Automatic Speech Recognition for Indonesian Language

Jan 07, 2026 (via arXiv)

QMAVIS: Long Video-Audio Understanding using Fusion of Large Multimodal Models

Jan 10, 2026 (via arXiv)

An Intelligent AI glasses System with Multi-Agent Architecture for Real-Time Voice Processing and Task Execution

Jan 09, 2026 (via arXiv)

Multi-channel multi-speaker transformer for speech recognition

Jan 06, 2026 (via arXiv)

Multimodal In-context Learning for ASR of Low-resource Languages

Jan 09, 2026 (via arXiv)

Linear Script Representations in Speech Foundation Models Enable Zero-Shot Transliteration

Jan 06, 2026 (via arXiv)

MORE: Multi-Objective Adversarial Attacks on Speech Recognition

Jan 05, 2026 (via arXiv)

TellWhisper: Tell Whisper Who Speaks When

Jan 08, 2026 (via arXiv)

Improving Code-Switching Speech Recognition with TTS Data Augmentation

Jan 02, 2026 (via arXiv)

Dynamic Quantization Error Propagation in Encoder-Decoder ASR Quantization

Jan 05, 2026 (via arXiv)