speech recognition


Speech recognition is the task of identifying words spoken aloud, analyzing the voice and language, and accurately transcribing the words.

An Intelligent AI glasses System with Multi-Agent Architecture for Real-Time Voice Processing and Task Execution

Add code
Jan 09, 2026
Viaarxiv icon

Linear Script Representations in Speech Foundation Models Enable Zero-Shot Transliteration

Add code
Jan 06, 2026
Viaarxiv icon

TidyVoice: A Curated Multilingual Dataset for Speaker Verification Derived from Common Voice

Add code
Jan 22, 2026
Viaarxiv icon

HoverAI: An Embodied Aerial Agent for Natural Human-Drone Interaction

Add code
Jan 20, 2026
Viaarxiv icon

MORE: Multi-Objective Adversarial Attacks on Speech Recognition

Add code
Jan 05, 2026
Viaarxiv icon

QMAVIS: Long Video-Audio Understanding using Fusion of Large Multimodal Models

Add code
Jan 10, 2026
Viaarxiv icon

Improving Code-Switching Speech Recognition with TTS Data Augmentation

Add code
Jan 02, 2026
Viaarxiv icon

Multimodal In-context Learning for ASR of Low-resource Languages

Add code
Jan 09, 2026
Viaarxiv icon

TellWhisper: Tell Whisper Who Speaks When

Add code
Jan 08, 2026
Viaarxiv icon

IKFST: IOO and KOO Algorithms for Accelerated and Precise WFST-based End-to-End Automatic Speech Recognition

Add code
Jan 01, 2026
Viaarxiv icon