Speech recognition


Speech recognition is the task of identifying spoken words by analyzing the speaker's voice and language, and transcribing those words accurately as text.

Task Arithmetic with Support Languages for Low-Resource ASR

Jan 11, 2026
Via arXiv

Multi-channel multi-speaker transformer for speech recognition

Jan 06, 2026
Via arXiv

Variational decomposition autoencoding improves disentanglement of latent representations

Jan 11, 2026
Via arXiv

An Intelligent AI glasses System with Multi-Agent Architecture for Real-Time Voice Processing and Task Execution

Jan 09, 2026
Via arXiv

QMAVIS: Long Video-Audio Understanding using Fusion of Large Multimodal Models

Jan 10, 2026
Via arXiv

Linear Script Representations in Speech Foundation Models Enable Zero-Shot Transliteration

Jan 06, 2026
Via arXiv

MORE: Multi-Objective Adversarial Attacks on Speech Recognition

Jan 05, 2026
Via arXiv

Multimodal In-context Learning for ASR of Low-resource Languages

Jan 09, 2026
Via arXiv

Improving Code-Switching Speech Recognition with TTS Data Augmentation

Jan 02, 2026
Via arXiv

TellWhisper: Tell Whisper Who Speaks When

Jan 08, 2026
Via arXiv