Speech recognition


Speech recognition is the task of identifying spoken words by analyzing the speaker's voice and language, then transcribing them accurately as text.

CTC-DID: CTC-Based Arabic dialect identification for streaming applications

Jan 18, 2026

WenetSpeech-Wu: Datasets, Benchmarks, and Models for a Unified Chinese Wu Dialect Speech Processing Ecosystem

Jan 16, 2026

Motion-to-Response Content Generation via Multi-Agent AI System with Real-Time Safety Verification

Jan 20, 2026

Beyond Mapping: Domain-Invariant Representations via Spectral Embedding of Optimal Transport Plans

Jan 19, 2026

AI-based System for Transforming text and sound to Educational Videos

Jan 16, 2026

Categorize Early, Integrate Late: Divergent Processing Strategies in Automatic Speech Recognition

Jan 11, 2026

RLBR: Reinforcement Learning with Biasing Rewards for Contextual Speech Large Language Models

Jan 19, 2026

MCGA: A Multi-task Classical Chinese Literary Genre Audio Corpus

Jan 14, 2026

Doing More with Less: Data Augmentation for Sudanese Dialect Automatic Speech Recognition

Jan 11, 2026

Latent-Level Enhancement with Flow Matching for Robust Automatic Speech Recognition

Jan 08, 2026