Speech Recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the speaker's voice and language, then transcribing those words accurately as text.

WenetSpeech-Wu: Datasets, Benchmarks, and Models for a Unified Chinese Wu Dialect Speech Processing Ecosystem

Jan 16, 2026
Via arXiv

Motion-to-Response Content Generation via Multi-Agent AI System with Real-Time Safety Verification

Jan 20, 2026
Via arXiv

Beyond Mapping: Domain-Invariant Representations via Spectral Embedding of Optimal Transport Plans

Jan 19, 2026
Via arXiv

AI-based System for Transforming text and sound to Educational Videos

Jan 16, 2026
Via arXiv

Categorize Early, Integrate Late: Divergent Processing Strategies in Automatic Speech Recognition

Jan 11, 2026
Via arXiv

RLBR: Reinforcement Learning with Biasing Rewards for Contextual Speech Large Language Models

Jan 19, 2026
Via arXiv

MCGA: A Multi-task Classical Chinese Literary Genre Audio Corpus

Jan 14, 2026
Via arXiv

Doing More with Less: Data Augmentation for Sudanese Dialect Automatic Speech Recognition

Jan 11, 2026
Via arXiv

Latent-Level Enhancement with Flow Matching for Robust Automatic Speech Recognition

Jan 08, 2026
Via arXiv

Towards Comprehensive Semantic Speech Embeddings for Chinese Dialects

Jan 12, 2026
Via arXiv