Speech Recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the speaker's voice and language, and transcribing those words accurately as text.

Dynamic Multi-Expert Projectors with Stabilized Routing for Multilingual Speech Recognition

Jan 27, 2026
Via arXiv

A Study of Data Selection Strategies for Pre-training Self-Supervised Speech Models

Jan 28, 2026
Via arXiv

Mind the Shift: Using Delta SSL Embeddings to Enhance Child ASR

Jan 28, 2026
Via arXiv

OCR-Enhanced Multimodal ASR Can Read While Listening

Jan 26, 2026
Via arXiv

A Baseline Multimodal Approach to Emotion Recognition in Conversations

Jan 31, 2026
Via arXiv

Enhancing Speech Emotion Recognition using Dynamic Spectral Features and Kalman Smoothing

Jan 26, 2026
Via arXiv

SLM-SS: Speech Language Model for Generative Speech Separation

Jan 27, 2026
Via arXiv

VIBEVOICE-ASR Technical Report

Jan 26, 2026
Via arXiv

Distillation-based Layer Dropping (DLD): Effective End-to-end Framework for Dynamic Speech Networks

Jan 27, 2026
Via arXiv

SE-DiCoW: Self-Enrolled Diarization-Conditioned Whisper

Jan 27, 2026
Via arXiv