Speech recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the speaker's voice and language, and transcribing those words accurately as text.

Semantics-Aware Generative Latent Data Augmentation for Learning in Low-Resource Domains

Feb 02, 2026
Via arXiv

OCR-Enhanced Multimodal ASR Can Read While Listening

Jan 26, 2026
Via arXiv

CALM: Joint Contextual Acoustic-Linguistic Modeling for Personalization of Multi-Speaker ASR

Jan 30, 2026
Via arXiv

BanglaRobustNet: A Hybrid Denoising-Attention Architecture for Robust Bangla Speech Recognition

Jan 25, 2026
Via arXiv

Enhancing Speech Emotion Recognition using Dynamic Spectral Features and Kalman Smoothing

Jan 26, 2026
Via arXiv

dLLM-ASR: A Faster Diffusion LLM-based Framework for Speech Recognition

Jan 25, 2026
Via arXiv

VIBEVOICE-ASR Technical Report

Jan 26, 2026
Via arXiv

Text-only adaptation in LLM-based ASR through text denoising

Jan 28, 2026
Via arXiv

A Study of Data Selection Strategies for Pre-training Self-Supervised Speech Models

Jan 28, 2026
Via arXiv

Qwen3-ASR Technical Report

Jan 29, 2026
Via arXiv