Speech recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the speaker's voice and language, and transcribing those words accurately as text.

From Oracle to Noisy Context: Mitigating Contextual Exposure Bias in Speech-LLMs

Mar 25, 2026, via arXiv

Cascade-Free Mandarin Visual Speech Recognition via Semantic-Guided Cross-Representation Alignment

Mar 23, 2026, via arXiv

Crab: Multi Layer Contrastive Supervision to Improve Speech Emotion Recognition Under Both Acted and Natural Speech Condition

Mar 24, 2026, via arXiv

Evaluating Interactive 2D Visualization as a Sample Selection Strategy for Biomedical Time-Series Data Annotation

Mar 27, 2026, via arXiv

Who Spoke What When? Evaluating Spoken Language Models for Conversational ASR with Semantic and Overlap-Aware Metrics

Mar 24, 2026, via arXiv

Bridging Biological Hearing and Neuromorphic Computing: End-to-End Time-Domain Audio Signal Processing with Reservoir Computing

Mar 25, 2026, via arXiv

MSR-HuBERT: Self-supervised Pre-training for Adaptation to Multiple Sampling Rates

Mar 24, 2026, via arXiv

When AI Meets Early Childhood Education: Large Language Models as Assessment Teammates in Chinese Preschools

Mar 25, 2026, via arXiv

Precision-Varying Prediction (PVP): Robustifying ASR systems against adversarial attacks

Mar 23, 2026, via arXiv

When AVSR Meets Video Conferencing: Dataset, Degradation, and the Hidden Mechanism Behind Performance Collapse

Mar 24, 2026, via arXiv