Speech recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the speaker's voice and language, and transcribing those words accurately as text.

When AVSR Meets Video Conferencing: Dataset, Degradation, and the Hidden Mechanism Behind Performance Collapse

Mar 24, 2026
Via arXiv

When AI Meets Early Childhood Education: Large Language Models as Assessment Teammates in Chinese Preschools

Mar 25, 2026
Via arXiv

From Content to Audience: A Multimodal Annotation Framework for Broadcast Television Analytics

Mar 24, 2026
Via arXiv

Evaluating a Multi-Agent Voice-Enabled Smart Speaker for Care Homes: A Safety-Focused Framework

Mar 24, 2026
Via arXiv

MSP-Conversation: A Corpus for Naturalistic, Time-Continuous Emotion Recognition

Mar 23, 2026
Via arXiv

Evaluating Interactive 2D Visualization as a Sample Selection Strategy for Biomedical Time-Series Data Annotation

Mar 27, 2026
Via arXiv

Contextual Biasing for ASR in Speech LLM with Common Word Cues and Bias Word Position Prediction

Apr 14, 2026
Via arXiv

An Empirical Recipe for Universal Phone Recognition

Mar 30, 2026
Via arXiv

TASU2: Controllable CTC Simulation for Alignment and Low-Resource Adaptation of Speech LLMs

Apr 09, 2026
Via arXiv

DiscoPhon: Benchmarking the Unsupervised Discovery of Phoneme Inventories With Discrete Speech Units

Mar 19, 2026
Via arXiv