Speech Recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the voice and language, then transcribing those words accurately as text.

EvolveCaptions: Empowering DHH Users Through Real-Time Collaborative Captioning

Oct 02, 2025

Interpreting the Role of Visemes in Audio-Visual Speech Recognition

Sep 19, 2025

UMA-Split: unimodal aggregation for both English and Mandarin non-autoregressive speech recognition

Sep 18, 2025

Multi-Channel Differential ASR for Robust Wearer Speech Recognition on Smart Glasses

Sep 17, 2025

State-of-the-Art Dysarthric Speech Recognition with MetaICL for on-the-fly Personalization

Sep 19, 2025

Thinking in cocktail party: Chain-of-Thought and reinforcement learning for target speaker automatic speech recognition

Sep 19, 2025

EmoQ: Speech Emotion Recognition via Speech-Aware Q-Former and Large Language Model

Sep 19, 2025

Layer-wise Minimal Pair Probing Reveals Contextual Grammatical-Conceptual Hierarchy in Speech Representations

Sep 19, 2025

A Parallel Ultra-Low Power Silent Speech Interface based on a Wearable, Fully-dry EMG Neckband

Sep 26, 2025

GLip: A Global-Local Integrated Progressive Framework for Robust Visual Speech Recognition

Sep 19, 2025