Speech Recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the speaker's voice and language, and transcribing those words accurately as text.

Bloodroot: When Watermarking Turns Poisonous For Stealthy Backdoor

Oct 09, 2025
Via arXiv

Decoding Deception: Understanding Automatic Speech Recognition Vulnerabilities in Evasion and Poisoning Attacks

Sep 26, 2025
Via arXiv

How I Built ASR for Endangered Languages with a Spoken Dictionary

Oct 06, 2025
Via arXiv

Evaluating Self-Supervised Speech Models via Text-Based LLMs

Oct 06, 2025
Via arXiv

How much speech data is necessary for ASR in African languages? An evaluation of data scaling in Kinyarwanda and Kikuyu

Oct 08, 2025
Via arXiv

Decoding the Ear: A Framework for Objectifying Expressiveness from Human Preference Through Efficient Alignment

Oct 23, 2025
Via arXiv

EvolveCaptions: Empowering DHH Users Through Real-Time Collaborative Captioning

Oct 02, 2025
Via arXiv

Interpreting the Role of Visemes in Audio-Visual Speech Recognition

Sep 19, 2025
Via arXiv

Multi-Channel Differential ASR for Robust Wearer Speech Recognition on Smart Glasses

Sep 17, 2025
Via arXiv

UMA-Split: unimodal aggregation for both English and Mandarin non-autoregressive speech recognition

Sep 18, 2025
Via arXiv