Speech Recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the speaker's voice and language, then transcribing those words accurately as text.

Scalable Frameworks for Real-World Audio-Visual Speech Recognition

Dec 16, 2025
Via arXiv

When De-noising Hurts: A Systematic Study of Speech Enhancement Effects on Modern Medical ASR Systems

Dec 19, 2025
Via arXiv

Reproducing and Dissecting Denoising Language Models for Speech Recognition

Dec 15, 2025
Via arXiv

Adaptive Edge-Cloud Inference for Speech-to-Action Systems Using ASR and Large Language Models

Dec 18, 2025
Via arXiv

A stylometric analysis of speaker attribution from speech transcripts

Dec 18, 2025
Via arXiv

EEG-to-Voice Decoding of Spoken and Imagined Speech Using Non-Invasive EEG

Dec 14, 2025
Via arXiv

All-in-One ASR: Unifying Encoder-Decoder Models of CTC, Attention, and Transducer in Dual-Mode ASR

Dec 12, 2025
Via arXiv

TRIDENT: A Redundant Architecture for Caribbean-Accented Emergency Speech Triage

Dec 11, 2025
Via arXiv

GeoSense-AI: Fast Location Inference from Crisis Microblogs

Dec 20, 2025
Via arXiv

Robust Speech Activity Detection in the Presence of Singing Voice

Dec 10, 2025
Via arXiv