Speech Recognition


Speech recognition is the task of identifying spoken words by analyzing the speaker's voice and language, and transcribing those words accurately as text.

Navigating the Reality Gap: Privacy-Preserving Adaptation of ASR for Challenging Low-Resource Domains

Dec 22, 2025
Via arXiv

Zero-Shot Recognition of Dysarthric Speech Using Commercial Automatic Speech Recognition and Multimodal Large Language Models

Dec 19, 2025
Via arXiv

Incorporating Error Level Noise Embedding for Improving LLM-Assisted Robustness in Persian Speech Recognition

Dec 19, 2025
Via arXiv

Peeking Into The Future For Contextual Biasing

Dec 19, 2025
Via arXiv

Scalable Frameworks for Real-World Audio-Visual Speech Recognition

Dec 16, 2025
Via arXiv

When De-noising Hurts: A Systematic Study of Speech Enhancement Effects on Modern Medical ASR Systems

Dec 19, 2025
Via arXiv

Reproducing and Dissecting Denoising Language Models for Speech Recognition

Dec 15, 2025
Via arXiv

Adaptive Edge-Cloud Inference for Speech-to-Action Systems Using ASR and Large Language Models

Dec 18, 2025
Via arXiv

A stylometric analysis of speaker attribution from speech transcripts

Dec 18, 2025
Via arXiv

GeoSense-AI: Fast Location Inference from Crisis Microblogs

Dec 20, 2025
Via arXiv