Speech Recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the speaker's voice and language, and transcribing those words accurately as text.

NonverbalTTS: A Public English Corpus of Text-Aligned Nonverbal Vocalizations with Emotion Annotations for Text-to-Speech

Jul 17, 2025
Via arXiv

Edge-ASR: Towards Low-Bit Quantization of Automatic Speech Recognition Models

Jul 10, 2025
Via arXiv

Improving Contextual ASR via Multi-grained Fusion with Large Language Models

Jul 16, 2025
Via arXiv

Reading Between the Lines: Combining Pause Dynamics and Semantic Coherence for Automated Assessment of Thought Disorder

Jul 17, 2025
Via arXiv

VisualSpeaker: Visually-Guided 3D Avatar Lip Synthesis

Jul 08, 2025
Via arXiv

Privacy Disclosure of Similarity in Speech and Language Processing

Aug 07, 2025
Via arXiv

End-to-end Acoustic-linguistic Emotion and Intent Recognition Enhanced by Semi-supervised Learning

Jul 10, 2025
Via arXiv

Adaptability of ASR Models on Low-Resource Language: A Comparative Study of Whisper and Wav2Vec-BERT on Bangla

Jul 02, 2025
Via arXiv

Touch Speaks, Sound Feels: A Multimodal Approach to Affective and Social Touch from Robots to Humans

Aug 11, 2025
Via arXiv

A Cookbook for Community-driven Data Collection of Impaired Speech in Low-Resource Languages

Jul 03, 2025
Via arXiv