Speech recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the voice and the language, and transcribing them accurately as text.

Can Emotion Fool Anti-spoofing?

May 29, 2025

ZIPA: A family of efficient models for multilingual phone recognition

May 29, 2025

A Comparative Evaluation of Deep Learning Models for Speech Enhancement in Real-World Noisy Environments

Jun 17, 2025

From Flat to Feeling: A Feasibility and Impact Study on Dynamic Facial Emotions in AI-Generated Avatars

Jun 16, 2025

Speech-to-Text Translation with Phoneme-Augmented CoT: Enhancing Cross-Lingual Transfer in Low-Resource Scenarios

May 30, 2025

Towards disentangling the contributions of articulation and acoustics in multimodal phoneme recognition

May 29, 2025

Fine-Tuning Video Transformers for Word-Level Bangla Sign Language: A Comparative Analysis for Classification Tasks

Jun 04, 2025

Robust Unsupervised Adaptation of a Speech Recogniser Using Entropy Minimisation and Speaker Codes

Jun 12, 2025

FeatureSense: Protecting Speaker Attributes in Always-On Audio Sensing System

May 30, 2025

Pretraining Multi-Speaker Identification for Neural Speaker Diarization

May 30, 2025