Speech recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the speaker's voice and language, and transcribing those words accurately as text.

Cloning a Conversational Voice AI Agent from Call Recording Datasets for Telesales

Sep 05, 2025

Contextualized Token Discrimination for Speech Search Query Correction

Sep 04, 2025

PARCO: Phoneme-Augmented Robust Contextual ASR via Contrastive Entity Disambiguation

Sep 04, 2025

Spoken in Jest, Detected in Earnest: A Systematic Review of Sarcasm Recognition -- Multimodal Fusion, Challenges, and Future Prospects

Sep 04, 2025

LatPhon: Lightweight Multilingual G2P for Romance Languages and English

Sep 03, 2025

NE-PADD: Leveraging Named Entity Knowledge for Robust Partial Audio Deepfake Detection via Attention Aggregation

Sep 04, 2025

Towards Improved Speech Recognition through Optimized Synthetic Data Generation

Aug 29, 2025

Benchmarking Large Pretrained Multilingual Models on Québec French Speech Recognition

Aug 28, 2025

OLMoASR: Open Models and Data for Training Robust Speech Recognition Models

Aug 28, 2025

Speech Emotion Recognition via Entropy-Aware Score Selection

Aug 28, 2025