speech


Plug-in Losses for Evidential Deep Learning: A Simplified Framework for Uncertainty Estimation that Includes the Softmax Classifier

May 21, 2026
Via arXiv

Beyond Acoustic Emotion Recognition: Multimodal Pathos Analysis in Political Speech Using LLM-Based and Acoustic Emotion Models

May 21, 2026
Via arXiv

In Silico Modeling of the RAMPHO Buffer: Dissociating Informational and Energetic Masking via Phonetic Entropy in Deep Neural Networks

May 21, 2026
Via arXiv

Assisted Counterspeech Writing at the Crossroads of Hate Speech and Misinformation

May 21, 2026
Via arXiv

Do Factual Recall Mechanisms Carry over from Text to Speech in Multimodal Language Models?

May 21, 2026
Via arXiv

Effective User-defined Keyword Spotting with Dual-stage Matching, Multi-modal Enrollment, and Continual Adaptation

May 21, 2026
Via arXiv

RobustSpeechFlow: Learning Robust Text-to-Speech Trajectories via Augmentation-based Contrastive Flow Matching

May 21, 2026
Via arXiv

Benchmarking Commercial ASR Systems on Code-Switching Speech: Arabic, Persian, and German

May 21, 2026
Via arXiv

MM-Conv: A Multimodal Dataset and Benchmark for Context-Aware Grounding in 3D Dialogue

May 20, 2026
Via arXiv

Text Analytics Evaluation Framework: A Case Study on LLMs and Social Media

May 20, 2026
Via arXiv