speech


EmoOmni: Bridging Emotional Understanding and Expression in Omni-Modal LLMs

Feb 25, 2026
Via arXiv

Therapist-Robot-Patient Physical Interaction is Worth a Thousand Words: Enabling Intuitive Therapist Guidance via Remote Haptic Control

Feb 25, 2026
Via arXiv

Scalable Multilingual Multimodal Machine Translation with Speech-Text Fusion

Feb 25, 2026
Via arXiv

Mitigating Structural Noise in Low-Resource S2TT: An Optimized Cascaded Nepali-English Pipeline with Punctuation Restoration

Feb 25, 2026
Via arXiv

Assessing the Impact of Speaker Identity in Speech Spoofing Detection

Feb 24, 2026
Via arXiv

Geometric Analysis of Speech Representation Spaces: Topological Disentanglement and Confound Detection

Feb 24, 2026
Via arXiv

Enhancing Hate Speech Detection on Social Media: A Comparative Analysis of Machine Learning Models and Text Transformation Approaches

Feb 24, 2026
Via arXiv

Strategy-Supervised Autonomous Laparoscopic Camera Control via Event-Driven Graph Mining

Feb 24, 2026
Via arXiv

Training-Free Intelligibility-Guided Observation Addition for Noisy ASR

Feb 24, 2026
Via arXiv

Quantifying Dimensional Independence in Speech: An Information-Theoretic Framework for Disentangled Representation Learning

Feb 24, 2026
Via arXiv