speech


Who is Speaking or Who is Depressed? A Controlled Study of Speaker Leakage in Speech-Based Depression Detection

Apr 15, 2026
Via arXiv

Syn-TurnTurk: A Synthetic Dataset for Turn-Taking Prediction in Turkish Dialogues

Apr 15, 2026
Via arXiv

FocalLens: Visualizing Narratives through Focalization

Apr 15, 2026
Via arXiv

Diffusion Language Models for Speech Recognition

Apr 15, 2026
Via arXiv

Character Beyond Speech: Leveraging Role-Playing Evaluation in Audio Large Language Models via Reinforcement Learning

Apr 15, 2026
Via arXiv

Classical Machine Learning Baselines for Deepfake Audio Detection on the Fake-or-Real Dataset

Apr 15, 2026
Via arXiv

Few-Shot and Pseudo-Label Guided Speech Quality Evaluation with Large Language Models

Apr 15, 2026
Via arXiv

Contextual Biasing for ASR in Speech LLM with Common Word Cues and Bias Word Position Prediction

Apr 14, 2026
Via arXiv

PS-TTS: Phonetic Synchronization in Text-to-Speech for Achieving Natural Automated Dubbing

Apr 14, 2026
Via arXiv

Interactive ASR: Towards Human-Like Interaction and Semantic Coherence Evaluation for Agentic Speech Recognition

Apr 14, 2026
Via arXiv