speech


Knowing What to Stress: A Discourse-Conditioned Text-to-Speech Benchmark

Apr 12, 2026
Via arXiv

LASQ: A Low-resource Aspect-based Sentiment Quadruple Extraction Dataset

Apr 12, 2026
Via arXiv

Cross-Cultural Bias in Mel-Scale Representations: Evidence and Alternatives from Speech and Music

Apr 12, 2026
Via arXiv

BlasBench: An Open Benchmark for Irish Speech Recognition

Apr 12, 2026
Via arXiv

Audio-Omni: Extending Multi-modal Understanding to Versatile Audio Generation and Editing

Apr 12, 2026
Via arXiv

VidAudio-Bench: Benchmarking V2A and VT2A Generation across Four Audio Categories

Apr 12, 2026
Via arXiv

Training-Free Cross-Lingual Dysarthria Severity Assessment via Phonological Subspace Analysis in Self-Supervised Speech Representations

Apr 11, 2026
Via arXiv

Demographic and Linguistic Bias Evaluation in Omnimodal Language Models

Apr 11, 2026
Via arXiv

Beyond Monologue: Interactive Talking-Listening Avatar Generation with Conversational Audio Context-Aware Kernels

Apr 11, 2026
Via arXiv

ASPIRin: Action Space Projection for Interactivity-Optimized Reinforcement Learning in Full-Duplex Speech Language Models

Apr 11, 2026
Via arXiv