speech


Geometric Second-Order Feature Correlation Learning for Self-Supervised Speech Emotion Recognition

Jun 04, 2026
Via arXiv

Multilingual Multi-Speaker Unit Vocoders: A Systematic Analysis of Discrete Speech Representations

Jun 04, 2026
Via arXiv

Learning Emotion-discriminative Representations for Zero-Shot Cross-lingual Speech Emotion Recognition

Jun 04, 2026
Via arXiv

VoCodec: A Low-bitrate Streamable Neural Speech Codec with Voicing-driven Quantization

Jun 04, 2026
Via arXiv

USAD 2.0: Scaling Representation Distillation for Universal Audio Understanding

Jun 04, 2026
Via arXiv

From Self to Other: Evaluating Demographic Perspective-Taking in LLM Hate Speech Annotation

Jun 04, 2026
Via arXiv

FiLM-Based Speaker Conditioning of a SpeechLLM for Pathological Speech Recognition

Jun 04, 2026
Via arXiv

Revisiting Lexicon Evaluation in Unsupervised Word Discovery

Jun 04, 2026
Via arXiv

ProSarc: Prosody-Aware Sarcasm Recognition Framework via Temporal Prosodic Incongruity

Jun 04, 2026
Via arXiv

Multi-task Learning is Not Enough: Representational Entanglement in Dual-output Second Language Speech Recognition

Jun 04, 2026
Via arXiv