speech


MANAR: Memory-augmented Attention with Navigational Abstract Conceptual Representation

Mar 19, 2026
Via arXiv

On Optimizing Multimodal Jailbreaks for Spoken Language Models

Mar 19, 2026
Via arXiv

Empathetic Motion Generation for Humanoid Educational Robots via Reasoning-Guided Vision–Language–Motion Diffusion Architecture

Mar 19, 2026
Via arXiv

"You've got a friend in me": Co-Designing a Peer Social Robot for Young Newcomers' Language and Cultural Learning

Mar 19, 2026
Via arXiv

ARTT: Augmented Reverberant-Target Training for Unsupervised Monaural Speech Dereverberation

Mar 19, 2026
Via arXiv

Enhancing Multi-Corpus Training in SSL-Based Anti-Spoofing Models: Domain-Invariant Feature Extraction

Mar 19, 2026
Via arXiv

DiscoPhon: Benchmarking the Unsupervised Discovery of Phoneme Inventories With Discrete Speech Units

Mar 19, 2026
Via arXiv

Listen First, Then Answer: Timestamp-Grounded Speech Reasoning

Mar 19, 2026
Via arXiv

EARTalking: End-to-end GPT-style Autoregressive Talking Head Synthesis with Frame-wise Control

Mar 19, 2026
Via arXiv

Zipper-LoRA: Dynamic Parameter Decoupling for Speech-LLM based Multilingual Speech Recognition

Mar 19, 2026
Via arXiv