speech


Towards Deep Contextual Reasoning from Broad Descriptions for ASR with Speech-LLM via Metadata-Driven Reasoning Chains

Jun 09, 2026
Via arXiv

Anchoring the Unknown: Open-Set Model Attribution via Proxy-Anchor Learning

Jun 09, 2026
Via arXiv

Speaker Group Encoding in Self-supervised Speech Recognition Models

Jun 09, 2026
Via arXiv

Decoupling Thought from Speech: Knowledge-Grounded Counterfactual Reasoning for Resilient Multi-Agent Argumentation

Jun 09, 2026
Via arXiv

Towards Robust Arabic Speech Emotion Recognition with Deep Learning

Jun 09, 2026
Via arXiv

Hierarchical Policies from Verbal and Egocentric Human Signals for Natural Human-Robot Interaction

Jun 09, 2026
Via arXiv

FlashTTS: Fast Streaming TTS with MTP Acceleration and X-pred Mean Flow Distillation

Jun 09, 2026
Via arXiv

ParaBridge: Bridging Paralinguistic Perception and Dialogue Behavior in Speech Language Models

Jun 09, 2026
Via arXiv

Multilingual Word-Level Forced Alignment with Self-Supervised Representations and Learned Dynamic Programming

Jun 09, 2026
Via arXiv

Multi-Faceted Interactivity Alignment in Full-Duplex Speech Models

Jun 09, 2026
Via arXiv