speech


Multi-Faceted Interactivity Alignment in Full-Duplex Speech Models

Jun 09, 2026

AuRA: Internalizing Audio Understanding into LLMs as LoRA

Jun 09, 2026

What Do Deepfake Speech Detectors Actually Hear?

Jun 09, 2026

Ethical and Technical Limits of Deepfake Speech Datasets

Jun 09, 2026

Phoneme-First Prediction for LLM-Based Speech Recognition

Jun 09, 2026

Speech Encoder Fusion for LLM-based Automatic Speech Recognition

Jun 09, 2026

Towards Deep Contextual Reasoning from Broad Descriptions for ASR with Speech-LLM via Metadata-Driven Reasoning Chains

Jun 09, 2026

Recovering the Zipfian Distribution in Unsupervised Term Discovery

Jun 09, 2026

Anchoring the Unknown: Open-Set Model Attribution via Proxy-Anchor Learning

Jun 09, 2026

Multilingual Word-Level Forced Alignment with Self-Supervised Representations and Learned Dynamic Programming

Jun 09, 2026