speech


ParaBridge: Bridging Paralinguistic Perception and Dialogue Behavior in Speech Language Models

Jun 09, 2026
Via arXiv

Multilingual Word-Level Forced Alignment with Self-Supervised Representations and Learned Dynamic Programming

Jun 09, 2026
Via arXiv

Multi-Faceted Interactivity Alignment in Full-Duplex Speech Models

Jun 09, 2026
Via arXiv

SSL-GMMVC: Interpretable Voice Conversion via Locally Linear GMM Transforms in Self-Supervised Representation Space

Jun 09, 2026
Via arXiv

Entropy-Aware Domain-Routed Mixture-of-Experts Speech-LLM Framework: A Case Study of Multi-Domain Child-Adult ASR

Jun 09, 2026
Via arXiv

Enhancing Multilingual LLM-based ASR with Mixture of Experts and Dynamic Downsampling

Jun 09, 2026
Via arXiv

Recovering the Zipfian Distribution in Unsupervised Term Discovery

Jun 09, 2026
Via arXiv

GC-LoRA: Gated Convolutional LoRA for Parameter-Efficient Acoustic Adaptation

Jun 09, 2026
Via arXiv

Speech Encoder Fusion for LLM-based Automatic Speech Recognition

Jun 09, 2026
Via arXiv

Speech Meets ELF: Audio Conditional Continuous-Target Diffusion for Speech Recognition and Translation

Jun 09, 2026
Via arXiv