speech


Training-Free Intelligibility-Guided Observation Addition for Noisy ASR

Feb 24, 2026
Via arXiv

Quantifying Dimensional Independence in Speech: An Information-Theoretic Framework for Disentangled Representation Learning

Feb 24, 2026
Via arXiv

MDM-ASR: Bridging Accuracy and Efficiency in ASR with Diffusion-Based Non-Autoregressive Decoding

Feb 24, 2026
Via arXiv

Inner Speech as Behavior Guides: Steerable Imitation of Diverse Behaviors for Human-AI coordination

Feb 24, 2026
Via arXiv

DECAF: Dynamic Envelope Context-Aware Fusion for Speech-Envelope Reconstruction from EEG

Feb 23, 2026
Via arXiv

StyleStream: Real-Time Zero-Shot Voice Style Conversion

Feb 23, 2026
Via arXiv

Cross-lingual Matryoshka Representation Learning across Speech and Text

Feb 23, 2026
Via arXiv

Entropy in Large Language Models

Feb 23, 2026
Via arXiv

An Approach to Combining Video and Speech with Large Language Models in Human-Robot Interaction

Feb 23, 2026
Via arXiv

CTC-TTS: LLM-based dual-streaming text-to-speech with CTC alignment

Feb 23, 2026
Via arXiv