speech


Far from the Shallow: Brain-Predictive Reasoning Embedding through Residual Disentanglement

Oct 26, 2025
Via arXiv

The Limits of Data Scaling: Sub-token Utilization and Acoustic Saturation in Multilingual ASR

Oct 26, 2025
Via arXiv

Knowledge-guided Continual Learning for Behavioral Analytics Systems

Oct 25, 2025
Via arXiv

FlexIO: Flexible Single- and Multi-Channel Speech Separation and Enhancement

Oct 24, 2025
Via arXiv

Foley Control: Aligning a Frozen Latent Text-to-Audio Model to Video

Oct 24, 2025
Via arXiv

Brain-tuning Improves Generalizability and Efficiency of Brain Alignment in Speech Models

Oct 24, 2025
Via arXiv

A Scalable, Causal, and Energy Efficient Framework for Neural Decoding with Spiking Neural Networks

Oct 23, 2025
Via arXiv

CantoNLU: A benchmark for Cantonese natural language understanding

Oct 23, 2025
Via arXiv

Decoding the Ear: A Framework for Objectifying Expressiveness from Human Preference Through Efficient Alignment

Oct 23, 2025
Via arXiv

UniSE: A Unified Framework for Decoder-only Autoregressive LM-based Speech Enhancement

Oct 23, 2025
Via arXiv