speech


Driver-Intention Prediction with Deep Learning: Real-Time Brain-to-Vehicle Communication

Jan 08, 2026
Via arXiv

A Unified Spoken Language Model with Injected Emotional-Attribution Thinking for Human-like Interaction

Jan 08, 2026
Via arXiv

Semi-Supervised Diseased Detection from Speech Dialogues with Multi-Level Data Modeling

Jan 08, 2026
Via arXiv

LLMs-Integrated Automatic Hate Speech Recognition Using Controllable Text Generation Models

Jan 08, 2026
Via arXiv

SpeechMedAssist: Efficiently and Effectively Adapting Speech Language Models for Medical Consultation

Jan 08, 2026
Via arXiv

WESR: Scaling and Evaluating Word-level Event-Speech Recognition

Jan 08, 2026
Via arXiv

Latent-Level Enhancement with Flow Matching for Robust Automatic Speech Recognition

Jan 08, 2026
Via arXiv

IndexTTS 2.5 Technical Report

Jan 08, 2026
Via arXiv

TellWhisper: Tell Whisper Who Speaks When

Jan 08, 2026
Via arXiv

MoE Adapter for Large Audio Language Models: Sparsity, Disentanglement, and Gradient-Conflict-Free

Jan 08, 2026
Via arXiv