speech


Supervised Post-training of Speech Foundation Models for Robust Adaptation in Speech Deepfake Detection

Jun 24, 2026
Via arXiv

Does Translation-Enhanced Speech Encoder Pre-training Affect Speech LLMs?

Jun 24, 2026
Via arXiv

Real-Time Voice AI Hears but Does Not Listen

Jun 24, 2026
Via arXiv

A Large-Scale Database and Predictive Model of Listener-Rated Ease of Speech Understanding in Commercial Hearing Aids

Jun 24, 2026
Via arXiv

Phonetic and semantic analyses of spoken corpora of Beijing and Taiwan Mandarin indicate that the neutral tone is a lexical tone

Jun 24, 2026
Via arXiv

From Sounds to Scenes: A Benchmark for Evaluating Context-Aware Auditory Scene Understanding in Large Audio Language Models

Jun 24, 2026
Via arXiv

SpeechEQ: Benchmarking Emotional Intelligence Quotient in Socially Aware Voice Conversational Models

Jun 24, 2026
Via arXiv

Phoneme-Level Mispronunciation Screening in Polish-Speaking Children with an Explainable Assistant

Jun 23, 2026
Via arXiv

Poster: Exploring the Limits of Audio-Based Detection of Turkish Phone Call Scams

Jun 23, 2026
Via arXiv

Data Scale, Not Latency, Shapes Cross-Lingual Encoder Transfer in Streaming ASR

Jun 23, 2026
Via arXiv