speech


From Emotion to Expression: Theoretical Foundations and Resources for Fear Speech

Jan 23, 2026
Via arXiv

Timbre-Aware LLM-based Direct Speech-to-Speech Translation Extendable to Multiple Language Pairs

Jan 22, 2026

Artificial Rigidities vs. Biological Noise: A Comparative Analysis of Multisensory Integration in AV-HuBERT and Human Observers

Jan 22, 2026

Qwen3-TTS Technical Report

Jan 22, 2026

Sink or SWIM: Tackling Real-Time ASR at Scale

Jan 22, 2026

TidyVoice: A Curated Multilingual Dataset for Speaker Verification Derived from Common Voice

Jan 22, 2026

Transfer Learning from ImageNet for MEG-Based Decoding of Imagined Speech

Jan 22, 2026

DeepASMR: LLM-Based Zero-Shot ASMR Speech Generation for Anyone of Any Voice

Jan 22, 2026

The CMU-AIST submission for the ICME 2025 Audio Encoder Challenge

Jan 22, 2026

Adaptive Rotary Steering with Joint Autoregression for Robust Extraction of Closely Moving Speakers in Dynamic Scenarios

Jan 21, 2026