Spoken


From Speech-to-Spatial: Grounding Utterances on A Live Shared View with Augmented Reality

Feb 03, 2026

Mići Princ -- A Little Boy Teaching Speech Technologies the Chakavian Dialect

Feb 03, 2026

PARSE: An Open-Domain Reasoning Question Answering Benchmark for Persian

Feb 01, 2026

EmoAra: Emotion-Preserving English Speech Transcription and Cross-Lingual Translation with Arabic Text-to-Speech

Feb 01, 2026

MedSpeak: A Knowledge Graph-Aided ASR Error Correction Framework for Spoken Medical QA

Feb 01, 2026

Edit Content, Preserve Acoustics: Imperceptible Text-Based Speech Editing via Self-Consistency Rewards

Jan 31, 2026

Kanade: A Simple Disentangled Tokenizer for Spoken Language Modeling

Jan 31, 2026

DiffuSpeech: Silent Thought, Spoken Answer via Unified Speech-Text Diffusion

Jan 30, 2026

Unit-Based Agent for Semi-Cascaded Full-Duplex Dialogue Systems

Jan 29, 2026

EditYourself: Audio-Driven Generation and Manipulation of Talking Head Videos with Diffusion Transformers

Jan 29, 2026