Xie Chen

Read What You Hear: Reference-Free Hypotheses Evaluation with Acoustic Discrepancy

Jun 03, 2026 (via arXiv)

WavTTS: Towards High-Quality Zero-Shot TTS via Direct Raw Waveform Modeling

Jun 02, 2026 (via arXiv)

OpenSTBench: Beyond Semantic Evaluation for Speech Translation

May 29, 2026 (via arXiv)

Towards Human-Like Interactive Speech Recognition With Agentic Correction and Semantic Evaluation

May 28, 2026 (via arXiv)

Proactive for Uncertainty: Cause-Aware Error Diagnosis and Interactive Clarification for Spoken Dialogue Systems

May 25, 2026 (via arXiv)

Evaluating the Expressive Appropriateness of Speech in Rich Contexts

May 10, 2026 (via arXiv)

X-Voice: Enabling Everyone to Speak 30 Languages via Zero-Shot Cross-Lingual Voice Cloning

May 07, 2026 (via arXiv)

WavCube: Unifying Speech Representation for Understanding and Generation via Semantic-Acoustic Joint Modeling

May 07, 2026 (via arXiv)

RAS: a Reliability Oriented Metric for Automatic Speech Recognition

Apr 28, 2026 (via arXiv)

Less Languages, Less Tokens: An Efficient Unified Logic Cross-lingual Chain-of-Thought Reasoning Framework

Apr 22, 2026 (via arXiv)