Xie Chen

Towards General Discrete Speech Codec for Complex Acoustic Environments: A Study of Reconstruction and Downstream Task Consistency

May 28, 2025
Via arXiv

Accelerating Flow-Matching-Based Text-to-Speech via Empirically Pruned Step Sampling

May 26, 2025
Via arXiv

Towards Reliable Large Audio Language Model

May 25, 2025
Via arXiv

Unlocking Temporal Flexibility: Neural Speech Codec with Variable Frame Rate

May 22, 2025
Via arXiv

MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix

May 19, 2025
Via arXiv

Towards Flow-Matching-based TTS without Classifier-Free Guidance

Apr 29, 2025
Via arXiv

Enhancing Speech-to-Speech Dialogue Modeling with End-to-End Retrieval-Augmented Generation

Apr 27, 2025
Via arXiv

SimulS2S-LLM: Unlocking Simultaneous Inference of Speech LLMs for Speech-to-Speech Translation

Apr 22, 2025
Via arXiv

EmoVoice: LLM-based Emotional Text-To-Speech Model with Freestyle Text Prompting

Apr 22, 2025
Via arXiv

Pseudo-Autoregressive Neural Codec Language Models for Efficient Zero-Shot Text-to-Speech Synthesis

Apr 14, 2025
Via arXiv