Picture for Wei Xue

Wei Xue

ThinkSound: Chain-of-Thought Reasoning in Multimodal Large Language Models for Audio Generation and Editing

Add code
Jun 26, 2025
Viaarxiv icon

Graceful Forgetting in Generative Language Models

Add code
May 26, 2025
Viaarxiv icon

MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix

Add code
May 19, 2025
Viaarxiv icon

J1: Exploring Simple Test-Time Scaling for LLM-as-a-Judge

Add code
May 17, 2025
Viaarxiv icon

SongEval: A Benchmark Dataset for Song Aesthetics Evaluation

Add code
May 16, 2025
Viaarxiv icon

CMD: Controllable Multiview Diffusion for 3D Editing and Progressive Generation

Add code
May 11, 2025
Viaarxiv icon

Co$^{3}$Gesture: Towards Coherent Concurrent Co-speech 3D Gesture Generation with Interactive Diffusion

Add code
May 03, 2025
Viaarxiv icon

OmniAudio: Generating Spatial Audio from 360-Degree Video

Add code
Apr 21, 2025
Viaarxiv icon

AudioX: Diffusion Transformer for Anything-to-Audio Generation

Add code
Mar 13, 2025
Viaarxiv icon

YuE: Scaling Open Foundation Models for Long-Form Music Generation

Add code
Mar 11, 2025
Viaarxiv icon