Yuxuan Wang

Sherman

DragNeXt: Rethinking Drag-Based Image Editing

Jun 09, 2025
Via arXiv

Sounding that Object: Interactive Object-Aware Image to Audio Generation

Jun 04, 2025
Via arXiv

Discrete Markov Bridge

May 26, 2025
Via arXiv

Towards Reliable Large Audio Language Model

May 25, 2025
Via arXiv

CosyVoice 3: Towards In-the-wild Speech Generation via Scaling-up and Post-training

May 23, 2025
Via arXiv

AudioMorphix: Training-free audio editing with diffusion probabilistic models

May 21, 2025
Via arXiv

Leveraging Large Language Models for Command Injection Vulnerability Analysis in Python: An Empirical Study on Popular Open-Source Projects

May 21, 2025
Via arXiv

Personalize Your Gaussian: Consistent 3D Scene Personalization from a Single Image

May 20, 2025
Via arXiv

Seek in the Dark: Reasoning via Test-Time Instance-Level Policy Gradient in Latent Space

May 19, 2025
Via arXiv

MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix

May 19, 2025
Via arXiv