Chenliang Xu

High-Quality Sound Separation Across Diverse Categories via Visually-Guided Generative Modeling

Sep 26, 2025

StreamME: Simplify 3D Gaussian Avatar within Live Stream

Jul 22, 2025

Can Sound Replace Vision in LLaVA With Token Substitution?

Jun 12, 2025

ZeroSep: Separate Anything in Audio with Zero Training

May 29, 2025

BinauralFlow: A Causal and Streamable Approach for High-Quality Binaural Speech Synthesis with Flow Matching Models

May 28, 2025

MMPerspective: Do MLLMs Understand Perspective? A Comprehensive Benchmark for Perspective Perception, Reasoning, and Robustness

May 26, 2025

$I^2G$: Generating Instructional Illustrations via Text-Conditioned Diffusion

May 22, 2025

Intentional Gesture: Deliver Your Intentions with Gestures for Speech

May 21, 2025

Learning to Highlight Audio by Watching Movies

May 17, 2025

The Sword of Damocles in ViTs: Computational Redundancy Amplifies Adversarial Transferability

Apr 15, 2025