Siyu Zhu

Large Depth Completion Model from Sparse Observations

May 28, 2026

Towards Consistent Video Geometry Estimation

May 28, 2026

Touch-R1: Reinforcing Touch Reasoning in MLLMs

May 26, 2026

ML-CLIPSim: Multi-Layer CLIP Similarity for Machine-Oriented Image Quality

May 10, 2026

The Thinking Pixel: Recursive Sparse Reasoning in Multimodal Diffusion Latents

Apr 28, 2026

Hallo-Live: Real-Time Streaming Joint Audio-Video Avatar Generation with Asynchronous Dual-Stream and Human-Centric Preference Distillation

Apr 26, 2026

Tora3: Trajectory-Guided Audio-Video Generation with Physical Coherence

Apr 10, 2026

CrowdGaussian: Reconstructing High-Fidelity 3D Gaussians for Human Crowd from a Single Image

Mar 18, 2026

Latent Poincaré Shaping for Agentic Reinforcement Learning

Feb 10, 2026

Prompt Reinjection: Alleviating Prompt Forgetting in Multimodal Diffusion Transformers

Feb 06, 2026