Ziyu Guo

VGGT-Edit: Feed-forward Native 3D Scene Editing with Residual Field Prediction

May 14, 2026
Via arXiv

ATLAS: Agentic or Latent Visual Reasoning? One Word is Enough for Both

May 14, 2026
Via arXiv

Uni-Synergy: Bridging Understanding and Generation for Personalized Reasoning via Co-operative Reinforcement Learning

May 11, 2026
Via arXiv

Plan in Sandbox, Navigate in Open Worlds: Learning Physics-Grounded Abstracted Experience for Embodied Navigation

May 11, 2026
Via arXiv

MME-CoF-Pro: Evaluating Reasoning Coherence in Video Generative Models with Text and Visual Hints

Mar 20, 2026
Via arXiv

GENIUS: Generative Fluid Intelligence Evaluation Suite

Feb 11, 2026
Via arXiv

Are Video Models Ready as Zero-Shot Reasoners? An Empirical Study with the MME-CoF Benchmark

Oct 30, 2025
Via arXiv

Quantization Meets dLLMs: A Systematic Study of Post-training Quantization for Diffusion LLMs

Aug 20, 2025
Via arXiv

Perceive Anything: Recognize, Explain, Caption, and Segment Anything in Images and Videos

Jun 05, 2025
Via arXiv

Adaptive Classifier-Free Guidance via Dynamic Low-Confidence Masking

May 26, 2025
Via arXiv