Rongyu Zhang

Mask World Model: Predicting What Matters for Robust Robot Policy Learning

Apr 22, 2026

Key-Embedded Privacy for Decentralized AI in Biomedical Omics

Mar 30, 2026

FinToolSyn: A forward synthesis Framework for Financial Tool-Use Dialogue Data with Dynamic Tool Retrieval

Mar 25, 2026

Linking Perception, Confidence and Accuracy in MLLMs

Mar 12, 2026

BEVUDA++: Geometric-aware Unsupervised Domain Adaptation for Multi-View 3D Object Detection

Sep 17, 2025

SpikeGen: Generative Framework for Visual Spike Stream Processing

May 23, 2025

NTIRE 2025 challenge on Text to Image Generation Model Quality Assessment

May 22, 2025

MoLe-VLA: Dynamic Layer-skipping Vision Language Action Model via Mixture-of-Layers for Efficient Robot Manipulation

Mar 26, 2025

Second FRCSyn-onGoing: Winning Solutions and Post-Challenge Analysis to Improve Face Recognition with Synthetic Data

Dec 02, 2024

EVA: An Embodied World Model for Future Video Anticipation

Oct 20, 2024