Zheng Ge

WebVR: Benchmarking Multimodal LLMs for WebPage Recreation from Videos via Human-Aligned Visual Rubrics

Mar 11, 2026
Via arXiv

DM0: An Embodied-Native Vision-Language-Action Model towards Physical AI

Feb 16, 2026
Via arXiv

PRIME: A Process-Outcome Alignment Benchmark for Verifiable Reasoning in Mathematics and Engineering

Feb 12, 2026
Via arXiv

Step 3.5 Flash: Open Frontier-Level Intelligence with 11B Active Parameters

Feb 11, 2026
Via arXiv

GEBench: Benchmarking Image Generation Models as GUI Environments

Feb 09, 2026
Via arXiv

R-Align: Enhancing Generative Reward Models through Rationale-Centric Meta-Judging

Feb 06, 2026
Via arXiv

STEP3-VL-10B Technical Report

Jan 15, 2026
Via arXiv

PaCoRe: Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning

Jan 09, 2026
Via arXiv

Step-GUI Technical Report

Dec 19, 2025
Via arXiv

DGAE: Diffusion-Guided Autoencoder for Efficient Latent Representation Learning

Jun 11, 2025
Via arXiv