Zhongyuan Wang

RoboTracer: Mastering Spatial Trace with Reasoning in Vision-Language Models for Robotics

Dec 15, 2025

PIGEON: VLM-Driven Object Navigation via Points of Interest Selection

Nov 17, 2025

Emu3.5: Native Multimodal Models are World Learners

Oct 30, 2025

Thor: Towards Human-Level Whole-Body Reactions for Intense Contact-Rich Environments

Oct 30, 2025

RoboOS-NeXT: A Unified Memory-based Framework for Lifelong, Scalable, and Robust Multi-Robot Collaboration

Oct 30, 2025

From Language to Locomotion: Retargeting-free Humanoid Control via Motion Latent Guidance

Oct 16, 2025

TIGeR: Tool-Integrated Geometric Reasoning in Vision-Language Models for Robotics

Oct 08, 2025

MathSticks: A Benchmark for Visual Symbolic Compositional Reasoning with Matchstick Puzzles

Oct 01, 2025

Double Helix Diffusion for Cross-Domain Anomaly Image Generation

Sep 16, 2025

$NavA^3$: Understanding Any Instruction, Navigating Anywhere, Finding Anything

Aug 06, 2025