Zhongyuan Wang

From Language to Locomotion: Retargeting-free Humanoid Control via Motion Latent Guidance

Oct 16, 2025
Via arXiv

TIGeR: Tool-Integrated Geometric Reasoning in Vision-Language Models for Robotics

Oct 08, 2025
Via arXiv

MathSticks: A Benchmark for Visual Symbolic Compositional Reasoning with Matchstick Puzzles

Oct 01, 2025
Via arXiv

Double Helix Diffusion for Cross-Domain Anomaly Image Generation

Sep 16, 2025
Via arXiv

$NavA^3$: Understanding Any Instruction, Navigating Anywhere, Finding Anything

Aug 06, 2025
Via arXiv

RoboBrain 2.0 Technical Report

Jul 02, 2025
Via arXiv

AC-DiT: Adaptive Coordination Diffusion Transformer for Mobile Manipulation

Jul 02, 2025
Via arXiv

OmniGen2: Exploration to Advanced Multimodal Generation

Jun 23, 2025
Via arXiv

Video-CoT: A Comprehensive Dataset for Spatiotemporal Understanding of Videos Based on Chain-of-Thought

Jun 12, 2025
Via arXiv

Towards provable probabilistic safety for scalable embodied AI systems

Jun 05, 2025
Via arXiv