
Xingyue Quan

Do World Action Models Generalize Better than VLAs? A Robustness Study

Mar 23, 2026

ST-VLA: Enabling 4D-Aware Spatiotemporal Understanding for General Robot Manipulation

Mar 14, 2026

RADAR: Revealing Asymmetric Development of Abilities in MLLM Pre-training

Feb 13, 2026

H-WM: Robotic Task and Motion Planning Guided by Hierarchical World Model

Feb 11, 2026

OmniEVA: Embodied Versatile Planner via Task-Adaptive 3D-Grounded and Embodiment-aware Reasoning

Sep 11, 2025

GraphCoT-VLA: A 3D Spatial-Aware Reasoning Vision-Language-Action Model for Robotic Manipulation with Ambiguous Instructions

Aug 11, 2025

From Flatland to Space: Teaching Vision-Language Models to Perceive and Reason in 3D

Mar 29, 2025

Mem2Ego: Empowering Vision-Language Models with Global-to-Ego Memory for Long-Horizon Embodied Navigation

Feb 20, 2025

SpatialCoT: Advancing Spatial Reasoning through Coordinate Alignment and Chain-of-Thought for Embodied Task Planning

Jan 17, 2025

ROS-LLM: A ROS framework for embodied AI with task feedback and structured reasoning

Jun 28, 2024