Haodong Yan

DualCoT-VLA: Visual-Linguistic Chain of Thought via Parallel Reasoning for Vision-Language-Action Models

Mar 23, 2026

S-VAM: Shortcut Video-Action Model by Self-Distilling Geometric and Semantic Foresight

Mar 17, 2026

Rethinking the Practicality of Vision-Language-Action Models: A Comprehensive Benchmark and an Improved Baseline

Feb 26, 2026

FlowVLA: Thinking in Motion with a Visual Chain of Thought

Aug 25, 2025

ReconVLA: Reconstructive Vision-Language-Action Model as Effective Robot Perceiver

Aug 14, 2025

Physically-Based Photometric Bundle Adjustment in Non-Lambertian Environments

Sep 18, 2024

GazeMoDiff: Gaze-guided Diffusion Model for Stochastic Human Motion Prediction

Dec 19, 2023