Yizhou Wang

VLM Can Be a Good Assistant: Enhancing Embodied Visual Tracking with Self-Improving Vision-Language Models

May 28, 2025

Hierarchical Instruction-aware Embodied Visual Tracking

May 27, 2025

CMT: A Cascade MAR with Topology Predictor for Multimodal Conditional CAD Generation

Apr 29, 2025

Probing and Inducing Combinational Creativity in Vision-Language Models

Apr 17, 2025

Boosting Large Language Models with Mask Fine-Tuning

Mar 27, 2025

DyWA: Dynamics-adaptive World Action Model for Generalizable Non-prehensile Manipulation

Mar 21, 2025

EmpathyAgent: Can Embodied Agents Conduct Empathetic Actions?

Mar 19, 2025

Clinical Inspired MRI Lesion Segmentation

Feb 22, 2025

Human-Centric Foundation Models: Perception, Generation and Agentic Modeling

Feb 12, 2025

Acquisition through My Eyes and Steps: A Joint Predictive Agent Model in Egocentric Worlds

Feb 09, 2025