Haoyu Zhao

CoV: Chain-of-View Prompting for Spatial Reasoning

Jan 08, 2026

See-Control: A Multimodal Agent Framework for Smartphone Interaction with a Robotic Arm

Dec 09, 2025

Divide, Conquer and Unite: Hierarchical Style-Recalibrated Prototype Alignment for Federated Medical Image Segmentation

Nov 14, 2025

AccidentBench: Benchmarking Multimodal Understanding and Reasoning in Vehicle Accidents and Beyond

Sep 30, 2025

pFedSAM: Personalized Federated Learning of Segment Anything Model for Medical Image Segmentation

Sep 19, 2025

Repeating Words for Video-Language Retrieval with Coarse-to-Fine Objectives

Aug 20, 2025

Towards Affordance-Aware Robotic Dexterous Grasping with Human-like Priors

Aug 12, 2025

ShoulderShot: Generating Over-the-Shoulder Dialogue Videos

Aug 11, 2025

The Role of Diversity in In-Context Learning for Large Language Models

May 26, 2025

SMAP: Self-supervised Motion Adaptation for Physically Plausible Humanoid Whole-body Control

May 26, 2025