
Shihan Wu

InSpire: Vision-Language-Action Models with Intrinsic Spatial Reasoning

May 20, 2025

Policy Contrastive Decoding for Robotic Foundation Models

May 19, 2025

Skip Tuning: Pre-trained Vision-Language Models are Effective and Efficient Adapters Themselves

Dec 16, 2024

DePT: Decoupled Prompt Tuning

Sep 14, 2023