Yun Fu

Accessing Vision Foundation Models at ImageNet-level Costs

Jul 15, 2024

SoupLM: Model Integration in Large Language and Multi-Modal Models

Jul 11, 2024

Through the Theory of Mind's Eye: Reading Minds with Multimodal Video Large Language Models

Jun 19, 2024

Deciphering Movement: Unified Trajectory Generation Model for Multi-Agent

May 27, 2024

Consistency and Uncertainty: Identifying Unreliable Responses From Black-Box Vision-Language Models for Selective Visual Question Answering

Apr 16, 2024

Self-Training Large Language Models for Improved Visual Program Synthesis With Visual Reinforcement

Apr 06, 2024

OOSTraj: Out-of-Sight Trajectory Prediction With Vision-Positioning Denoising

Apr 02, 2024

Adapting to Length Shift: FlexiLength Network for Trajectory Prediction

Mar 31, 2024

Rewrite the Stars

Mar 29, 2024

Efficient Modulation for Vision Networks

Mar 29, 2024