Yueting Zhuang

Distilling Vision-Language Foundation Models: A Data-Free Approach via Prompt Diversification

Jul 21, 2024
Via arXiv

IDEAL: Leveraging Infinite and Dynamic Characterizations of Large Language Models for Query-focused Summarization

Jul 15, 2024
Via arXiv

From Easy to Hard: Learning Curricular Shape-aware Features for Robust Panoptic Scene Graph Generation

Jul 12, 2024
Via arXiv

Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Model

Jul 10, 2024
Via arXiv

Ask Questions with Double Hints: Visual Question Generation with Answer-awareness and Region-reference

Jul 06, 2024
Via arXiv

Bridging Local Details and Global Context in Text-Attributed Graphs

Jun 18, 2024
Via arXiv

Improving Large Models with Small models: Lower Costs and Better Performance

Jun 15, 2024
Via arXiv

T2S-GPT: Dynamic Vector Quantization for Autoregressive Sign Language Production from Text

Jun 11, 2024
Via arXiv

Stock Movement Prediction with Multimodal Stable Fusion via Gated Cross-Attention Mechanism

Jun 06, 2024
Via arXiv

Auto-Encoding Morph-Tokens for Multimodal LLM

May 03, 2024
Via arXiv