Yueting Zhuang

Ask Questions with Double Hints: Visual Question Generation with Answer-awareness and Region-reference

Jul 06, 2024

Bridging Local Details and Global Context in Text-Attributed Graphs

Jun 18, 2024

Improving Large Models with Small models: Lower Costs and Better Performance

Jun 15, 2024

T2S-GPT: Dynamic Vector Quantization for Autoregressive Sign Language Production from Text

Jun 11, 2024

Stock Movement Prediction with Multimodal Stable Fusion via Gated Cross-Attention Mechanism

Jun 06, 2024

Auto-Encoding Morph-Tokens for Multimodal LLM

May 03, 2024

WorldGPT: Empowering LLM as Multimodal World Model

Apr 28, 2024

LASER: Tuning-Free LLM-Driven Attention Control for Efficient Text-conditioned Image-to-Animation

Apr 23, 2024

Fact: Teaching MLLMs with Faithful, Concise and Transferable Rationales

Apr 17, 2024

ProSwitch: Knowledge-Guided Language Model Fine-Tuning to Generate Professional and Non-Professional Styled Text

Mar 27, 2024