Haoqin Tu

MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?

Jul 05, 2024
Via arXiv

What If We Recaption Billions of Web Images with LLaMA-3?

Jun 12, 2024
Via arXiv

Autoregressive Pretraining with Mamba in Vision

Jun 11, 2024
Via arXiv

How Far Are We From AGI

May 16, 2024
Via arXiv

Eagle and Finch: RWKV with Matrix-Valued States and Dynamic Recurrence

Apr 10, 2024
Via arXiv

Tuning LayerNorm in Attention: Towards Efficient Multi-Modal LLM Finetuning

Dec 18, 2023
Via arXiv

How Many Unicorns Are in This Image? A Safety Evaluation Benchmark for Vision LLMs

Nov 27, 2023
Via arXiv

Sight Beyond Text: Multi-Modal Training Enhances LLMs in Truthfulness and Ethics

Sep 13, 2023
Via arXiv

ZeroGen: Zero-shot Multimodal Controllable Text Generation with Multiple Oracles

Jun 29, 2023
Via arXiv

ReSee: Responding through Seeing Fine-grained Visual Knowledge in Open-domain Dialogue

May 23, 2023
Via arXiv