Suyuan Huang

ScalingNote: Scaling up Retrievers with Large Language Models for Real-World Dense Retrieval

Nov 24, 2024
Via arXiv

Vript: A Video Is Worth Thousands of Words

Jun 10, 2024
Via arXiv

From Image to Video, what do we need in multimodal LLMs?

Apr 18, 2024
Via arXiv